Two Statistics to Get Fuzzy Clusters Suited for Data
Abstract
Two statistics, termed rFGC (revised Fuzzy Grouping Criterion) and rFGC^*, are presented for obtaining fuzzy clusters suited to the data in the fuzzy c-means method. Both revise FGC by using the ratio of the within-cluster variance to the total variance that is attainable even when the population has no clusters. The rFGC estimates the number of clusters, and the rFGC^* estimates the degree of fuzziness of the clusters. The performance of the two statistics was examined in a simulation study on normal populations of several dimensions. The rFGC could detect the absence of clusters, whereas the pseudo-F and FGC could not. The rFGC, pseudo-F and FGC could all estimate the true number of clusters when the data had well-separated clusters. The rFGC had a 'conservative' property, whereas the pseudo-F and FGC had an 'overfitting' property. The goodness of the degree of fuzziness was assessed by the mean squared error (MSE) of the estimated cluster centers. Non-fuzzy clustering gave relatively large MSEs when the population had overlapping clusters. Fuzzy clustering with fixed m=2 gave relatively large MSEs when the population had well-separated clusters or when the data dimension was large. In contrast, clustering with the degree of fuzziness chosen by rFGC^* gave relatively small MSEs in all cases. From the viewpoint of MSE, fuzzy clustering with an appropriate degree of fuzziness, which can be obtained by rFGC^*, is therefore objectively better than both non-fuzzy clustering and fuzzy clustering with fixed m=2.
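The abstract does not give the formulas for rFGC or rFGC^*, so the following is only a minimal sketch of the setting they operate in: the standard fuzzy c-means iteration with an adjustable degree of fuzziness m, plus the ratio of the fuzzy within-cluster sum of squares to the total sum of squares. The function names `fuzzy_c_means` and `within_variance_ratio` are hypothetical, and the ratio is an illustrative quantity, not the paper's statistic.

```python
import numpy as np


def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns (centers, memberships).

    X: (n, d) data array, c: number of clusters, m > 1: degree of fuzziness.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships; each row sums to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared Euclidean distances to each center, shape (n, c).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)
        # Membership update u_ik proportional to d_ik^(-2/(m-1));
        # dividing by the row minimum first keeps the power stable.
        ratio = d2 / d2.min(axis=1, keepdims=True)
        inv = ratio ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    # Recompute centers so they match the final memberships.
    W = U ** m
    centers = (W.T @ X) / W.sum(axis=0)[:, None]
    return centers, U


def within_variance_ratio(X, centers, U, m):
    """Fuzzy within-cluster sum of squares divided by total sum of squares.

    Illustrative only; this is NOT the rFGC statistic from the paper.
    """
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    within = ((U ** m) * d2).sum()
    total = ((X - X.mean(axis=0)) ** 2).sum()
    return within / total
```

One possible use is to run `fuzzy_c_means` over a grid of c and m values and compare the resulting ratios or, in a simulation where the true centers are known, the MSE of the estimated centers, which is the criterion the abstract uses to assess the degree of fuzziness.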
- Paper published by 日本知能情報ファジィ学会 (Japan Society for Fuzzy Theory and Intelligent Informatics)
- 1998-08-15