A Study of the Two-Step Clustering Method for Data Containing Outliers

Nurwida, Arni

Date

2014

Author

Nurwida, Arni

Sadik, Kusman

Indahwati

Abstract

Cluster analysis is widely used across many fields of study. Classical methods, such as hierarchical clustering and k-means clustering, cannot handle categorical variables or a mixture of numerical and categorical variables. In addition, determining the optimal number of clusters still depends on the researcher's subjective judgment, and these methods cannot handle very large datasets, here meaning more than 500 observations. One approach to addressing these problems is the two-step clustering method. It is therefore important to study how accurately the two-step clustering method predicts the number of clusters and classifies cluster membership, especially when the data contain outliers. When the data contain a small proportion of outliers (1%), the method gives more accurate results than when the data contain a larger proportion (5% or 15%). The outlier-handling level specified for the method must be greater than the proportion of outliers in the data. The two-step clustering method is very accurate in recovering the actual number of population clusters when the data contain no outliers, especially when most of the variables are numerical and the rest are categorical. Clustering villages in Indonesia by village progress and underdevelopment factors using the two-step clustering method yields an optimal solution of 7 clusters.

URI

http://repository.ipb.ac.id/handle/123456789/69118

Collections

UT - Statistics and Data Sciences [2260]