Feature Selection for Indonesian-Language Documents for Clustering with the K-Means Method (Pemilihan Fitur Dokumen Bahasa Indonesia untuk Pengelompokan dengan Metode K-Means)

Dewi, Rahmatika

dc.contributor.advisor	Adisantoso, Julio
dc.contributor.author	Dewi, Rahmatika
dc.date.accessioned	2013-09-09T02:43:26Z
dc.date.available	2013-09-09T02:43:26Z
dc.date.issued	2013
dc.identifier.uri	http://repository.ipb.ac.id/handle/123456789/65299
dc.description.abstract	The field of document information retrieval deals with very diverse and rapidly growing document collections, so the need for methods that categorize documents effectively and efficiently is increasing. Documents can be categorized using clustering techniques. This research uses K-Means, an example of a partitioning clustering algorithm. K-Means is a simple algorithm that aims to produce an appropriate grouping. Chi-square and IDF feature selection were used to obtain the terms that serve as unique identifiers of the documents. Clustering results obtained with the different feature selection techniques were compared. Using the rand index, the accuracy values obtained with IDF and chi-square feature selection are 26% and 75%, respectively, for a data size of 150, and 31% and 37%, respectively, for a data size of 457. Using the purity measure, the accuracy values obtained with IDF and chi-square feature selection are 97% and 96%, respectively, for a data size of 150, and 93% and 95%, respectively, for a data size of 457.	en
dc.subject	Bogor Agricultural University (IPB)	en
dc.subject	Feature Selection	en
dc.subject	Clustering	en
dc.subject	K-Means	en
dc.title	Pemilihan Fitur Dokumen Bahasa Indonesia untuk Pengelompokan dengan Metode K-Means	en
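
The abstract reports clustering quality with the rand index and the purity measure. As a rough illustration of what those two figures measure, here is a minimal Python sketch using the common textbook definitions; the thesis may compute them differently. The function names and the six-document toy data are hypothetical and are not taken from the thesis.

```python
from collections import Counter
from itertools import combinations


def rand_index(true_labels, cluster_ids):
    """Fraction of document pairs on which clustering and reference classes agree."""
    pairs = list(combinations(range(len(true_labels)), 2))
    agree = 0
    for i, j in pairs:
        same_class = true_labels[i] == true_labels[j]
        same_cluster = cluster_ids[i] == cluster_ids[j]
        # A pair counts as an agreement when both views put the two
        # documents together, or both views keep them apart.
        if same_class == same_cluster:
            agree += 1
    return agree / len(pairs)


def purity(true_labels, cluster_ids):
    """Fraction of documents that belong to the majority class of their cluster."""
    by_cluster = {}
    for label, cluster in zip(true_labels, cluster_ids):
        by_cluster.setdefault(cluster, Counter())[label] += 1
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(true_labels)


# Hypothetical toy data: six documents, two reference classes, two clusters.
true_labels = ["economy", "economy", "economy", "sports", "sports", "sports"]
cluster_ids = [0, 0, 1, 1, 1, 1]

ri = rand_index(true_labels, cluster_ids)
pu = purity(true_labels, cluster_ids)
print(f"rand index = {ri:.3f}, purity = {pu:.3f}")
```

Purity rewards clusters dominated by a single class and tends to stay high when clusters are small or numerous, while the rand index also penalizes splitting one class across clusters. That difference is one plausible reason the two measures can give very different values for the same clustering.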

Files in this item

Name: G13rde.pdf
Size: 553.2 KB
Format: PDF
Description: full text

This item appears in the following Collection(s)

UT - Computer Science [2482]
