Pengembangan stemmer berbasis kamus besar bahasa Indonesia (Development of a stemmer based on the Kamus Besar Bahasa Indonesia)
Date
2010
Author
Iqbal, Raden
Adisantoso, Julio
Wijaya, Sony Hartono
Abstract
Although many studies have aimed to develop a stemmer for the Indonesian language, there are
still some issues that these studies have not been able to overcome, for example issues
concerning morphophonemics. The Kamus Besar Bahasa Indonesia-Based Stemmer is a stemmer
which can be used as one of the main components in an information retrieval system for
documents written in Indonesian. The use of Kamus Besar Bahasa Indonesia (KBBI) as a base for
stemming allows the stemmer to produce the appropriate stem according to the Indonesian rules for
forming derived words. All words contained in KBBI are compiled in a database and used as a
reference in the stemming process. For every word that is going to be stemmed, a group of words
that may be present in KBBI is formed. This group of words is formed by removing parts of the word
that may be affixes in the Indonesian language. This research also conducts an analysis of
Ridha Stemmer, which has been widely used in information retrieval research at IPB’s
Computer Science Department. The analysis is limited to the correctness of the produced
stems. Both stemmers are evaluated by measuring the recall-precision values returned by the use
of each stemmer in an information retrieval system. The analysis shows that Ridha Stemmer yields
correct stems for 20.15% of total tokens and wrong stems for 25.56%. From the performance
evaluation, the use of the KBBI-Based Stemmer gives a slightly higher average
precision value than Ridha Stemmer; the two values are 0.441 and 0.443.
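The candidate-generation idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the word set `KBBI_WORDS`, the affix lists, the `RECODE` table, and the longest-match rule are all assumptions chosen for illustration. The sketch strips at most one prefix and one suffix, restores a possibly dropped initial consonant after a nasal prefix (one common morphophonemic pattern), and keeps only candidates found in the dictionary.

```python
# Hypothetical, tiny stand-in for the KBBI word database.
KBBI_WORDS = {"ajar", "makan", "tulis", "baca", "main"}

# Illustrative subsets of Indonesian affixes (assumed, not the study's full lists).
PREFIXES = ("meng", "meny", "mem", "men", "me", "ber", "bel", "be",
            "per", "pe", "di", "ter", "ke", "se")
SUFFIXES = ("kan", "an", "i", "lah", "kah", "nya")

# Assumed recoding table: a consonant that may have been dropped
# when the nasal prefix was attached (e.g. men- + tulis -> menulis).
RECODE = {"men": "t", "mem": "p", "meny": "s", "meng": "k"}


def candidates(word):
    """Return the forms obtained by removing at most one prefix and one suffix."""
    after_prefix = {word}
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) > len(prefix):
            rest = word[len(prefix):]
            after_prefix.add(rest)
            if prefix in RECODE:
                # Also try restoring the possibly dropped consonant.
                after_prefix.add(RECODE[prefix] + rest)
    result = set(after_prefix)
    for form in after_prefix:
        for suffix in SUFFIXES:
            if form.endswith(suffix) and len(form) > len(suffix):
                result.add(form[:-len(suffix)])
    return result


def stem(word):
    """Return a dictionary-backed stem, or the word itself if none is found."""
    word = word.lower()
    matches = [c for c in candidates(word) if c in KBBI_WORDS]
    if not matches:
        return word
    # Assumed tie-breaking rule: prefer the longest dictionary match.
    return max(matches, key=lambda c: (len(c), c))


if __name__ == "__main__":
    for w in ("belajar", "makanan", "menulis", "membaca", "komputer"):
        print(w, "->", stem(w))
```

In this sketch the dictionary lookup is what rejects spurious candidates such as "ulis" or "nulis" for "menulis", which is the role the abstract assigns to the KBBI database.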