Pengembangan stemmer berbasis kamus besar bahasa Indonesia (Development of a stemmer based on Kamus Besar Bahasa Indonesia)

Iqbal, Raden

Date

2010

Author

Iqbal, Raden

Adisantoso, Julio

Wijaya, Sony Hartono

Abstract

Although many studies have aimed to develop a stemmer for the Indonesian language, some issues remain that these studies have not been able to overcome, for example issues concerning morphophonemics. The Kamus Besar Bahasa Indonesia-Based Stemmer (KBBI-Based Stemmer) is a stemmer that can be used as one of the main components of an information retrieval system for documents written in Indonesian. Using Kamus Besar Bahasa Indonesia (KBBI) as the basis for stemming allows the stemmer to produce stems that conform to the Indonesian rules for forming derived words. All words contained in KBBI are compiled in a database and used as a reference in the stemming process. For every word to be stemmed, a group of candidate words that may be present in KBBI is formed by removing the parts of the word that may be affixes in Indonesian. This research also analyzes the Ridha Stemmer, which has been widely used in information retrieval research at IPB's Computer Science Department. The analysis is limited to the correctness of the produced stems. Both stemmers are evaluated by measuring the recall-precision values obtained when each stemmer is used in an information retrieval system. The analysis shows that the Ridha Stemmer yields correct stems for 20.15% of total tokens and wrong stems for 25.56%. The performance evaluation shows that the KBBI-Based Stemmer gives a slightly higher average precision than the Ridha Stemmer; the two reported values are 0.441 and 0.443.
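
The dictionary-lookup idea described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the word list `KBBI_WORDS` is a five-word stand-in for the KBBI database, the affix lists and the `RESTORE` table for morphophonemic changes are small illustrative assumptions, and the rule of preferring the longest dictionary match is a hypothetical choice, since the abstract does not say how the final stem is selected.

```python
# Toy stand-in for the KBBI word list (assumption); the stemmer described
# in the abstract uses a database holding all KBBI entries.
KBBI_WORDS = {"makan", "pukul", "ajar", "baca", "main"}

# Illustrative, incomplete affix lists (assumption, not the thesis's rules).
# The empty string means "remove nothing on this side".
PREFIXES = ["", "meng", "mem", "men", "me", "ber", "bel", "di", "ter", "pe"]
SUFFIXES = ["", "kan", "an", "i", "nya"]

# Hypothetical morphophonemic restorations: some prefixes may have absorbed
# the first letter of the root (e.g. "mem" + "pukul" -> "memukul").
RESTORE = {"mem": "p", "men": "t", "meng": "k"}


def candidates(word):
    """Return candidate stems formed by removing possible affixes."""
    found = []
    for prefix in PREFIXES:
        if not word.startswith(prefix):
            continue
        for suffix in SUFFIXES:
            if suffix and not word.endswith(suffix):
                continue
            core = word[len(prefix):len(word) - len(suffix)]
            if len(core) < 2:
                continue  # too short to be a plausible stem
            # Try the bare core, then the core with a restored first letter.
            for cand in (core, RESTORE.get(prefix, "") + core):
                if cand not in found:
                    found.append(cand)
    return found


def stem(word):
    """Return a dictionary-attested stem, or the word itself if none is found."""
    word = word.lower()
    matches = [c for c in candidates(word) if c in KBBI_WORDS]
    # Assumption: prefer the longest candidate that appears in the dictionary.
    return max(matches, key=len) if matches else word


if __name__ == "__main__":
    for w in ["memukul", "makanan", "dibaca", "belajar", "komputer"]:
        print(w, "->", stem(w))
```

Because every candidate is checked against the dictionary, a word with no attested stem (such as "komputer" in this toy list) is returned unchanged instead of being truncated into a non-word.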

URI

http://repository.ipb.ac.id/handle/123456789/125737

Collections

UT - Computer Science