Studi Komparatif Pembobotan Kata untuk Temu Kembali Informasi Dokumen Bahasa Indonesia (Comparative Study of Term Weighting for Information Retrieval of Indonesian-Language Documents)
Date
2013
Author
Anugrah, Hafizhia Dhikrul
Adisantoso, Julio
Abstract
One of the classical models of Information Retrieval (IR) systems is the vector space model. A vector in this model represents the weights of the terms contained in the documents and queries. A term can be a word, a phrase, or another unit in a document that describes the document's context. Since each term has a different level of importance in a document, weighting is needed. The most commonly used weighting method is TF-IDF (Term Frequency-Inverse Document Frequency). Previous research indicates that the distribution of terms follows the Poisson distribution, which led to the development of a method called RIDF (Residual Inverse Document Frequency). Another weighting method, which takes the query into account, is called Query Term Weighting. Neither of the latter two methods has been implemented for documents in Indonesian. This research implements TF-IDF, RIDF, and Query Term Weighting in a search engine for documents in Indonesian. The result of this research is a search engine with an average precision of 63.9%.
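
The difference between TF-IDF and RIDF can be illustrated with a minimal Python sketch. The abstract does not give the formulas used in the thesis, so this assumes the common textbook definitions: IDF as log2(N / df), and RIDF as the observed IDF minus the IDF expected under a Poisson model with rate cf / N, where df is the document frequency, cf the collection frequency, and N the number of documents. The toy corpus and all function names are hypothetical, and Query Term Weighting is left out because the abstract does not define it.

```python
import math
from collections import Counter

def term_statistics(docs):
    """Return (df, cf): document frequency and collection frequency per term."""
    df, cf = Counter(), Counter()
    for tokens in docs:
        counts = Counter(tokens)
        cf.update(counts)         # add every occurrence of each term
        df.update(counts.keys())  # add 1 per document containing the term
    return df, cf

def idf(term, df, n_docs):
    """Inverse document frequency, assumed here to be log2(N / df)."""
    return math.log2(n_docs / df[term])

def tf_idf(term, tokens, df, n_docs):
    """Raw term frequency in one document multiplied by IDF."""
    return tokens.count(term) * idf(term, df, n_docs)

def ridf(term, df, cf, n_docs):
    """Observed IDF minus the IDF predicted by a Poisson model (assumed form)."""
    # Under Poisson with rate cf/N, P(term appears in a document) = 1 - exp(-cf/N).
    expected_idf = -math.log2(1.0 - math.exp(-cf[term] / n_docs))
    return idf(term, df, n_docs) - expected_idf

# Hypothetical toy corpus of already tokenized Indonesian documents.
docs = [
    ["harga", "beras", "naik"],
    ["beras", "beras", "beras", "impor"],
    ["harga", "minyak", "naik"],
    ["cuaca", "hujan"],
]
df, cf = term_statistics(docs)
n_docs = len(docs)

for term in ("beras", "harga"):
    print(term, idf(term, df, n_docs), ridf(term, df, cf, n_docs))
```

In this toy corpus "beras" and "harga" each occur in two documents, so they receive the same IDF, but "beras" is concentrated in one document and therefore receives the higher RIDF. That burstiness is what the residual is meant to capture.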
Collections
- UT - Computer Science [2322]