Authors: Nurdiati, Sri
Adisantoso, Julio
Herdiyeni, Yeni
Saputra, R Zainal Arifin Fandi
Keywords: Information retrieval
Scientific name
Fuzzy Soundex
Code shift
N-gram substitution
Dice coefficient
Issue Date: Aug-2006
Publisher: The 2nd International Seminar on Information and Communication Technology (ICTS)
Abstract: The word root of a scientific name is often unclear because users have limited information about scientific names; together with the characteristics of scientific name retrieval systems, this reduces retrieval performance. The objective of this research is to analyze the effect of n-gram substitution and code shift on the recall and precision of the Soundex algorithm. To that end, the following steps were carried out: building a dictionary of scientific names, identifying the scientific names in the documents, and ranking the results using the Dice coefficient. Testing used a collection of 849 documents and 20 queries containing different kinds of errors. Retrieval performance was compared across three configurations: using neither n-gram substitution nor code shift, using only n-gram substitution (NS), and using both n-gram substitution and code shift (CS). The results show that n-gram substitution and code shift improve the performance of the scientific name retrieval system. Together, the two techniques retrieved up to 95% of the scientific names across the 20 queries. The results also show that the data would not affect the language when n-gram substitution and code shift are used. This is because n-gram substitution makes sound changes more uniform by mapping two or more letters to one or more letters.
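The ranking step named in the abstract can be illustrated with a minimal sketch of the Dice coefficient, 2|X ∩ Y| / (|X| + |Y|). The abstract does not say which units are compared, so the use of character bigrams here is an assumption, and the function names (bigrams, dice_coefficient, rank) are hypothetical and not taken from the paper.

```python
def bigrams(text: str) -> set[str]:
    """Return the set of adjacent character pairs in a lowercased string.

    Assumption: character bigrams are the compared units; the paper's
    actual choice is not stated in the abstract.
    """
    s = text.lower()
    return {s[i:i + 2] for i in range(len(s) - 1)}


def dice_coefficient(a: str, b: str) -> float:
    """Dice coefficient 2|X & Y| / (|X| + |Y|) over the two bigram sets."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        # Both strings are too short to form a bigram; treat as no overlap.
        return 0.0
    return 2 * len(x & y) / (len(x) + len(y))


def rank(query: str, candidates: list[str]) -> list[tuple[str, float]]:
    """Sort candidate names by descending similarity to the query."""
    scored = [(name, dice_coefficient(query, name)) for name in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Under these assumptions, a misspelled query such as "Oriza sativa" would still share most of its bigrams with the dictionary entry "Oryza sativa", so that entry would rank above unrelated names.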
ISSN: 1858-1633
Appears in Collections: Proceedings

Files in This Item:
File: karil JAS 1.pdf
Size: 3.76 MB
Format: Adobe PDF
