THE FUNCTION OF N-GRAMS SUBSTITUTION AND CODE SHIFT IN THE SOUNDEX ALGORITHM
Date
2006-08
Author
Nurdiati, Sri
Adisantoso, Julio
Herdiyeni, Yeni
Saputra, R Zainal Arifin Fandi
Abstract
Users' limited knowledge of scientific names leaves word roots unclear, and this, together with the characteristics of scientific name retrieval systems, reduces retrieval performance. The objective of this research is to analyze the effect of n-grams substitution and code shift on the recall and precision of the Soundex algorithm. To that end, the following steps are carried out: building a dictionary of scientific names, identifying the scientific names in each document, and ranking the results using the Dice coefficient. Testing uses a collection of 849 documents and 20 queries containing different kinds of errors. Retrieval performance is compared across three configurations: without n-grams substitution or code shift, with n-grams substitution only (NS), and with both n-grams substitution and code shift (CS). The results show that n-grams substitution and code shift increase the performance of the scientific name retrieval system. The two techniques can retrieve up to 95% of the scientific names across the 20 different queries. The results also suggest that the language of the data does not affect retrieval when n-grams substitution and code shift are used. This is because n-grams substitution makes sound changes more uniform by mapping sequences of two or more letters to one or more letters.
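As a rough illustration of the pipeline the abstract describes, the following minimal Python sketch applies an n-grams substitution step before a standard Soundex encoding and then ranks matching dictionary entries by a Dice coefficient over character bigrams. The substitution table, the function names, and the choice of character bigrams for the Dice coefficient are assumptions made for illustration and are not taken from the paper. The code shift step is not sketched, because the abstract does not define it.

```python
# Illustrative sketch only. The substitution table and all function names
# are hypothetical; they are not the ones used in the paper.

# Standard Soundex letter-to-digit mapping.
SOUNDEX_CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        SOUNDEX_CODES[ch] = digit

# Hypothetical n-gram substitutions: letter sequences mapped to one letter.
NGRAM_SUBSTITUTIONS = {"ph": "f", "ck": "k"}


def substitute_ngrams(word, table=NGRAM_SUBSTITUTIONS):
    """Lowercase the word and replace each listed n-gram."""
    word = word.lower()
    for ngram, replacement in table.items():
        word = word.replace(ngram, replacement)
    return word


def soundex(word):
    """Return the standard four-character Soundex code of a word."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = SOUNDEX_CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "hw":  # h and w do not separate letters with equal codes
            prev = digit
    return (code + "000")[:4]


def bigrams(text):
    """Return the set of character bigrams of a string."""
    text = text.lower()
    return {text[i:i + 2] for i in range(len(text) - 1)}


def dice(a, b):
    """Dice coefficient over the character bigram sets of two strings."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        return 0.0
    return 2 * len(x & y) / (len(x) + len(y))


def rank(query, dictionary):
    """Keep entries whose Soundex code matches the query, best Dice first."""
    target = soundex(substitute_ngrams(query))
    hits = [name for name in dictionary
            if soundex(substitute_ngrams(name)) == target]
    return sorted(hits, key=lambda name: dice(query, name), reverse=True)


if __name__ == "__main__":
    names = ["Phaseolus", "Oryza", "Ficus"]
    print(rank("Faseolus", names))
```

In this sketch, the misspelled query "Faseolus" and the dictionary entry "Phaseolus" receive different plain Soundex codes because their first letters differ, but they receive the same code once "ph" is substituted with "f", which is the kind of sound uniformity the abstract attributes to n-grams substitution.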
Collections
- Proceedings [2790]