Entity Classification System for Twitter Messages Using Regular Expressions and Naïve Bayes
Abstract
The widespread use of social networks such as Twitter has prompted institutions to identify which entities are being discussed and what sentiment is expressed toward them. This research aims to build a system that extracts entities from tweets and classifies the sentiment of those tweets. Entity extraction is performed with regular expressions, which can be written to capture a diverse range of entities. Sentiment classification is performed with naïve Bayes classifiers using the Multinomial and Bernoulli models. Sentiment is divided into three classes: positive, negative, and neutral. Before classification, tweets are normalized by replacing non-standard words with their standard forms. The test results show that regular expressions are quite effective for entity extraction. For classification, naïve Bayes achieves high overall accuracy (96.75% for Multinomial and 96.33% for Bernoulli). However, this accuracy depends heavily on the neutral class, which reaches 97.88% for Multinomial and 99.95% for Bernoulli, while the positive class reaches 85.66% and 33.13% and the negative class 33.67% and 0%, respectively.
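The pipeline described in the abstract (regular-expression entity extraction, word normalization, then naïve Bayes classification) can be illustrated with a minimal sketch. This is not the thesis implementation: the entity pattern, the slang dictionary, the `@AcmePhone` account, and the three training tweets are all hypothetical, and only the Multinomial model with add-one smoothing is shown.

```python
import math
import re
from collections import Counter, defaultdict

# Assumption: entities are approximated as @mentions and #hashtags.
ENTITY_RE = re.compile(r"[@#]\w+")

# Assumption: a tiny, hypothetical dictionary of non-standard words.
NORMALIZE = {"gr8": "great", "luv": "love", "u": "you"}


def extract_entities(tweet):
    """Return every substring that matches the entity pattern."""
    return ENTITY_RE.findall(tweet)


def normalize(tweet):
    """Lowercase, drop entities, tokenize, and map slang to standard words."""
    text = ENTITY_RE.sub(" ", tweet.lower())
    tokens = re.findall(r"[a-z0-9']+", text)
    return [NORMALIZE.get(t, t) for t in tokens]


class MultinomialNB:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for t in tokens:
                if t in self.vocab:  # words unseen in training are skipped
                    score += math.log((counts[t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training tweets, one per sentiment class.
train = [
    ("I luv the new @AcmePhone", "positive"),
    ("@AcmePhone battery is terrible", "negative"),
    ("@AcmePhone launches today", "neutral"),
]
model = MultinomialNB().fit(
    [normalize(text) for text, _ in train],
    [label for _, label in train],
)

tweet = "luv u @AcmePhone #happy"
entities = extract_entities(tweet)
sentiment = model.predict(normalize(tweet))
```

With a realistic corpus the class priors would be far from uniform. When most tweets are neutral, as the reported per-class figures suggest, overall accuracy can stay high even if the minority classes are rarely predicted correctly, so per-class accuracy is worth reporting alongside it.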
