News Document Classification Using the Support Vector Machine Method with a Radial Basis Function Kernel
Abstract
Every day the number of text documents, especially online news documents, increases. As a result, it becomes harder for information seekers to find the information they want. This problem calls for a text processing technique that can automatically classify text documents into predetermined categories, namely document classification. A popular and effective classification method is the support vector machine (SVM). SVM tries to find the best hyperplane separating two classes of data in a vector space. By applying the kernel trick, SVM can also handle cases that are not linearly separable. The goals of this research are to apply SVM with the radial basis function kernel to classify Reuters-21578 news documents and to compare two term weighting methods: term frequency (tf) and term frequency-inverse document frequency (tf-idf). The research uses chi-square for feature selection, yielding 1716 features out of the 7279 terms obtained from tokenization and stopword removal. The final results show that SVM classification achieves an accuracy of 93.21% with tf weighting and 92.97% with tf-idf weighting.
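The building blocks named in the abstract (tf and tf-idf weighting, chi-square feature scoring, and the radial basis function kernel) can be illustrated with a minimal sketch. It is not the implementation used in the research: the four-document toy corpus, the function names, the idf variant log(N/df), and the value gamma = 0.5 are all assumptions made for illustration, and SVM training itself is left out.

```python
import math
from collections import Counter

# Hypothetical toy corpus (NOT Reuters-21578): (tokens, category label).
DOCS = [
    ("oil prices rise crude oil".split(), "crude"),
    ("crude oil exports fall".split(), "crude"),
    ("wheat harvest grain exports".split(), "grain"),
    ("grain prices wheat wheat".split(), "grain"),
]
VOCAB = sorted({t for tokens, _ in DOCS for t in tokens})


def tf_vector(tokens, vocab):
    """Raw term-frequency weights: count of each vocabulary term."""
    counts = Counter(tokens)
    return [float(counts[t]) for t in vocab]


def idf(term, docs):
    """Inverse document frequency, assuming the plain log(N / df) variant."""
    df = sum(1 for tokens, _ in docs if term in tokens)
    return math.log(len(docs) / df)


def tfidf_vector(tokens, vocab, docs):
    """tf-idf weights: term count scaled by idf."""
    counts = Counter(tokens)
    return [counts[t] * idf(t, docs) for t in vocab]


def chi_square(term, label, docs):
    """Chi-square score of a term for one category (2x2 contingency table)."""
    a = sum(1 for toks, y in docs if term in toks and y == label)
    b = sum(1 for toks, y in docs if term in toks and y != label)
    c = sum(1 for toks, y in docs if term not in toks and y == label)
    d = sum(1 for toks, y in docs if term not in toks and y != label)
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def rbf_kernel(x, z, gamma=0.5):
    """RBF kernel K(x, z) = exp(-gamma * ||x - z||^2); gamma is assumed."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)


# Keep only terms whose chi-square score for "crude" is positive,
# then compare two documents under tf weighting.
selected = [t for t in VOCAB if chi_square(t, "crude", DOCS) > 0]
x = tf_vector(DOCS[0][0], selected)
z = tf_vector(DOCS[2][0], selected)
similarity = rbf_kernel(x, z)
```

In this toy setting, terms such as "prices" and "exports" appear equally in both categories, so their chi-square score is zero and they are dropped, which mirrors on a small scale how feature selection reduces the vocabulary before classification.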