Comparison of Duplication Oversampling and Random Oversampling on the K-Nearest Neighbour Algorithm for Imbalanced Data
Abstract
Class imbalance can harm classification: a model trained on imbalanced data tends to favor the majority class and ignore the minority class. Yet the minority class sometimes carries important information and can be even more difficult to predict than the majority class. In addition, class imbalance can lower overall classifier performance. This study addresses the problem by modifying the dataset with two techniques, duplication oversampling and random oversampling, and compares them using k-nearest neighbour as the classifier. The results show that duplication oversampling performs better than random oversampling. However, the F-measure of random oversampling differs only slightly from that of duplication oversampling.
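The two oversampling strategies and the classifier named in the abstract can be illustrated with a minimal sketch. The abstract does not define either technique precisely, so the definitions here are assumptions: duplication oversampling is taken to mean copying the whole minority set a fixed number of times, and random oversampling is taken to mean drawing minority samples with replacement until the classes are balanced. The function names, the toy data, and the choice of k are hypothetical and are not taken from the study.

```python
import random
from collections import Counter


def duplication_oversample(X, y, minority_label):
    """Append whole copies of the minority set (assumed definition)."""
    majority_count = max(Counter(y).values())
    minority = [x for x, label in zip(X, y) if label == minority_label]
    # Extra full copies that fit without exceeding the majority size.
    extra_copies = max(majority_count // len(minority) - 1, 0)
    extra = minority * extra_copies
    return list(X) + extra, list(y) + [minority_label] * len(extra)


def random_oversample(X, y, minority_label, seed=0):
    """Draw minority samples with replacement until classes are balanced."""
    rng = random.Random(seed)
    majority_count = max(Counter(y).values())
    minority = [x for x, label in zip(X, y) if label == minority_label]
    needed = majority_count - len(minority)
    extra = [rng.choice(minority) for _ in range(needed)]
    return list(X) + extra, list(y) + [minority_label] * needed


def knn_predict(X_train, y_train, query, k=5):
    """Majority vote among the k nearest training points (Euclidean)."""
    by_distance = sorted(
        zip(X_train, y_train),
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: six majority points (0), two minority points (1).
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.2, 0.4), (0.4, 0.2),
     (1.0, 1.0), (0.9, 1.1)]
y = [0, 0, 0, 0, 0, 0, 1, 1]

X_dup, y_dup = duplication_oversample(X, y, minority_label=1)
X_rnd, y_rnd = random_oversample(X, y, minority_label=1)

query = (0.95, 1.0)  # a point close to the minority cluster
pred_original = knn_predict(X, y, query)
pred_duplication = knn_predict(X_dup, y_dup, query)
pred_random = knn_predict(X_rnd, y_rnd, query)
```

On this toy data, with k = 5, the unmodified training set outvotes the two minority neighbours, while both oversampled sets supply enough minority points for the query to be assigned to the minority class. This only illustrates the mechanism; it says nothing about the relative performance reported in the study.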
Collections
- UT - Computer Science [2330]