Please use this identifier to cite or link to this item:
http://repository.ipb.ac.id/handle/123456789/71870
Title: | Klasifikasi imbalanced data menggunakan weighted k-nearest neighbor pada data debitur kartu kredit bank |
Other Titles: | Classification of imbalanced data using weighted k-nearest neighbor in data bank credit card debtors |
Authors: | Kustiyo, Aziz; Syahidah, Aisyah |
Issue Date: | 2014 |
Publisher: | Bogor Agricultural University (IPB) |
Abstract: | Credit risk management aims to minimize potential losses from non-performing loans. Analysis of existing data on problem debtors can serve as a model for qualifying subsequent credit applications. Bank debtor data are a case of imbalanced data, since good debtors heavily outnumber bad ones. Classification is then suboptimal, because the class with more data has a very large influence on the classification result. This research aims to develop a classification model for credit card debtor data using the weighted k-nearest neighbor algorithm together with sampling methods intended to improve classification quality on imbalanced data. The sampling methods used are oversampling and undersampling. Random oversampling gives the best F-measure, 86.51%. Duplication oversampling gives the best recall, 100%. Keywords: imbalanced data, oversampling, undersampling, weighted k-nearest neighbor |
URI: | http://repository.ipb.ac.id/handle/123456789/71870 |
Appears in Collections: | UT - Computer Science |
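The approach summarized in the abstract (resampling an imbalanced training set, then classifying with weighted k-nearest neighbor and scoring by recall and F-measure) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not state the weighting scheme, the distance metric, or how "random" and "duplication" oversampling are defined, so the inverse-distance weights, the Euclidean distance, the sampling-with-replacement rule, and all function names and toy values below are assumptions made for illustration only.

```python
import math
import random
from collections import defaultdict


def _group_by_class(X, y):
    """Collect feature rows under their class label."""
    groups = defaultdict(list)
    for features, label in zip(X, y):
        groups[label].append(features)
    return groups


def random_oversample(X, y, seed=0):
    """Assumed rule: draw minority rows with replacement until every class
    is as large as the largest class."""
    rng = random.Random(seed)
    groups = _group_by_class(X, y)
    target = max(len(rows) for rows in groups.values())
    X_out, y_out = [], []
    for label, rows in groups.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        X_out.extend(rows + extra)
        y_out.extend([label] * target)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """Assumed rule: keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    groups = _group_by_class(X, y)
    target = min(len(rows) for rows in groups.values())
    X_out, y_out = [], []
    for label, rows in groups.items():
        X_out.extend(rng.sample(rows, target))
        y_out.extend([label] * target)
    return X_out, y_out


def weighted_knn_predict(X_train, y_train, query, k=3, eps=1e-9):
    """Vote among the k nearest neighbours; each vote weighs 1 / (distance + eps).
    Inverse-distance weighting is an assumption, not taken from the record."""
    pairs = [(math.dist(features, query), label)
             for features, label in zip(X_train, y_train)]
    pairs.sort(key=lambda pair: pair[0])
    votes = defaultdict(float)
    for distance, label in pairs[:k]:
        votes[label] += 1.0 / (distance + eps)
    return max(votes, key=votes.get)


def recall_and_f_measure(y_true, y_pred, positive):
    """Recall and F-measure (harmonic mean of precision and recall) for one class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return recall, f_measure


# Hypothetical toy data: six "good" debtors and two "bad" ones (made-up features).
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.2, 0.3), (0.1, 0.0),
     (2.0, 2.0), (2.1, 1.9)]
y = ["good"] * 6 + ["bad"] * 2

X_over, y_over = random_oversample(X, y)
X_under, y_under = random_undersample(X, y)
prediction = weighted_knn_predict(X_over, y_over, (1.9, 2.0), k=3)
```

With these toy values, oversampling brings the "bad" class up to six rows and undersampling cuts the "good" class down to two. The weighting matters when a single close neighbour disagrees with several distant ones: its larger weight can outvote them, which a plain majority vote would not allow.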
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
G14asy.pdf Restricted Access | Full Text | 550.93 kB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.