
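      Kajian Penerapan Metode Synthetic Oversampling dan Ensemble Classification pada Class Imbalanced Dataset (A Study of the Application of Synthetic Oversampling Methods and Ensemble Classification to Class-Imbalanced Datasets)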

      View/Open
      Cover (625.3Kb)
      Fulltext (1.159Mb)
      Appendix (645.8Kb)
      Date
      2025
      Author
      Fitri, Zafira Ilma
      Wijayanto, Hari
      Angraini, Yenni
      Abstract
      Class imbalance is a serious challenge in classification because algorithms tend to focus on the majority class and neglect the minority class. This study examines the application of synthetic oversampling techniques and the ensemble classification algorithms Random Forest and XGBoost to imbalanced data, and explores how variable structure and scale affect model performance. The data are a credit risk dataset of 32,581 observations, constructed into three types: mixed numerical–categorical data (Data 1), purely categorical data (Data 2), and purely numerical data (Data 3). The analysis comprised preprocessing, stratified data splitting, synthetic oversampling (SMOTE, Borderline-SMOTE, ADASYN), model building with Random Forest and XGBoost, and evaluation using balanced accuracy, precision, recall, and F1-score. The results show that synthetic oversampling can improve the representation of the minority class, but its effectiveness depends strongly on the characteristics of the data and the algorithm used. Random Forest tends to be more stable on categorical data, whereas XGBoost performs better on numerical and mixed data when supported by its internal balancing. The balancing strategy and algorithm should therefore be chosen to suit the data structure in order to obtain optimal classification results on class-imbalanced datasets.
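      To make two ideas from the abstract concrete, the following is a minimal sketch, not the thesis's implementation: the interpolation step at the core of SMOTE-style oversampling, and the four evaluation metrics named above. It assumes purely numeric features and a binary label; the function names `smote_like` and `binary_metrics`, the neighbour count `k`, and the toy data are hypothetical choices made for illustration only.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each lying on the segment between a
    minority point and one of its k nearest minority neighbours.
    Illustrative only: numeric features, squared Euclidean distance."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        others = [p for j, p in enumerate(minority) if j != i]
        # Rank the other minority points by distance to x and keep the k closest.
        neighbours = sorted(
            others, key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p))
        )[:k]
        nb = rng.choice(neighbours)
        lam = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(a + lam * (b - a) for a, b in zip(x, nb)))
    return synthetic


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, balanced accuracy, precision, recall and F1 for the
    positive (minority) class, computed from the confusion-matrix counts."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "balanced_accuracy": (recall + specificity) / 2,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy data (made up): 2 minority and 8 majority observations.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)

# Four minority points at the corners of the unit square.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_like(minority, n_new=5)
```

      On the toy labels, plain accuracy is 0.8 even though only one of the two minority cases is detected, while balanced accuracy is 0.6875. This gap is why balanced accuracy, precision, recall, and F1-score are more informative than accuracy on imbalanced data.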
       
      URI
      http://repository.ipb.ac.id/handle/123456789/171079
      Collections
      • UT - Statistics and Data Sciences [82]

      Copyright © 2020 Library of IPB University
      All rights reserved