Please use this identifier to cite or link to this item: http://repository.ipb.ac.id/handle/123456789/169168
Title: PERBANDINGAN KINERJA ALGORITMA DECISION TREE DAN RANDOM FOREST UNTUK KLASIFIKASI PRODUK LARIS E-COMMERCE
Other Titles: A Comparative Study of Decision Tree and Random Forest Algorithms for Best-Selling Product Classification in E-Commerce.
Authors: Wijaya, Sony Hartono
Ardiansyah, Firman
Erari, Ferdi B. M.
Issue Date: 2025
Publisher: IPB University
Abstract: This study compares the performance of the Decision Tree and Random Forest algorithms in classifying best-selling products, using e-commerce sales data from November 2020 to October 2023. The data underwent exploratory analysis, preprocessing (missing-value imputation, categorical feature encoding, and a 70:30 train-test split), and product labeling based on a combined score of transaction frequency and total units sold, with the third quartile (Q3) as the threshold. Baseline models were built with default parameters, then optimized through hyperparameter tuning of key parameters with GridSearchCV and RandomizedSearchCV. The models were evaluated on accuracy, precision, recall, F1-score, and the confusion matrix. The default Random Forest outperformed Decision Tree, achieving 92.56% accuracy, 92.10% precision, 92.56% recall, and a 92.04% F1-score. After tuning, Random Forest achieved 91.82% accuracy, 92.59% precision, 98.39% recall, and a 95.40% F1-score. The increase in recall indicates an improved ability to detect best-selling products, although accuracy decreased slightly due to more false positives. Random Forest, particularly after tuning, is recommended for supporting data-driven decision-making in e-commerce, such as inventory management and marketing strategy.
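The labeling and evaluation steps described in the abstract can be illustrated with a minimal sketch using only the Python standard library. The abstract does not state how the combined score is computed, so the sum of min-max-scaled columns used here is an assumption, as are the strict `>` comparison against Q3, the quartile interpolation method, and all function names and sample values. The sketch does not reproduce the study's data or results.

```python
from statistics import quantiles


def min_max(values):
    """Scale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def label_best_sellers(frequencies, units_sold):
    """Label a product 1 (best-selling) when its combined score exceeds Q3.

    ASSUMPTION: combined score = scaled frequency + scaled units sold.
    ASSUMPTION: a product must be strictly above Q3 to be labeled 1.
    """
    scores = [f + u for f, u in zip(min_max(frequencies), min_max(units_sold))]
    q3 = quantiles(scores, n=4, method="inclusive")[2]  # third quartile
    labels = [1 if s > q3 else 0 for s in scores]
    return scores, q3, labels


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "confusion": {"tp": tp, "tn": tn, "fp": fp, "fn": fn},
    }


# Hypothetical toy data: eight products with made-up sales figures.
frequencies = [1, 2, 3, 4, 5, 6, 7, 8]
units_sold = [10, 20, 30, 40, 50, 60, 70, 80]
scores, q3, y_true = label_best_sellers(frequencies, units_sold)

# A hypothetical classifier output with one extra false positive.
y_pred = [0, 0, 0, 0, 0, 1, 1, 1]
metrics = binary_metrics(y_true, y_pred)
print(y_true, metrics)
```

In the toy prediction, every true best-seller is found, so recall is perfect, while the single false positive lowers accuracy and precision. This is the same kind of trade-off the abstract reports for the tuned Random Forest.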
URI: http://repository.ipb.ac.id/handle/123456789/169168
Appears in Collections: UT - Computer Science

Files in This Item:
File: cover_G64180112_cdba6916dbfe4845acb772a4250a0920.pdf
Description: Cover
Size: 1.58 MB
Format: Adobe PDF

File: fulltext_G64180112_354f72abbf25472fa369d0f594fe7594.pdf (Restricted Access)
Description: Fulltext
Size: 6.09 MB
Format: Adobe PDF