Please use this identifier to cite or link to this item:
http://repository.ipb.ac.id/handle/123456789/169168
| Title: | PERBANDINGAN KINERJA ALGORITMA DECISION TREE DAN RANDOM FOREST UNTUK KLASIFIKASI PRODUK LARIS E-COMMERCE |
| Other Titles: | A Comparative Study of Decision Tree and Random Forest Algorithms for Best-Selling Product Classification in E-Commerce |
| Authors: | Wijaya, Sony Hartono; Ardiansyah, Firman; Erari, Ferdi B. M. |
| Issue Date: | 2025 |
| Publisher: | IPB University |
| Abstract: | This study compares the performance of the Decision Tree and Random Forest algorithms in classifying best-selling products, using e-commerce sales data from November 2020 to October 2023. The workflow comprised exploratory analysis, preprocessing (missing-value imputation, categorical feature encoding, and a 70:30 train-test split), and labeling of best-selling products based on a combined score of transaction frequency and total units sold, with the third quartile (Q3) as the threshold. Baseline models were built with default parameters and then optimized through hyperparameter tuning with GridSearchCV and RandomizedSearchCV on key parameters. Evaluation used accuracy, precision, recall, F1-score, and the confusion matrix. With default parameters, Random Forest outperformed Decision Tree, achieving 92.56% accuracy, 92.10% precision, 92.56% recall, and a 92.04% F1-score. After tuning, Random Forest achieved 91.82% accuracy, 92.59% precision, 98.39% recall, and a 95.40% F1-score. The higher recall indicates an improved ability to detect best-selling products, although accuracy decreased slightly due to more false positives. Random Forest, particularly after tuning, is recommended for data-driven decision-making in e-commerce, such as inventory management and marketing strategy. |
| URI: | http://repository.ipb.ac.id/handle/123456789/169168 |
| Appears in Collections: | UT - Computer Science |
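
The workflow summarized in the abstract (Q3-based labeling, a 70:30 split, default Decision Tree and Random Forest baselines, then a grid search) can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the thesis's actual code or dataset. Several details are assumptions made only for the sketch: the combined score is taken as the sum of min-max-scaled transaction frequency and units sold, the feature columns (`price`, `category`, `page_views`) are hypothetical, the split is stratified, the parameter grid is arbitrary, and metrics are computed for the positive (best-seller) class.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic per-product aggregates (a stand-in, NOT the study's data).
n_products = 400
transaction_count = rng.poisson(lam=20, size=n_products).astype(float)
units_sold = transaction_count * rng.uniform(1.0, 3.0, size=n_products)


def min_max(values):
    """Scale values to [0, 1] so both signals contribute equally (assumed)."""
    span = values.max() - values.min()
    return (values - values.min()) / span if span > 0 else np.zeros_like(values)


# Assumed combined score; the abstract does not give the exact formula.
score = min_max(transaction_count) + min_max(units_sold)
q3 = np.percentile(score, 75)               # third-quartile threshold
is_best_seller = (score > q3).astype(int)   # 1 = best-selling product

# Hypothetical features; the real feature set is not listed in the abstract.
price = rng.uniform(5.0, 100.0, size=n_products)
category = rng.integers(0, 5, size=n_products)  # already-encoded category
page_views = transaction_count * 10 + rng.normal(0, 30, size=n_products)
X = np.column_stack([price, category, page_views])
y = is_best_seller

# 70:30 train-test split (stratification is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)


def evaluate(model):
    """Return accuracy plus positive-class precision, recall, and F1."""
    y_pred = model.predict(X_test)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="binary", zero_division=0
    )
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Baselines with default parameters.
results = {}
baselines = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(random_state=0),
}
for name, model in baselines.items():
    results[name] = evaluate(model.fit(X_train, y_train))

# Small illustrative grid; the thesis's tuned parameters are not stated here.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [None, 5],
    "min_samples_leaf": [1, 5],
}
search = GridSearchCV(
    RandomForestClassifier(random_state=0), param_grid, scoring="f1", cv=3
)
search.fit(X_train, y_train)
results["random_forest_tuned"] = evaluate(search.best_estimator_)

for name, metrics in results.items():
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

The numbers printed by this sketch depend entirely on the synthetic data and will not match the figures reported in the abstract. `RandomizedSearchCV` could be swapped in for `GridSearchCV` by passing `param_distributions` and an `n_iter` budget instead of `param_grid`.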
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| cover_G64180112_cdba6916dbfe4845acb772a4250a0920.pdf | Cover | 1.58 MB | Adobe PDF | View/Open |
| fulltext_G64180112_354f72abbf25472fa369d0f594fe7594.pdf Restricted Access | Fulltext | 6.09 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.