PERBANDINGAN KINERJA ALGORITMA DECISION TREE DAN RANDOM FOREST UNTUK KLASIFIKASI PRODUK LARIS E-COMMERCE

Erari, Ferdi B. M.

Please use this identifier to cite or link to this item: http://repository.ipb.ac.id/handle/123456789/169168

Title:	PERBANDINGAN KINERJA ALGORITMA DECISION TREE DAN RANDOM FOREST UNTUK KLASIFIKASI PRODUK LARIS E-COMMERCE
Other Titles:	A Comparative Study of Decision Tree and Random Forest Algorithms for Best-Selling Product Classification in E-Commerce.
Authors:	Wijaya, Sony Hartono; Ardiansyah, Firman; Erari, Ferdi B. M.
Issue Date:	2025
Publisher:	IPB University
Abstract:	This study compares the performance of the Decision Tree and Random Forest algorithms in classifying best-selling products, using e-commerce sales data from November 2020 to October 2023. The data underwent exploratory analysis and preprocessing (missing-value imputation, categorical feature encoding, and a 70:30 train-test split). Products were labeled as best-selling based on a combined score of transaction frequency and total units sold, with the third quartile (Q3) as the threshold. Baseline models were built with default parameters and then optimized through hyperparameter tuning with GridSearchCV and RandomizedSearchCV on key parameters. Evaluation used accuracy, precision, recall, F1-score, and the confusion matrix.
Results showed that the default Random Forest outperformed Decision Tree, achieving 92.56% accuracy, 92.10% precision, 92.56% recall, and a 92.04% F1-score. After tuning, Random Forest achieved 91.82% accuracy, 92.59% precision, 98.39% recall, and a 95.40% F1-score. The increase in recall indicates an improved ability to detect best-selling products, although accuracy decreased slightly due to more false positives. Random Forest, particularly after tuning, is recommended for data-driven decision-making in e-commerce, such as inventory management and marketing strategy.
URI:	http://repository.ipb.ac.id/handle/123456789/169168
Appears in Collections:	UT - Computer Science
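
The labeling rule and the F1-score described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula for the combined score, so the sum of min-max-scaled transaction frequency and units sold used here is an assumption, as are the strict "greater than Q3" comparison, the inclusive quantile method, the function names, and the sample numbers, all of which are hypothetical. The last step only checks that the reported tuned precision and recall reproduce the reported tuned F1-score when F1 is taken as their harmonic mean.

```python
from statistics import quantiles


def min_max(values):
    """Rescale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def label_best_sellers(frequencies, units_sold):
    """Return (labels, q3); label 1 marks a combined score above Q3.

    ASSUMPTION: combined score = scaled frequency + scaled units sold.
    The thesis abstract does not state how the two are combined.
    """
    scores = [f + u for f, u in zip(min_max(frequencies), min_max(units_sold))]
    # quantiles(..., n=4) returns the three quartile cut points; index 2 is Q3.
    q3 = quantiles(scores, n=4, method="inclusive")[2]
    return [1 if s > q3 else 0 for s in scores], q3


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical per-product aggregates (not taken from the thesis data).
frequencies = [5, 12, 3, 40, 8, 25, 1, 17]   # number of transactions
units_sold = [10, 30, 4, 120, 15, 60, 2, 45]  # total units sold

labels, q3 = label_best_sellers(frequencies, units_sold)
print("Q3 threshold:", round(q3, 4))
print("labels:", labels)

# Precision and recall reported in the abstract for the tuned Random Forest.
tuned_f1 = f1_score(0.9259, 0.9839)
print("tuned F1:", round(tuned_f1, 4))
```

With a Q3 threshold, roughly the top quarter of products receive the positive label, so the two classes are imbalanced by construction; this is one reason to read precision, recall, and F1-score alongside accuracy.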

Files in This Item:

File	Description	Size	Format
cover_G64180112_cdba6916dbfe4845acb772a4250a0920.pdf	Cover	1.58 MB	Adobe PDF
fulltext_G64180112_354f72abbf25472fa369d0f594fe7594.pdf (Restricted Access)	Fulltext	6.09 MB	Adobe PDF
