Please use this identifier to cite or link to this item: http://repository.ipb.ac.id/handle/123456789/163361
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Suharjo, Budi | -
dc.contributor.advisor | Ruhiyat | -
dc.contributor.author | Alya, Hana | -
dc.date.accessioned | 2025-07-01T02:32:18Z | -
dc.date.available | 2025-07-01T02:32:18Z | -
dc.date.issued | 2025 | -
dc.identifier.uri | http://repository.ipb.ac.id/handle/123456789/163361 | -
dc.description.abstract | Penelitian ini membandingkan regresi logistik dan random forest dalam memprediksi churn pada pemegang polis asuransi mobil dengan teknik undersampling proporsi 30%, 50%, dan 70%. Model random forest setelah undersampling 30% menunjukkan akurasi (99,07%) dan AUC tertinggi (99,91%), sementara sensitivitas tertinggi (98,85%) pada model random forest sebelum undersampling. F1-score tertinggi (99,29%) diperoleh dari model random forest setelah undersampling 70%. Variabel paling penting yang memengaruhi keputusan pemegang polis untuk melakukan churn dalam model regresi logistik berdasarkan nilai odds ratio adalah status pekerjaan (pensiun), jumlah pembayaran bulanan pemegang polis asuransi (kategori 500-600), dan tingkat pendidikan tertinggi pemegang polis asuransi (doktor). Pada model random forest, variabel paling penting berdasarkan nilai mean decrease gini adalah tipe penawaran perpanjangan polis, status pekerjaan, dan tingkat pendidikan terakhir pemegang polis asuransi. Berdasarkan analisis nilai SHAP, variabel seperti status pekerjaan (pensiun), tipe penawaran perpanjangan polis (tipe 4), dan jumlah pembayaran per bulan (500-600) secara konsisten meningkatkan peluang churn. | -
dc.description.abstract | This study compares logistic regression and random forest models for predicting churn among car insurance policyholders, using undersampling at proportions of 30%, 50%, and 70%. The random forest model with 30% undersampling achieved the highest accuracy (99.07%) and AUC (99.91%), while the highest sensitivity (98.85%) was observed in the random forest model before undersampling. The highest F1-score (99.29%) was obtained from the random forest model with 70% undersampling. In the logistic regression model, the most influential variables affecting policyholders' churn decisions, based on odds ratios, were employment status (retired), monthly insurance payment amount (category 500–600), and highest level of education (doctoral degree). In the random forest model, the most important variables, based on mean decrease in Gini, were type of policy renewal offer, employment status, and highest level of education. In the SHAP value analysis, employment status (retired), policy renewal offer type (type 4), and monthly payment amount (500–600) consistently increased the likelihood of churn. | -
dc.description.sponsorship | null | -
dc.language.iso | id | -
dc.publisher | IPB University | id
dc.title | Perbandingan Metode Regresi Logistik dan Random Forest dalam Memprediksi Churn pada Pemegang Polis Asuransi Mobil | id
dc.title.alternative | Comparison of Logistic Regression and Random Forest Methods in Predicting Churn Among Auto Insurance Policyholders | -
dc.type | Skripsi | -
dc.subject.keyword | random forest | id
dc.subject.keyword | regresi logistik | id
dc.subject.keyword | undersampling | id
dc.subject.keyword | data churn | id
dc.subject.keyword | metode supervised learning | id
Appears in Collections: UT - Actuaria

Files in This Item:
File | Description | Size | Format
cover_G5402211069_af26211011684ff692fd32834fe05acf.pdf | Cover | 826.84 kB | Adobe PDF
fulltext_G5402211069_24ab0d6696604d049e12d522d94ae9ac.pdf (Restricted Access) | Fulltext | 5.46 MB | Adobe PDF
lampiran_G5402211069_adaefcd403cf43de9c4051670477a7f3.pdf (Restricted Access) | Appendix | 2.45 MB | Adobe PDF

