Please use this identifier to cite or link to this item:
http://repository.ipb.ac.id/handle/123456789/161542
Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | Saefuddin, Asep | |
| dc.contributor.advisor | Sumertajaya, I Made | |
| dc.contributor.advisor | Soleh, Agus Mohamad | |
| dc.contributor.advisor | Domiri, Dede Dirgahayu | |
| dc.contributor.author | Muradi, Hengki | |
| dc.date.accessioned | 2025-04-18T22:46:00Z | |
| dc.date.available | 2025-04-18T22:46:00Z | |
| dc.date.issued | 2025 | |
| dc.identifier.uri | http://repository.ipb.ac.id/handle/123456789/161542 | |
| dc.description.abstract | Support Vector Machine (SVM) adalah metode pembelajaran mesin yang dapat digunakan untuk klasifikasi dan regresi. Metode ini dikenal mampu menangani hubungan nonlinier antar peubah, robust terhadap multikolinieritas dan autokorelasi, serta bebas overfitting. Penelitian ini memiliki tiga tujuan utama: (1) mengevaluasi kinerja SVM dalam kasus regresi dengan peubah respon kontinu yang disebut dengan Support Vector Regression (SVR), (2) mengevaluasi kinerja SVM dalam kasus klasifikasi dengan peubah respon multinomial, dan (3) mengembangkan serta mengevaluasi metode Generalized Mixed Effect Support Vector Machine (GMESVM). Kajian pertama bertujuan untuk menilai linieritas hubungan antara peubah prediktor dan peubah respon kontinu, serta menguji kemampuan SVR dalam menangani hubungan nonlinier. Data yang digunakan berupa data citra Sentinel-1 yang meliputi polarisasi vertical-vertical (VV) dan vertical-horizontal (VH) dan indeks polarisasi, yaitu ratio polarization index (RPI), normalized difference polarization index (NDPI), dan average polarization index (API), serta umur padi pada blok sawah di PT. Sang Hyang Seri (SHS) Subang pada musim tanam pertama tahun 2022. Data dibagi secara acak menjadi data training dan data testing dengan komposisi 70:30 dan diulang sebanyak 10 kali. Metode SVR dievaluasi dengan beberapa tipe kernel yaitu linier, polinomial, dan radial basis function (RBF) dan dibandingkan dengan metode regresi linier (LM). Model umur padi dari metode SVR lebih baik dibandingkan dengan model umur padi dari metode LM, di mana metode SVR menghasilkan model umur padi terbaik pada kondisi 4 (empat) prediktor yaitu VH, RPI, NDPI, dan API dengan rata-rata RMSE sebesar 11,03 dan rata-rata koefisien determinasi (adjusted) sebesar 89,69%. Hasil penelitian menunjukkan bahwa SVR dengan kernel RBF memberikan performa terbaik dalam menangani pola hubungan nonlinier antara peubah prediktor dengan peubah respon. 
Kajian kedua bertujuan untuk mengevaluasi metode klasifikasi multiklas SVM One-versus-One (OvO) dan Generalized SVM (GenSVM) dengan berbagai tipe kernel dan parameter. Data simulasi merupakan data glass yang diperoleh dari package kknn pada R, sedangkan data empirik menggunakan data citra Sentinel-1 dan fase padi. Akurasi SVM OvO mencapai 100% ketika Cost = 214 dan akurasi model juga meningkat seiring kenaikan gamma dan mencapai optimal saat gamma = 24. Kinerja GenSVM mencapai optimal saat parameter kappa 2,5 atau 4,0, dan semakin optimal ketika parameter lambda semakin kecil, mencapai puncaknya pada λ = 2^(-16). Selain itu, GenSVM juga menunjukkan kinerja terbaik saat parameter p disetel ke 1,7. Dengan demikian, kinerja SVM OvO dan GenSVM sangat bergantung pada pengaturan parameter, namun terdapat risiko overfitting ketika parameter diatur terlalu ekstrem sehingga diperlukan tuning parameter. Pada studi empiris, metode SVM OvO dan GenSVM dievaluasi dan dibandingkan dengan metode multinomial logistic regression (MLR). Hasil menunjukkan bahwa SVM OvO memberikan akurasi tertinggi pada skenario empat kelas dengan akurasi 79,20 ± 0,21%. Prediksi model menunjukkan kesalahan terutama pada fase air dan bera, akibat jumlah contoh yang tidak seimbang dan faktor-faktor eksternal. Kajian ketiga merupakan kajian utama dalam penelitian ini. Dalam kajian ini dikembangkan metode GMESVM untuk klasifikasi multinomial dengan menggabungkan SVM OvO dan Generalized Linear Mixed Model (GLMM) melalui pendekatan combination forecasting dengan pembobotan mean aritmatik dan metode variance-covariance (vaco). Data simulasi dibangkitkan dengan mempertimbangkan faktor jumlah kelas peubah respon, jumlah peubah prediktor, ukuran contoh, dan level korelasi antar peubah prediktor. 
Jumlah kelas peubah respon ditetapkan sebanyak 6 kelas, peubah prediktor terdiri dari 3 (tiga) peubah efek tetap dan 1 (satu) peubah efek acak, ukuran contoh terdiri dari n = 100, n = 400, dan n = 1000, dan level korelasi diatur dengan ketentuan: Lemah (L) yang berarti korelasi kurang dari 0,4; Moderate (M) yang berarti korelasi 0,41-0,60; Kuat (K) yang berarti korelasi 0,61-0,80; dan Sangat Kuat (SK) yang berarti korelasi 0,81-1,00. Skenario level korelasi disusun menjadi 8 (delapan) skenario yaitu: L,L,L; L,L,M; L,L,K; L,L,SK; M,M,M; M,M,K; K,K,K; dan K,K,SK. Skenario L,L,M bermakna $r_{x_1,x_2} = L$, $r_{x_1,x_3} = L$, dan $r_{x_2,x_3} = M$. Hasil simulasi menunjukkan bahwa peningkatan ukuran contoh berpengaruh positif terhadap kinerja metode SVM, di mana pada contoh n = 100, kinerja optimal tercapai pada skenario korelasi L,L,L hingga L,L,SK. Untuk ukuran contoh n = 400, SVM dapat berfungsi optimal di hampir semua skenario kecuali K,K,K, sedangkan pada n = 1000, kinerja terbaik ditemukan pada L,L,L dan K,K,K, meskipun menurun pada K,K,SK; model SVM tetap konsisten tanpa masalah overfitting. Sebaliknya, GLMM pada n = 100 tidak mencapai kinerja optimal, tetapi saat ukuran contoh meningkat menjadi n = 400, akurasi GLMM meningkat signifikan, mencapai di atas 90%, dengan kinerja terbaik pada skenario L,L,SK, meski terdeteksi masalah overfitting. Ketika ukuran contoh ditingkatkan lagi hingga n = 1000, GLMM menunjukkan kinerja yang sangat baik dan konsisten, serta mengurangi risiko overfitting. Hasil simulasi metode GMESVM menunjukkan bahwa metode ini lebih efektif ketika dibangun dari kombinasi prediksi peluang yang diperoleh dari SVM dan GLMM, serta dipengaruhi oleh ukuran contoh dan level korelasi peubah prediktor. Pada ukuran contoh 100, GMESVM belum optimal tetapi mampu memperbaiki kesalahan prediksi SVM dan GLMM, dengan akurasi lebih dari 90% pada data training dan lebih dari 83% pada data testing. 
Ketika ukuran contoh meningkat menjadi 400 dan 1000, kinerja GMESVM meningkat signifikan, konsisten di atas 90% pada semua skenario, dan lebih baik dibandingkan dengan SVM dan GLMM karena dapat mengakomodasi hubungan non-linear dan efek lainnya. Model GMESVM terbaik diperoleh melalui pemilihan metode kombinasi yang optimal dengan cara ensemble. Pada studi empiris, GMESVM meningkatkan akurasi model fase pertumbuhan padi, dengan akurasi tertinggi pada GMESVM Mean Aritmatik rata-rata 80,95%, lebih baik dari SVM (80,38%) dan GLMM (79,01%). Kendala yang dihadapi adalah masih adanya kesalahan klasifikasi hingga 20%, terutama karena model belum mengakomodasi tren indeks polarisasi secara penuh. Kombinasi SVM dan GLMM dengan pendekatan kombinasi memberikan akurasi model yang lebih tinggi dan sangat potensial untuk aplikasi klasifikasi. Namun, penelitian lanjutan masih diperlukan, terutama untuk mengintegrasikan efek waktu dan spasial dalam model GMESVM guna meningkatkan kinerjanya. | |
| dc.description.abstract | Support Vector Machine (SVM) is a machine learning method that can be used for classification and regression. It is known to handle nonlinear relationships between variables, to be robust to multicollinearity and autocorrelation, and to be free of overfitting. This study has three main objectives: (1) to evaluate the performance of SVM for regression with a continuous response variable, known as Support Vector Regression (SVR); (2) to evaluate the performance of SVM for classification with a multinomial response variable; and (3) to develop and evaluate the Generalized Mixed Effect Support Vector Machine (GMESVM) method. The first study assesses the linearity of the relationship between the predictor variables and a continuous response variable, and tests the ability of SVR to handle nonlinear relationships. The data are Sentinel-1 image data comprising the vertical-vertical (VV) and vertical-horizontal (VH) polarizations and three polarization indices, namely the ratio polarization index (RPI), normalized difference polarization index (NDPI), and average polarization index (API), together with the age of the rice crop in paddy blocks at PT Sang Hyang Seri (SHS) Subang in the first planting season of 2022. The data were randomly divided into training and testing sets in a 70:30 ratio, and the split was repeated 10 times. SVR was evaluated with linear, polynomial, and radial basis function (RBF) kernels and compared with linear regression (LM). The rice age model from SVR was better than the one from LM; SVR produced its best rice age model with 4 (four) predictors (VH, RPI, NDPI, and API), with an average RMSE of 11.03 and an average adjusted coefficient of determination of 89.69%. The results showed that SVR with the RBF kernel performed best in handling nonlinear relationships between the predictor variables and the response variable. 
The second study evaluates the multiclass classification methods SVM One-versus-One (OvO) and Generalized SVM (GenSVM) with various kernel types and parameters. The simulation data are the glass data obtained from the R package kknn, while the empirical data are Sentinel-1 image data and rice growth phases. The accuracy of SVM OvO reached 100% when Cost = 214; accuracy also increased with gamma and was optimal at gamma = 24. GenSVM performed best when the kappa parameter was 2.5 or 4.0 and improved as the lambda parameter decreased, peaking at λ = 2^(-16). GenSVM also performed best when the p parameter was set to 1.7. The performance of SVM OvO and GenSVM therefore depends strongly on the parameter settings, but there is a risk of overfitting when the parameters are set too extreme, so parameter tuning is required. In the empirical study, SVM OvO and GenSVM were evaluated and compared with multinomial logistic regression (MLR). SVM OvO gave the highest accuracy in the four-class scenario, at 79.20 ± 0.21%. Model predictions were wrong mainly in the water and fallow phases, owing to unbalanced sample sizes and external factors. The third study is the main study of this research. It develops the GMESVM method for multinomial classification by combining SVM OvO and the Generalized Linear Mixed Model (GLMM) through a combination forecasting approach, weighting by the arithmetic mean, the geometric mean, and the variance-covariance (vaco) method. Simulation data were generated by considering the number of response variable classes, the number of predictor variables, the sample size, and the level of correlation between predictor variables. 
The number of response variable classes was set at 6; the predictor variables consisted of 3 (three) fixed-effect variables and 1 (one) random-effect variable; the sample sizes were n = 100, n = 400, and n = 1000; and the correlation levels were defined as follows: Weak (L), a correlation of less than 0.4; Moderate (M), a correlation of 0.41-0.60; Strong (K), a correlation of 0.61-0.80; and Very Strong (SK), a correlation of 0.81-1.00. The correlation levels were arranged into 8 (eight) scenarios: L,L,L; L,L,M; L,L,K; L,L,SK; M,M,M; M,M,K; K,K,K; and K,K,SK. Scenario L,L,M means $r_{x_1,x_2} = L$, $r_{x_1,x_3} = L$, and $r_{x_2,x_3} = M$. The simulation results show that increasing the sample size has a positive effect on the performance of SVM. At n = 100, optimal performance is achieved in the correlation scenarios L,L,L to L,L,SK. At n = 400, SVM performs optimally in almost all scenarios except K,K,K, while at n = 1000 the best performance is found in L,L,L and K,K,K, although it decreases in K,K,SK; the SVM model remains consistent without overfitting problems. In contrast, GLMM did not achieve optimal performance at n = 100, but when the sample size increased to n = 400 its accuracy improved significantly, reaching above 90%, with the best performance in the L,L,SK scenario, although an overfitting problem was detected. When the sample size was further increased to n = 1000, GLMM showed excellent and consistent performance with a reduced risk of overfitting. The simulation results for GMESVM show that the method is more effective when built from a combination of the probability predictions obtained from SVM and GLMM, and that it is influenced by the sample size and the correlation level of the predictor variables. 
At a sample size of 100, GMESVM is not yet optimal but is able to correct the prediction errors of SVM and GLMM, with an accuracy of more than 90% on training data and more than 83% on testing data. When the sample size increases to 400 and 1000, the performance of GMESVM improves significantly, staying consistently above 90% in all scenarios, and is better than that of SVM and GLMM because it can accommodate nonlinear relationships and other effects. The best GMESVM model is obtained by selecting the optimal combination method in an ensemble manner. In the empirical study, GMESVM improved the accuracy of the rice growth phase model; the highest accuracy came from GMESVM with the arithmetic mean, averaging 80.95%, better than SVM (80.38%) and GLMM (79.01%). The remaining obstacle is a misclassification rate of up to 20%, mainly because the model does not yet fully accommodate the trend of the polarization indices. Combining SVM and GLMM through the combination approach yields higher model accuracy and has great potential for classification applications. However, further research is still needed, especially to integrate temporal and spatial effects into the GMESVM model to improve its performance. | |
| dc.description.sponsorship | Kementerian Pendidikan dan Kebudayaan RI, LPDP dan Badan Riset dan Inovasi Nasional | |
| dc.language.iso | id | |
| dc.publisher | IPB University | id |
| dc.title | Generalized Mixed Effect Support Vector Machine (GMESVM) untuk Model Multinomial: Pengembangan dan Aplikasinya. | id |
| dc.title.alternative | Generalized Mixed Effect Support Vector Machine (GMESVM) Method for Multinomial Models: Development and Application | |
| dc.type | Dissertation | |
| dc.subject.keyword | SVM | id |
| dc.subject.keyword | Peramalan | id |
| dc.subject.keyword | glmm | id |
| dc.subject.keyword | GMESVM | id |
| dc.subject.keyword | multinomial | id |
| dc.subject.keyword | Kombinasi | id |
| Appears in Collections: | DT - School of Data Science, Mathematic and Informatics | |
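
The combination step described in the abstract, where class-probability predictions from SVM and GLMM are merged by arithmetic-mean or variance-covariance (vaco) weighting, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the dissertation: the function names, the toy probabilities, and the choice to estimate the vaco weights from flattened probability errors against one-hot labels using the standard two-forecast minimum-variance formula.

```python
from statistics import mean


def one_hot(label, n_classes):
    """Return the 0/1 indicator vector of a class label."""
    return [1.0 if k == label else 0.0 for k in range(n_classes)]


def flat_errors(probs, labels):
    """Predicted probability minus one-hot truth, flattened over samples and classes."""
    n_classes = len(probs[0])
    return [p - t
            for row, y in zip(probs, labels)
            for p, t in zip(row, one_hot(y, n_classes))]


def vaco_weights(probs_a, probs_b, labels):
    """Two-model variance-covariance weights (illustrative assumption):
    w_a = (var_b - cov) / (var_a + var_b - 2 * cov), clipped to [0, 1]."""
    ea, eb = flat_errors(probs_a, labels), flat_errors(probs_b, labels)
    ma, mb, n = mean(ea), mean(eb), len(ea)
    var_a = sum((x - ma) ** 2 for x in ea) / (n - 1)
    var_b = sum((y - mb) ** 2 for y in eb) / (n - 1)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ea, eb)) / (n - 1)
    denom = var_a + var_b - 2 * cov
    if denom <= 0:  # identical error patterns: fall back to equal weights
        return 0.5, 0.5
    w_a = min(1.0, max(0.0, (var_b - cov) / denom))
    return w_a, 1.0 - w_a


def combine(probs_a, probs_b, w_a=0.5):
    """Weighted average of two probability matrices; w_a = 0.5 is the arithmetic mean."""
    w_b = 1.0 - w_a
    return [[w_a * pa + w_b * pb for pa, pb in zip(ra, rb)]
            for ra, rb in zip(probs_a, probs_b)]


def predict(probs):
    """Pick the class with the highest combined probability for each sample."""
    return [max(range(len(row)), key=row.__getitem__) for row in probs]


# Toy class probabilities for 4 samples and 3 classes (hypothetical values).
svm_probs = [[0.7, 0.2, 0.1], [0.4, 0.5, 0.1], [0.2, 0.3, 0.5], [0.5, 0.4, 0.1]]
glmm_probs = [[0.6, 0.3, 0.1], [0.2, 0.6, 0.2], [0.3, 0.4, 0.3], [0.3, 0.6, 0.1]]
labels = [0, 1, 2, 1]

mean_pred = predict(combine(svm_probs, glmm_probs))            # arithmetic mean
w_svm, w_glmm = vaco_weights(svm_probs, glmm_probs, labels)    # vaco weights
vaco_pred = predict(combine(svm_probs, glmm_probs, w_svm))
```

In this toy data each single model misclassifies one sample, while the equal-weight combination classifies all four correctly. In practice the vaco weights would presumably be estimated on training data and then applied to held-out predictions.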
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| cover_G161180031_ac038998628747ee8d60a9631ef70932.pdf | Cover | 5.41 MB | Adobe PDF | View/Open |
| fulltext_G161180031_dbc00fcf5cd34901ba69f9189fa79d57.pdf Restricted Access | Fulltext | 6.75 MB | Adobe PDF | View/Open |
| lampiran_G161180031_fe203db93fe643d7a1737840b5b2450c.pdf Restricted Access | Appendix | 2.13 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.