A Performance Study of the Adaptive Learning Rate Momentum Models and SGD-M on CIFAR-10 Image Classification Using ResNet-20
Abstract
Training deep neural networks at scale requires substantial computational resources, particularly for tuning the learning rate. Stochastic Gradient Descent with Momentum (SGD-M) is a commonly used optimization method, but its performance depends heavily on choosing an appropriate learning rate. Adaptive approaches such as Momentum Models (MoMo) adjust the learning rate automatically, reducing the need for hyperparameter tuning. This study compares the performance of SGD-M and MoMo on CIFAR-10 classification using the ResNet-20 architecture. The results show that MoMo achieves a lower training loss (0.1137) and a higher validation accuracy (89.57%) at α₀ = 1, while SGD-M performs best at α₀ = 0.1, with a training loss of 0.1912 and a validation accuracy of 87.41%. MoMo also extends the range of effective learning rates, remaining stable even at a high learning rate of α₀ = 10, whereas SGD-M suffers performance degradation. Additionally, MoMo reduces training loss more quickly and reaches higher validation accuracy earlier in training, indicating more effective learning rate adaptation.
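The learning rate sensitivity of SGD-M described in the abstract can be illustrated with a minimal sketch. It is not the study's experiment: it assumes a one-dimensional toy objective f(x) = 0.5·x², a momentum coefficient of 0.9, and a PyTorch-style momentum buffer, and the names `sgd_momentum` and `quadratic_grad` are hypothetical. On this toy problem the iteration is stable only for learning rates below 2(1 + β) = 3.8, so the three learning rates from the abstract behave differently here for reasons specific to the quadratic, not to ResNet-20.

```python
def quadratic_grad(x):
    """Gradient of the toy objective f(x) = 0.5 * x**2 (an assumption, not CIFAR-10)."""
    return x


def sgd_momentum(grad, x0, lr, beta=0.9, steps=100):
    """Run SGD with momentum (heavy-ball form) and return the final iterate."""
    x, v = x0, 0.0
    for _ in range(steps):
        v = beta * v + grad(x)  # accumulate the momentum buffer
        x = x - lr * v          # step with a fixed learning rate
    return x


if __name__ == "__main__":
    for lr in (0.1, 1.0, 10.0):
        final = sgd_momentum(quadratic_grad, x0=1.0, lr=lr)
        print(f"lr={lr}: |x| after 100 steps = {abs(final):.3e}")
```

With a fixed learning rate, the first two settings shrink the iterate toward the minimizer at 0 while the third diverges. An adaptive method such as MoMo rescales the step during training, which is the property the abstract credits for its wider usable range.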
Collections
- UT - Mathematics [89]
