Perbandingan Metode Distributed Lag, Autoencoder, dan LSTM Autoencoder dalam Mendeteksi Anomali pada Indeks Kualitas Udara Jakarta

Pangestika, Adelia Putri

Please use this identifier to cite or link to this item: http://repository.ipb.ac.id/handle/123456789/171362

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Angraini, Yenni	-
dc.contributor.advisor	Sumertajaya, I Made	-
dc.contributor.author	Pangestika, Adelia Putri	-
dc.date.accessioned	2025-10-21T23:23:29Z	-
dc.date.available	2025-10-21T23:23:29Z	-
dc.date.issued	2025	-
dc.identifier.uri	http://repository.ipb.ac.id/handle/123456789/171362	-
dc.description.abstract	Anomali merupakan amatan yang sangat menyimpang dibandingkan amatan lainnya. Anomali dapat dipandang sebagai amatan tidak diinginkan yang harus dihilangkan, tetapi juga dapat dipandang sebagai amatan menarik yang penting untuk dideteksi. Pendeteksian anomali dapat dilakukan dengan pendekatan data deret waktu menggunakan metode statistika seperti distributed lag dan metode deep learning seperti autoencoder dan LSTM autoencoder. Metode statistika memiliki kelebihan dalam hal basis model yang jelas dan kompleksitas komputasi yang baik. Sementara itu, metode deep learning memiliki kelebihan dalam hal mampu menangani pola nonlinear pada data berdimensi tinggi, mampu melakukan prediksi tanpa harus menetapkan model parametrik yang spesifik, dan mampu beradaptasi dengan lebih baik dibandingkan metode statistika. Pendeteksian anomali penting untuk dilakukan pada berbagai aspek kehidupan, salah satunya pada indeks kualitas udara Jakarta. Anomali pada indeks kualitas udara Jakarta penting untuk dideteksi sebagai suatu acuan untuk merumuskan kebijakan sekaligus peringatan dini akan bencana pencemaran udara yang mungkin terjadi di masa yang akan datang. Data indeks kualitas udara Jakarta merupakan data deret waktu sehingga pendeteksian anomali dapat dilakukan dengan pendekatan metode statistika seperti distributed lag dan metode deep learning seperti autoencoder dan LSTM autoencoder. Metode autoencoder dan LSTM autoencoder terbukti efektif dalam mendeteksi anomali, tetapi perbandingan performa keduanya dalam satu kajian masih jarang dilakukan. Performa kedua metode ini juga belum pernah dibandingkan dengan metode statistika yaitu distributed lag. Baik metode autoencoder, LSTM autoencoder, maupun distributed lag, ketiganya belum banyak digunakan untuk mendeteksi anomali pada data indeks kualitas udara, khususnya Jakarta. 
Selain itu, performa pendeteksian anomali dari ketiga metode ini juga belum banyak dikaji dengan memperhatikan pengaruh berbagai karakteristik anomali seperti persentase dan kedalaman anomali. Penelitian ini bertujuan untuk mengevaluasi performa metode distributed lag, autoencoder, dan LSTM autoencoder dalam mendeteksi anomali pada data indeks kualitas udara Jakarta. Selain itu, penelitian ini juga bertujuan untuk mengevaluasi pengaruh persentase dan posisi anomali terhadap performa pendeteksian anomali yang dihasilkan oleh ketiga metode. Proses pendeteksian anomali pada ketiga metode dilakukan berdasarkan hasil peramalan menggunakan aturan 4s. Pada aturan 4s, amatan akan teridentifikasi sebagai anomali apabila nilai error peramalannya lebih dari batas atas atau kurang dari batas bawah yang telah ditentukan. Hasil pendeteksian anomali menggunakan ketiga metode menunjukkan, metode distributed lag memiliki performa pendeteksian anomali yang paling unggul dengan karakteristik pendeteksian anomali yang sensitif. Pendeteksian anomali menggunakan metode ini cenderung menghasilkan nilai false positive yang tinggi, false negative yang rendah, dan balanced accuracy yang tinggi. Sebaliknya, metode autoencoder memiliki performa pendeteksian anomali yang paling buruk dengan karakteristik pendeteksian anomali yang selektif. Pendeteksian anomali menggunakan metode ini cenderung menghasilkan nilai false positive yang rendah, false negative yang tinggi, dan balanced accuracy yang rendah. Sementara itu, metode LSTM autoencoder memiliki performa pendeteksian anomali yang lebih baik dibandingkan metode autoencoder tetapi tidak lebih baik dibandingkan metode distributed lag. Pendeteksian anomali menggunakan metode ini cenderung menghasilkan nilai false positive yang tidak terlalu tinggi, false negative yang rendah, dan balanced accuracy yang cukup tinggi. Perbedaan performa pendeteksian anomali pada ketiga metode juga dipengaruhi oleh perbedaan persentase dan kedalaman anomali. 
Semakin tinggi persentase anomali maka semakin buruk performa pendeteksian anomali yang dihasilkan. Jika memperhatikan performa pendeteksian anomali pada setiap persentase di setiap metode, metode distributed lag memiliki performa pendeteksian anomali yang paling unggul dan paling robust terhadap persentase anomali. Sebaliknya, metode autoencoder menjadi metode dengan performa pendeteksian anomali yang paling buruk, sementara metode LSTM autoencoder menjadi metode dengan performa pendeteksian anomali yang paling tidak robust terhadap persentase anomali. Berbanding terbalik dengan persentase anomali, semakin tinggi kedalaman anomali maka semakin baik performa pendeteksian anomali yang dihasilkan. Metode distributed lag menjadi metode dengan performa pendeteksian anomali yang paling unggul dan paling robust terhadap kedalaman anomali. Sebaliknya, metode autoencoder merupakan metode dengan performa pendeteksian anomali yang paling buruk dan paling tidak robust terhadap kedalaman anomali. Metode LSTM autoencoder kembali menjadi metode dengan performa yang lebih unggul dan lebih robust dibandingkan metode autoencoder tetapi tidak dapat mengungguli performa metode distributed lag.	-
dc.description.abstract	An anomaly is an observation that deviates markedly from the other observations. Anomalies can be viewed as undesirable observations that must be removed, but they can also be viewed as interesting observations that are important to detect. Anomaly detection can be carried out on time series data using statistical methods such as distributed lag and deep learning methods such as the autoencoder and the LSTM autoencoder. Statistical methods have the advantage of a clear model basis and favorable computational complexity. Deep learning methods, in turn, can handle nonlinear patterns in high-dimensional data, make predictions without specifying a particular parametric model, and adapt better than statistical methods. Anomaly detection matters in many areas of life, one of which is the Jakarta air quality index. Anomalies in the Jakarta air quality index are important to detect, both as a reference for formulating policy and as an early warning of air pollution disasters that may occur in the future. Because the Jakarta air quality index data form a time series, anomaly detection can be performed with statistical methods such as distributed lag and deep learning methods such as the autoencoder and the LSTM autoencoder. The autoencoder and LSTM autoencoder methods have proven effective at detecting anomalies, but their performance has rarely been compared within a single study. The performance of these two methods has also never been compared with that of a statistical method, namely distributed lag. None of the three methods (autoencoder, LSTM autoencoder, or distributed lag) has been widely used to detect anomalies in air quality index data, particularly for Jakarta. Furthermore, the anomaly detection performance of the three methods has not been widely studied with attention to the influence of anomaly characteristics such as the percentage and depth of anomalies. 
This study aims to evaluate the performance of the distributed lag, autoencoder, and LSTM autoencoder methods in detecting anomalies in Jakarta's air quality index data. It also aims to evaluate the effect of the percentage and position of anomalies on the anomaly detection performance of the three methods. For all three methods, anomalies are detected from the forecasting results using the 4s rule. Under the 4s rule, an observation is identified as an anomaly if its forecast error is greater than a predetermined upper limit or less than a predetermined lower limit. The results show that the distributed lag method has the best anomaly detection performance, with a sensitive detection characteristic. Anomaly detection using this method tends to produce high false positives, low false negatives, and high balanced accuracy. Conversely, the autoencoder method has the worst anomaly detection performance, with a selective detection characteristic. Anomaly detection using this method tends to produce low false positives, high false negatives, and low balanced accuracy. The LSTM autoencoder method performs better than the autoencoder method but not better than the distributed lag method. Anomaly detection using this method tends to produce moderate false positives, low false negatives, and fairly high balanced accuracy. The differences in anomaly detection performance among the three methods are also influenced by differences in the percentage and depth of anomalies. The higher the percentage of anomalies, the worse the anomaly detection performance. Looking at the performance at each percentage for each method, the distributed lag method has the best anomaly detection performance and is the most robust to the percentage of anomalies. The autoencoder method, by contrast, has the worst anomaly detection performance, while the LSTM autoencoder method is the least robust to the percentage of anomalies. 
In contrast to the percentage of anomalies, the higher the anomaly depth, the better the anomaly detection performance. The distributed lag method has the best anomaly detection performance and is the most robust to anomaly depth. The autoencoder method, by contrast, has the worst anomaly detection performance and is the least robust to anomaly depth. The LSTM autoencoder method again performs better and more robustly than the autoencoder method but cannot outperform the distributed lag method.	-
dc.description.sponsorship	Kementerian Pendidikan Tinggi, Sains, dan Teknologi Republik Indonesia	-
dc.language.iso	id	-
dc.publisher	IPB University	id
dc.title	Perbandingan Metode Distributed Lag, Autoencoder, dan LSTM Autoencoder dalam Mendeteksi Anomali pada Indeks Kualitas Udara Jakarta	id
dc.title.alternative	null	-
dc.type	Tesis	-
dc.subject.keyword	autoencoder	id
dc.subject.keyword	deteksi anomali	id
dc.subject.keyword	distributed lag	id
dc.subject.keyword	indeks kualitas udara	id
dc.subject.keyword	LSTM autoencoder	id
Appears in Collections:	MT - School of Data Science, Mathematic and Informatics
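
The abstract describes flagging an observation when its forecast error falls outside an upper or lower limit (the 4s rule) and judging the detectors by balanced accuracy. The following is a minimal sketch of that idea, not the thesis implementation. It assumes the limits are the mean of a reference set of forecast errors plus or minus 4 sample standard deviations; the abstract does not state how the limits are computed. The function names `four_s_limits`, `flag_anomalies`, and `balanced_accuracy` are hypothetical.

```python
from statistics import mean, stdev


def four_s_limits(reference_errors, k=4.0):
    """Return (lower, upper) limits for forecast errors.

    Assumption: limits are mean -/+ k * sample standard deviation of a
    reference set of forecast errors. The abstract only says the limits
    are predetermined.
    """
    centre = mean(reference_errors)
    spread = stdev(reference_errors)
    return centre - k * spread, centre + k * spread


def flag_anomalies(actual, forecast, limits):
    """Flag each observation whose forecast error lies outside the limits."""
    lower, upper = limits
    errors = [a - f for a, f in zip(actual, forecast)]
    return [e > upper or e < lower for e in errors]


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for boolean anomaly labels."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    tn = sum((not t) and (not p) for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both anomalous and normal observations are required")
    sensitivity = tp / (tp + fn)  # share of true anomalies that were flagged
    specificity = tn / (tn + fp)  # share of normal points left unflagged
    return (sensitivity + specificity) / 2


# Illustrative numbers only, not data from the thesis.
limits = four_s_limits([-1, 0, 1, 0, -1, 1, 0, 0])
flags = flag_anomalies([10, 11, 30, 9, 2], [10, 10, 10, 10, 10], limits)
```

Balanced accuracy is used here because anomalies are rare: a detector that flags nothing scores well on plain accuracy but only 0.5 on balanced accuracy. A sensitive detector in the abstract's sense gives up some specificity for higher sensitivity, and a selective detector does the opposite.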

Files in This Item:

File	Description	Size	Format
cover_G1501231090_e3282c4e032c4528bdc9baec11b467fc.pdf	Cover	584.52 kB	Adobe PDF
fulltext_G1501231090_3d1b4ea7bc2746278e1345e700333e1c.pdf Restricted Access	Fulltext	3.14 MB	Adobe PDF
lampiran_G1501231090_c9c64668c1d54eabbba58545a7c83b1f.pdf Restricted Access	Lampiran	283.48 kB	Adobe PDF
