Please use this identifier to cite or link to this item:
http://repository.ipb.ac.id/handle/123456789/171021
| Title: | Missing Data Imputation For LSTM-Based Early Warning System |
| Other Titles: | |
| Authors: | Sartono, Bagus; Aidi, Muhammad Nur; Khairunnisa, Akmarina |
| Issue Date: | 2025 |
| Publisher: | IPB University |
| Abstract: | Urban flooding, particularly in densely populated regions such as Jakarta, is a critical challenge due to inadequate groundwater recharge and overflow from major rivers such as the Ciliwung. Existing early warning systems such as the Jakarta Flood Early Warning System (J-FEWS) depend heavily on multi-source hydrometeorological data but are hindered by data gaps, particularly in high-frequency river water level recordings. This thesis explores a data-driven alternative that relies solely on historical water level sensor data, emphasizing the reconstruction of missing data and the application of advanced forecasting models. The study investigates the impact of missing data imputation on the accuracy of water level forecasting using the Interpretable Multi-Variable Long Short-Term Memory (IMV-LSTM) model. It includes both simulation and empirical analyses to evaluate various imputation approaches, including univariate methods (Kalman-Structural and Kalman-ARIMA), multivariate-local methods (MICE and kNN), and multivariate-global methods (SVD and PPCA). Ten complete multivariate time-series datasets were used in the simulation. A total of 18 data removal scenarios were applied across these datasets, and the process was repeated five times, resulting in 900 incomplete datasets (10 × 18 × 5) used for testing each imputation method. Simulation results show that univariate methods, particularly Kalman-Structural and Kalman-ARIMA, outperform other methods in handling point missing data, though their performance diminishes with larger missing gaps. For the empirical study, these two leading imputation methods were applied to real-world Ciliwung River water level data containing 5% missing values across four observation points. The goal was to determine which method yields the highest forecasting accuracy when paired with the IMV-LSTM model. The model using Kalman-Structural-imputed data achieved the best performance, reducing RMSE by 32% and MAPE by 50% compared to forecasts using unimputed data, with a final error of just 1.2 cm. The findings confirm that incorporating robust imputation as a pre-forecasting stage significantly enhances forecast accuracy. The IMV-LSTM model, combined with Kalman-based imputation, demonstrates that reliable flood prediction is possible even without external meteorological inputs. This work offers a scalable and interpretable alternative to deployed systems such as J-FEWS, particularly valuable for regions with incomplete or limited data. |
| URI: | http://repository.ipb.ac.id/handle/123456789/171021 |
| Appears in Collections: | MT - School of Data Science, Mathematic and Informatics |
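The simulation design described in the abstract (10 complete datasets, 18 removal scenarios, 5 repetitions, scored by error metrics such as RMSE and MAPE) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's code: linear interpolation stands in for the Kalman-based imputers, the sine-shaped series is synthetic, the 5% removal rate is borrowed from the empirical study only as an example, and every function name here is hypothetical.

```python
import itertools
import math
import random

# Counts taken from the abstract: datasets x removal scenarios x repetitions.
N_DATASETS, N_SCENARIOS, N_REPEATS = 10, 18, 5


def simulation_grid():
    """Enumerate every (dataset, scenario, repeat) combination."""
    return list(itertools.product(range(N_DATASETS),
                                  range(N_SCENARIOS),
                                  range(N_REPEATS)))


def remove_points(series, fraction, rng):
    """Blank out isolated interior points (hypothetical point-missing scenario)."""
    out = list(series)
    k = int(round(fraction * len(series)))
    for i in rng.sample(range(1, len(series) - 1), k):
        out[i] = None
    return out


def interpolate(series):
    """Linear interpolation, a simple stand-in for a Kalman-based imputer.

    Assumes the first and last values are observed.
    """
    out = list(series)
    i = 0
    while i < len(out):
        if out[i] is None:
            j = i
            while out[j] is None:  # find the next observed value
                j += 1
            left, right, span = out[i - 1], out[j], j - (i - 1)
            for k in range(i, j):
                out[k] = left + (right - left) * (k - (i - 1)) / span
            i = j
        else:
            i += 1
    return out


def rmse(actual, predicted):
    """Root mean squared error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error; assumes no actual value is zero."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


if __name__ == "__main__":
    print("incomplete datasets:", len(simulation_grid()))
    rng = random.Random(0)
    # Synthetic water-level-like series in cm (not real Ciliwung data).
    truth = [100.0 + 20.0 * math.sin(t / 10.0) for t in range(200)]
    imputed = interpolate(remove_points(truth, 0.05, rng))
    print("RMSE (cm):", rmse(truth, imputed))
    print("MAPE (%):", mape(truth, imputed))
```

In an actual comparison, `interpolate` would be swapped for each imputation method under test, and the metrics would be aggregated over all combinations returned by `simulation_grid()`.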
Files in This Item:
| File | Description | Size | Format | Access |
|---|---|---|---|---|
| cover_G1501221001_9faac067752a498a800528c4ccbacf08.pdf | Cover | 2.8 MB | Adobe PDF | |
| fulltext_G1501221001_d04632f49d9e4db89f854efa881698e6.pdf | Full text | 3.23 MB | Adobe PDF | Restricted Access |
| lampiran_G1501221001_60c46f930a6b487095480e280b35f507.pdf | Appendix | 4.39 MB | Adobe PDF | Restricted Access |