Comparison of Dynamic Factor Models (DFM) and Long Short-Term Memory (LSTM) Networks in Forecasting Household Consumption
Date
2022
Author
Ra, Az Zahra Amon
Notodiputro, Khairil Anwar
Sartono, Bagus
Abstract
Forecasting predicts future conditions from time series data. In practice, however, the response and explanatory variables are often observed at different frequencies. Two models that can relate response and explanatory variables observed at different frequencies are the Dynamic Factor Model (DFM) and Long Short-Term Memory (LSTM) networks. This study evaluates the DFM and LSTM methods in a simulation study and uses empirical data to forecast household consumption for the following year. In the simulation study, data were generated under two types of relationship between the response variable and the dynamic factors: linear and non-linear (quadratic). The two relationship types were used to expose differences in the performance of LSTM and DFM: LSTM can capture non-linear relationships through its activation functions, whereas DFM is a linear model. The simulation scenarios were defined by period length, factor rotation, and the first-lag coefficients of the VAR(2) equation. The generated data follow a dynamic factor model with eight explanatory variables, which are assumed to be driven by three dynamic factors. Each generated data set was analyzed with both DFM and LSTM to obtain the mean absolute percentage error (MAPE). The simulation results showed that the model and the period length have a significant effect on the forecasting error at the 0.05 significance level, whereas the effects of factor rotation, the first-lag VAR(2) coefficient, and the interactions between treatments were not significant.
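The simulation design described above can be illustrated with a minimal Python sketch. It is not the authors' generator: the diagonal VAR(2) coefficient matrices, the noise scales, the unit response coefficients, the intercept of 10, and the names `simulate_dfm` and `mape` are all assumptions made for illustration. Only the overall structure (three factors following a VAR(2), eight explanatory variables, a linear or quadratic response, MAPE as the error measure) comes from the abstract.

```python
import numpy as np

def simulate_dfm(n_periods=120, n_factors=3, n_vars=8,
                 phi1=0.5, phi2=0.2, quadratic=False, seed=0):
    """Generate factors, explanatory variables, and a response series.

    Assumption: the VAR(2) coefficient matrices are diagonal (phi1 * I and
    phi2 * I); the abstract does not state their actual form.
    """
    rng = np.random.default_rng(seed)
    burn = 50                      # discard start-up periods
    total = n_periods + burn
    a1 = phi1 * np.eye(n_factors)  # first-lag coefficients
    a2 = phi2 * np.eye(n_factors)  # second-lag coefficients
    f = np.zeros((total, n_factors))
    for t in range(2, total):
        f[t] = a1 @ f[t - 1] + a2 @ f[t - 2] + rng.normal(size=n_factors)
    f = f[burn:]

    # Eight observed variables driven by three factors plus noise.
    loadings = rng.normal(size=(n_vars, n_factors))
    x = f @ loadings.T + rng.normal(scale=0.5, size=(n_periods, n_vars))

    # Response: linear or quadratic in the factors (assumed unit coefficients).
    beta = np.ones(n_factors)
    signal = (f ** 2) @ beta if quadratic else f @ beta
    y = 10.0 + signal + rng.normal(scale=0.5, size=n_periods)
    return f, x, y

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return 100.0 * np.mean(np.abs((actual - forecast) / actual))

f_lin, x_lin, y_lin = simulate_dfm(quadratic=False)
f_quad, x_quad, y_quad = simulate_dfm(quadratic=True)
```

MAPE divides by the actual values, so it becomes unstable when the response is close to zero; the intercept in this sketch only reduces that risk and does not remove it.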
Eight indicators related to household consumption were used in the empirical study. Before modelling, the data were divided into training data (January 2011 to December 2019) and testing data (January to November 2020). The empirical study showed that applying factor rotation produces more stable forecast values and that LSTM forecasts better than DFM. Household consumption expenditure is estimated to increase from the fourth quarter of 2020 to the third quarter of 2021.
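The chronological split described above can be sketched in plain Python as follows. The sketch assumes monthly observations, which the stated period boundaries suggest, and the helper name `month_range` is hypothetical.

```python
def month_range(start, end):
    """Return (year, month) tuples from start to end, inclusive."""
    year, month = start
    months = []
    while (year, month) <= end:
        months.append((year, month))
        # Roll over to January of the next year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return months

months = month_range((2011, 1), (2020, 11))
cutoff = (2019, 12)  # last training month

# Split by time rather than at random, so no future data leaks into training.
train = [p for p in months if p <= cutoff]
test = [p for p in months if p > cutoff]

print(len(train), len(test))  # 108 11
```

Under this assumption the training set covers 108 months and the testing set 11 months.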