Small Area Estimation for Lognormal Data with Spatial Dependence among Small Areas
Date
2022-12-16
Author
Handayani, Dian
Notodiputro, Khairil Anwar
Saefuddin, Asep
Mangku, I Wayan
Abstract
The need for accurate statistics is increasing, not only for whole populations but also for subpopulations. Survey data can serve as a source of information for producing these statistics. Unfortunately, most surveys are designed to produce statistics for the whole population. Consequently, the sample available in some subpopulations, especially those not planned for in the survey design, is often too small (or even zero) for direct estimation. Direct estimates based on a very small sample have large standard errors, and when the sample size is zero, direct estimates cannot be obtained at all. A subpopulation whose sample is not large enough to produce reliable direct estimates is known as a small area. The statistical methodology that focuses on estimating parameters of small areas is called Small Area Estimation (SAE). To make more effective use of the small samples in small areas, it is necessary to consider indirect estimation, which ‘borrows strength’ by using information from other sources, such as censuses, surveys, or administrative records. Borrowing strength can also be implemented by exploiting data structure, for example by accounting for possible spatial dependence among small areas. SAE is a model-based approach to parameter estimation. The standard SAE models, which are usually mixed models, are used to estimate linear small area parameters. These models assume that the variable of interest follows a normal distribution and that small areas are uncorrelated.
This dissertation aims to develop the best predictor of both linear and nonlinear small area parameters when the variable of interest follows a positively skewed distribution and the small areas are spatially correlated. A linear small area parameter is a linear function of the population values of the variable of interest in a given small area, whereas a nonlinear small area parameter is a nonlinear function of those values. The population total or mean of the variable of interest in a small area is an example of a linear parameter, whereas a ratio or quantile is a nonlinear parameter. The estimates of both kinds of parameter are derived under the proposed model, a unit-level spatial lognormal mixed model. This model assumes that the variable of interest follows a lognormal distribution and that the small areas are spatially correlated according to a simultaneous autoregressive (SAR) process. Two predictors are proposed for the values of the variable of interest in non-sampled units: the spatial synthetic estimator (SYNT) and the Spatial Empirical Best Predictor (SEBP). The SYNT predicts the non-sampled values by their unconditional expectation, whereas the SEBP predicts them by their conditional expectation. The conditional expectation needed for the SEBP of a linear small area parameter is calculated analytically. For a nonlinear small area parameter, the conditional expectation is approximated by Monte Carlo simulation, because it involves a high-dimensional integral of a complex nonlinear function. The performance of the SEBP for both linear and nonlinear small area parameters is evaluated by simulation studies in terms of average relative bias (ARB) and average relative root mean square error (ARRMSE).
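The contrast between the analytic and the Monte Carlo calculation can be illustrated with a minimal sketch. It is not the dissertation's estimator: it ignores the SAR spatial structure and the mixed-model fitting, and simply assumes that the log of the variable of interest for a non-sampled unit is normal with a given mean `mu` and variance `sigma2` (hypothetical values, as are the function names and the threshold `z`). Under that assumption the expectation of the variable itself has a closed form, while the expectation of a nonlinear function of it can be approximated by averaging over simulated draws.

```python
import numpy as np


def lognormal_mean(mu, sigma2):
    """Closed-form E[y] when log(y) ~ Normal(mu, sigma2)."""
    return np.exp(mu + 0.5 * sigma2)


def mc_expectation(h, mu, sigma2, n_draws=200_000, seed=0):
    """Monte Carlo approximation of E[h(y)] when log(y) ~ Normal(mu, sigma2)."""
    rng = np.random.default_rng(seed)
    # Draw on the log scale, then back-transform to the original scale.
    y = np.exp(rng.normal(mu, np.sqrt(sigma2), size=n_draws))
    return float(np.mean(h(y)))


# Hypothetical log-scale mean and variance for one non-sampled unit.
mu, sigma2 = 1.0, 0.25

# Linear case: the expectation is available analytically.
analytic = lognormal_mean(mu, sigma2)
mc_linear = mc_expectation(lambda y: y, mu, sigma2)

# Nonlinear case: expectation of an indicator that y falls below a
# hypothetical threshold z, approximated by simulation.
z = np.exp(1.0)
mc_nonlinear = mc_expectation(lambda y: (y < z).astype(float), mu, sigma2)

print(analytic, mc_linear, mc_nonlinear)
```

In this sketch, plugging in unconditional moments for `mu` and `sigma2` would correspond to the idea behind a synthetic predictor, and plugging in moments conditional on the sampled data to the idea behind a best predictor; how those moments are obtained under the spatial model is outside its scope.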
The simulation results indicate that the SEBP of a linear small area parameter, compared with the spatial synthetic estimates, direct estimates, the empirical best linear unbiased predictor (EBLUP), and the empirical best predictor (EBP), has a sufficiently small ARB and the smallest ARRMSE under small, medium, and large spatial correlation. Likewise, the simulation studies evaluating the SEBP of a nonlinear small area parameter against the direct estimates and the EBP indicate that the SEBP has good ARB and ARRMSE in all simulation scenarios. The SEBP of a linear small area parameter has been applied to estimate the average monthly household per capita expenditure for each kecamatan (district) in Kabupaten Bogor (Bogor Regency) and Kota Bogor (Bogor Municipality). The estimates are based on data from Survey Sosial Ekonomi Nasional/SUSENAS 2018 (National Socio-Economic Survey 2018) and Potensi Desa/PODES 2018 (Village Potential 2018). The SEBP of a nonlinear small area parameter has been applied to estimate the FGT (Foster Greer Thorbecke) poverty indicators for each district in Bogor Regency and Bogor Municipality. The FGT estimates are calculated using data from the National Socio-Economic Survey 2007 (SUSENAS 2007) and the Village Potential 2008 (PODES 2008).
The need for accurate statistics is currently increasing, not only for a population as a whole but also for parts (subpopulations) of it. Survey data can be used as one source for producing statistics. However, the sample data available from a survey are often very scarce (or even absent) in some subpopulations, especially those that were not planned for at the start of the survey. This is because surveys are generally designed to produce statistics for the population as a whole.
A very small sample size in a subpopulation, if used to produce statistics by direct estimation, yields large variability and makes those statistics unreliable. Moreover, if no sample observations are available in a subpopulation, statistics for that subpopulation cannot be produced at all.
A subpopulation whose sample size is too small to produce reliable direct estimates is called a small area. The statistical method that focuses on estimating parameters in small areas is called Small Area Estimation (SAE). To make more effective use of the small sample sizes in small areas, parameters can be estimated by indirect estimation, which ‘borrows strength’ by using available sources of information, such as censuses and surveys, or by exploiting the structure of the available data, for example by considering possible spatial dependence among areas.
SAE is a method of parameter estimation based on a statistical model (model-based). The standard statistical models in SAE, which are generally mixed models, are used to estimate linear small area parameters. The standard SAE model assumes that the variable of interest is normally distributed and that small areas are mutually independent.
This research aims to develop the best predictor of a small area parameter, whether linear or nonlinear, when the variable of interest is not normally distributed but is skewed to the right (positively skewed) and the small areas are spatially dependent.
A linear small area parameter is a linear function of the values of the variable of interest observed in a small area, whereas a nonlinear small area parameter is a nonlinear function of those values. The population total or mean of a variable of interest in a small area is a linear small area parameter, whereas a ratio or quantile of a variable is a nonlinear parameter.
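The distinction between linear and nonlinear parameters can be made concrete with a small sketch. It computes a linear parameter (the mean) and a nonlinear one for a made-up vector of population values in a single area. The nonlinear example uses the usual form of the FGT poverty measure mentioned in the application, F_alpha = (1/N) * sum over units with y < z of ((z - y)/z)^alpha. The function names, the values in `y`, and the poverty line `z` are hypothetical and serve only as illustration.

```python
def area_mean(y):
    """Linear parameter: the mean of the population values in one area."""
    return sum(y) / len(y)


def fgt(y, z, alpha):
    """Nonlinear parameter: FGT poverty measure of order alpha.

    Units with y < z are counted as poor. alpha = 0 gives the poverty
    headcount ratio and alpha = 1 the poverty gap.
    """
    return sum(((z - v) / z) ** alpha for v in y if v < z) / len(y)


# Hypothetical per capita expenditures for the units of one small area.
y = [50.0, 100.0, 150.0, 200.0]
z = 120.0  # hypothetical poverty line

mean_value = area_mean(y)
headcount = fgt(y, z, alpha=0)
poverty_gap = fgt(y, z, alpha=1)
print(mean_value, headcount, poverty_gap)
```

The mean is a weighted sum of the unit values, so a predictor of each non-sampled value translates directly into a predictor of the mean. The FGT measure depends on the unit values through an indicator and a power, which is why its prediction generally cannot be reduced to predicting each unit's expected value.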