Please use this identifier to cite or link to this item: http://repository.ipb.ac.id/handle/123456789/68405
Title: Analysis and Solving Outliers to Longitudinal Data
Authors: Indahwati
Kurnia, Anang
Eminita, Viarti
Issue Date: 2014
Abstract: A longitudinal study follows individuals over a period of time and collects data from each subject at multiple time points. Such a study makes it possible to learn how the response changes over time, together with the factors that affect it, at both the population level and the individual level. An appropriate method for analyzing longitudinal data based on a linear model is the general linear mixed model (GLMM). Estimation methods in linear mixed models are based on the assumption that the random effects and the intra-subject errors are normally distributed. In practice, however, the assumed parametric distributions may not hold. Moreover, outliers may be present. Outlying observations disrupt the normality of the data distribution, so that the covariance matrix becomes inefficient and the estimators become biased and inconsistent (Yohai 2006). There are two types of outliers in longitudinal data. The first are outliers at the individual level, sometimes called e-outliers, which arise among the repeated measurements within a given individual. The second are outliers at the population level, sometimes called b-outliers, which are unusual individuals in the sample. Koller (2013) modified the classic Huber ψ-function, which is numerically unstable, into a smoother version and used the Design Adaptive Scale (DAS) method to estimate the scale parameter and the covariance matrix in linear mixed models. This method uses a weight function that depends not only on the ρ-function used but also on a constant κ that ensures consistency of the estimate. This study aims to assess, through simulation, the performance of the Huber robust estimator when the intra-subject error and the random intercept effect follow symmetric and asymmetric distributions, and when the data are affected by outliers under several data conditions.
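The role of the ψ-function and its weight function can be illustrated with a minimal sketch. It shows only the classic Huber ψ and the weight w(x) = ψ(x)/x, applied to a simple location problem by iteratively reweighted averaging. It is not the smoothed ψ-function or the DAS scale estimate of the cited method, and it omits the constant κ. The tuning constant k = 1.345, the fixed MAD scale, and the function names are assumptions made for illustration.

```python
import numpy as np

def huber_psi(x, k=1.345):
    """Classic Huber psi: identity on [-k, k], clipped to +/-k outside."""
    return np.clip(x, -k, k)

def huber_weight(x, k=1.345):
    """Robustness weight w(x) = psi(x) / x, equal to 1 when |x| <= k."""
    ax = np.abs(np.asarray(x, dtype=float))
    return k / np.maximum(ax, k)

def huber_location(y, k=1.345, tol=1e-8, max_iter=100):
    """Huber M-estimate of location by iteratively reweighted averaging.

    The scale is the normalized MAD, held fixed (an assumption of this
    sketch; it must be non-zero for the standardization to work).
    """
    y = np.asarray(y, dtype=float)
    mu = np.median(y)
    scale = 1.4826 * np.median(np.abs(y - mu))
    for _ in range(max_iter):
        w = huber_weight((y - mu) / scale, k)
        mu_new = np.sum(w * y) / np.sum(w)
        if abs(mu_new - mu) < tol:
            return mu_new
        mu = mu_new
    return mu

# Hypothetical data: 95 clean values around 10 plus 5% gross outliers at 50.
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(10.0, 1.0, size=95), np.full(5, 50.0)])
print("sample mean:   ", np.mean(y))
print("Huber location:", huber_location(y))
```

Observations with large standardized residuals receive weights below 1, so the outliers pull the Huber estimate far less than they pull the sample mean.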
Furthermore, the best estimation method is applied to handle outliers in data from a clinical trial comparing the efficacy and safety of two antiretroviral drugs in HIV-infected patients, and to predict a patient's condition at any given time. Simulation was conducted to assess the effect of outlier contamination on longitudinal data under three contamination conditions: contamination of the intra-subject error, contamination of the random intercept effect, and contamination of both. The simulations also examined different proportions of outliers: 0% (no outliers), 5%, 10%, and 15%. In addition, simulation was conducted to assess the effect of violating the normality assumption for the random intercept effect and the intra-subject error. Both effects were generated from a t distribution, representing a symmetric distribution, and from a chi-square distribution, representing an asymmetric distribution. The simulated data were generated from a linear mixed model with a random intercept. The evaluation covers the different data conditions, each repeated 500 times. The applied data in this research are secondary data from a clinical trial (http://www.biostat.umn.edu/~brad/software.html) involving n = 467 HIV-infected patients who were diagnosed with AIDS or had CD4+ counts ≤ 300 cells per mm³ of blood. CD4+ counts were recorded at study entry (t = 0) and again at the 2-, 6-, 12-, and 18-month visits (so that m_i ≤ 5). For both methods, the simulation study showed that the Relative Bias (RB), Root Relative Mean Square Error (RRMSE), and average Mean Absolute Percentage Error (MAPE) of the estimators for the linear mixed model with a random intercept increase as the proportion of outlier contamination increases, especially at the 15% proportion.
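A simulation of this kind can be sketched as follows. The sketch generates data from a random intercept model, contaminates a chosen proportion of the intra-subject errors, and summarizes the fixed-effect estimates over 500 replications with RB and RRMSE. Everything specific here is an assumption made for illustration and is not taken from the study: the parameter values, the shift-type contamination mechanism, the number of subjects, the formulas used for RB and RRMSE, and the use of pooled ordinary least squares as a simple stand-in for the classical mixed-model estimator.

```python
import numpy as np

def simulate_random_intercept(n_subj, times, beta, sd_b, sd_e, prop_out, rng,
                              out_shift=10.0):
    """Simulate y_ij = beta0 + beta1 * t_j + b_i + e_ij with e-outliers."""
    m = len(times)
    b = rng.normal(0.0, sd_b, size=n_subj)        # random intercepts
    e = rng.normal(0.0, sd_e, size=(n_subj, m))   # intra-subject errors
    mask = rng.random((n_subj, m)) < prop_out     # cells to contaminate
    e = np.where(mask, e + out_shift * sd_e, e)   # assumed shift contamination
    t = np.tile(times, (n_subj, 1))
    y = beta[0] + beta[1] * t + b[:, None] + e
    return t.ravel(), y.ravel()

def relative_bias(est, true):
    """Assumed definition: mean of (estimate - true) / true."""
    return np.mean((est - true) / true)

def rrmse(est, true):
    """Assumed definition: root mean square of (estimate - true) / true."""
    return np.sqrt(np.mean(((est - true) / true) ** 2))

def run(prop_out, n_rep=500, seed=1):
    rng = np.random.default_rng(seed)
    times = np.array([0.0, 2.0, 6.0, 12.0, 18.0])  # hypothetical visit months
    beta = np.array([10.0, -0.5])                  # hypothetical fixed effects
    est = np.empty((n_rep, 2))
    for r in range(n_rep):
        t, y = simulate_random_intercept(50, times, beta, 2.0, 1.0, prop_out, rng)
        X = np.column_stack([np.ones_like(t), t])
        est[r] = np.linalg.lstsq(X, y, rcond=None)[0]  # pooled OLS stand-in
    return {
        "RB": [float(relative_bias(est[:, j], beta[j])) for j in range(2)],
        "RRMSE": [float(rrmse(est[:, j], beta[j])) for j in range(2)],
    }

for p in (0.0, 0.05, 0.10, 0.15):
    print(p, run(p))
```

Replacing the least-squares step with a classical or robust mixed-model fit, and adding contamination of the random intercepts, would bring the sketch closer to the design described here.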
The robust estimator performed better than the classical method when the data were contaminated by outliers, but the two methods performed equally when the intra-subject error and the random intercept effect followed symmetric or asymmetric distributions. A random intercept that is contaminated by outliers, or that follows a symmetric or asymmetric distribution, affects only the properties of the fixed intercept estimator. In general, the robust estimation method increased the efficiency of prediction in this study. The data on HIV-infected patients contain many missing values because some patients were not measured at all five visits. The effect of time on CD4+ count also differed from patient to patient. Some patients had CD4+ counts that decreased at each subsequent visit, while others had increasing counts, so the linear mixed model used is the model with a random intercept and a random slope. The data are contaminated by many outliers, both in the intra-subject error and in the subject-specific effects. These problems must be addressed to produce precise and accurate predictions. Analysis of the coefficient estimates showed that the TIME and PrevOI variables significantly affect CD4+ count. There is a negative correlation between the random intercept and the random slope. This indicates that the decrease in CD4+ count among patients is affected by the CD4+ count at the beginning of the study.
A longitudinal study is characterized by an experiment in which measurements are taken repeatedly over time on each individual. Such a study makes it possible to examine changes in the response over time and the factors that influence those changes, at both the population level and the individual level. A method that can be used to analyze longitudinal data based on a linear model is the linear mixed model (GLMM). Estimation in the linear mixed model is based on the assumption that the subject-specific effects and the intra-subject errors are normally distributed. In many cases, however, this assumption is not met. One reason is the presence of outliers in the observed data. Outlying observations disturb the normality of the data distribution, so that the covariance matrix loses its efficiency and the estimators become biased and inconsistent (Yohai 2006). Longitudinal data contain two types of outliers: outliers in the intra-subject error and outliers in the subject-specific effects.
URI: http://repository.ipb.ac.id/handle/123456789/68405
Appears in Collections: MT - Mathematics and Natural Science

Files in This Item:
File                             Description     Size       Format     Access
2014vem.pdf                      Fulltext        2.7 MB     Adobe PDF  Restricted Access
BAB I Pendahuluan.pdf            BAB I           476.41 kB  Adobe PDF  Restricted Access
BAB II Tinjauan Pustaka.pdf      BAB II          1.12 MB    Adobe PDF  Restricted Access
BAB III Metode.pdf               BAB III         854.86 kB  Adobe PDF  Restricted Access
BAB IV Hasil dan Pembahasan.pdf  BAB IV          1.64 MB    Adobe PDF  Restricted Access
BAB V Simpulan dan Saran.pdf     BAB V           279.51 kB  Adobe PDF  Restricted Access
Cover.pdf                        Cover           282.73 kB  Adobe PDF  Restricted Access
Daftar Pustaka.pdf               Daftar Pustaka  504.1 kB   Adobe PDF  Restricted Access
Ringkasan.pdf                    Ringkasan       443.09 kB  Adobe PDF  Restricted Access

