Authors

Ruixue Wang

Type

Text

Type

Dissertation

Advisor

Finch, Stephen J. | Mendell, Nancy | Wu, Song | Gordon, Derek

Date

2012-08-01

Keywords

Statistics

Department

Department of Applied Mathematics and Statistics

Language

en_US

Source

This work is sponsored by the Stony Brook University Graduate School in compliance with the requirements for completion of degree.

Identifier

http://hdl.handle.net/11401/71432

Publisher

The Graduate School, Stony Brook University: Stony Brook, NY.

Format

application/pdf

Abstract

Growth mixture modeling (GMM) is used to detect the existence of two or more trajectory patterns among participants in a longitudinal study. One crucial issue is determining the number of longitudinal trajectory patterns. I study the properties of three statistics used to identify the number of components in a sample of data: the Bayesian information criterion (BIC), the Lo-Mendell-Rubin test (LMRT), and the bootstrap likelihood ratio test (BLRT). Using the Mplus and SAS PROC TRAJ statistical packages, I estimate the probability that each of these statistics identifies a single component for homogeneous data. I use four distributions for the longitudinal outcome measures: the censored normal, gamma, zero-inflated Poisson (ZIP), and Bernoulli distributions. I consider five factors: trajectory pattern, intra-class correlation, number of time measurements, random effects, and sample size. For the censored normal distribution, the BIC and the LMRT (at the 0.01 significance level) have the highest fraction of replicates identified as homogeneous: 0.98 or better for the BIC and 0.92 or better for the LMRT. The identification rates of these two statistics are not significantly affected by the intra-class correlation in the trajectory, the trajectory pattern, the number of time measurements, or the sample size. A similar pattern is observed for the gamma distribution using the Mplus statistical package. The identification rate of the LMRT is better than that of the BLRT at both the 0.01 and 0.05 significance levels. For the ZIP and Bernoulli distributions, PROC TRAJ computations have a higher correct identification rate than those from Mplus. For ZIP-distributed data following a linear trend with random effects, a larger sample size is associated with an increased probability that two or more components will be identified. The same pattern holds for Bernoulli data.
Overall, the BIC statistic has the highest correct identification rate. These rates are on the order of 95% for homogeneous data following either a censored normal or gamma distribution.

75 pages
