Authors

Jinmiao Fu

Type

Text

Type

Dissertation

Advisor

Wu, Song | Zhu, Wei | Wang, Xuefeng | Kotov, Roman.

Date

2015-12-01

Keywords

Statistics

Department

Department of Applied Mathematics and Statistics.

Language

en_US

Source

This work is sponsored by the Stony Brook University Graduate School in compliance with the requirements for completion of degree.

Identifier

http://hdl.handle.net/11401/77610

Publisher

The Graduate School, Stony Brook University: Stony Brook, NY.

Format

application/pdf

Abstract

With the rapid advancement of biotechnology, multiple platforms for measuring microbiome abundance are increasingly available. These include the traditional gene microarray and quantitative PCR platforms, as well as modern next-generation sequencing. Consequently, evaluating the consistency of these platforms has become an increasingly crucial topic. Classic methods include the Pearson correlation and the more suitable errors-in-variables (EIV) models, which gauge the linear dependency between two platforms. Our group is among the leaders in applying structural equation modeling (SEM) to estimate the relationships among three or more platforms and to combine these measurements for an optimal joint analysis. However, our previous work, like that of others, examines the agreement only for each individual bacterium. In this thesis, we develop a novel random coefficient SEM to determine the agreement of different platforms across the entire microbiome, taking into account the heterogeneity of individual bacteria. We further apply this platform comparison method to a 16S ribosomal RNA sequencing study of bacterial abundance with three measurement modalities, referred to as the V1V2, V1V3 and V3V4 windows; these are three different regions targeted by the primers when generating the amplicons. The newly developed SEM with random loadings tests the average overall and pairwise consistency among these three platforms. Good agreement is found between V1V2 and V3V4 and between V1V3 and V3V4, while more discrepancy is detected between V1V2 and V1V3. Moreover, the prediction of the random loadings, a by-product of the model, elucidates the performance of the platforms on each individual bacterium. The paradigm can be easily adjusted to situations where only two platforms are available, which is another contribution of this work: an EIV model with random coefficients (loadings) is proposed for that task. To further confirm the conclusions above, pairwise comparisons are performed, and they yield coherent results.

95 pages
