Type
Text | Dissertation
Advisor
Green, David | Powers, Scott | Xing, Haipeng | Zhu, Wei | Shen, Ronglai
Date
2012-12-01
Keywords
Statistics | Bayes Theory, Markov Chain, Segmentation
Department
Department of Applied Mathematics and Statistics
Language
en_US
Source
This work is sponsored by the Stony Brook University Graduate School in compliance with the requirements for completion of degree.
Identifier
http://hdl.handle.net/11401/71042
Publisher
The Graduate School, Stony Brook University: Stony Brook, NY.
Format
application/pdf
Abstract
DNA copy number changes and epigenetic alterations often induce abnormal RNA expression levels and have been linked to the development and progression of cancer. While various methods have been proposed for studying microarray DNA copy number data and RNA expression data separately, little statistical work has been done on modeling the relationship between the two. For the joint analysis of the two types of data, we propose a new stochastic change-point model with latent variables and an associated estimation procedure. Our method integrates a hidden Markov model with Bayesian statistics to yield the joint posterior distribution of DNA and RNA signal intensities across the whole genome. Explicit formulas for the posterior means are derived, which can be used to estimate the signal intensities directly, without performing segmentation. A subsequent segmentation procedure is also provided to identify change-points and yield piecewise constant estimates of the signal intensities on each segment. Other quantities can be derived from the posterior distribution to assess the confidence of coincident and non-coincident change-points in the DNA and RNA sequences. Based on these estimates, chromosomal regions with genetic and potential epigenetic aberrations can be identified. To keep computation time linear in sequence length, we propose an approximation method, so the method can be readily applied to the new generation of higher-throughput arrays. The proposed method is illustrated through simulation studies and an application to a real data set. | 68 pages
Recommended Citation
Su, Yi, "A Stochastic Segmentation Model for Joint DNA-RNA Microarray Data Analysis" (2012). Stony Brook Theses and Dissertations Collection, 2006-2020 (closed to submissions). 249.
https://commons.library.stonybrook.edu/stony-brook-theses-and-dissertations-collection/249