Authors

Yi Su

Type

Text

Type

Dissertation

Advisor

Green, David | Powers, Scott | Xing, Haipeng | Zhu, Wei | Shen, Ronglai

Date

2012-12-01

Keywords

Statistics | Bayes Theory, Markov Chain, Segmentation

Department

Department of Applied Mathematics and Statistics

Language

en_US

Source

This work is sponsored by the Stony Brook University Graduate School in compliance with the requirements for completion of degree.

Identifier

http://hdl.handle.net/11401/71042

Publisher

The Graduate School, Stony Brook University: Stony Brook, NY.

Format

application/pdf

Abstract

DNA copy number changes and epigenetic alterations often induce abnormal RNA expression levels and have been linked to the development and progression of cancer. While various methods have been proposed for studying microarray DNA copy number data and RNA expression data separately, little statistical work has been done on modeling the relationship between the two. For the joint analysis of the two types of data, we propose a new stochastic change-point model with latent variables and an associated estimation procedure. Our method integrates a hidden Markov model with Bayesian statistics to yield the joint posterior distribution of DNA and RNA signal intensities throughout the whole genome. Explicit formulas for the posterior means are derived, which can be used to give direct estimates of the signal intensities without performing segmentation. A subsequent segmentation procedure is also provided to identify change-points and yield piecewise constant estimates of the signal intensities on each segment. Other quantities can also be derived from the posterior distribution for assessing the confidence of coincident and non-coincident change-points in the DNA and RNA sequences. Based on these estimates, chromosomal regions with genetic and potential epigenetic aberrations can be identified. For computational simplicity, we propose an approximation method that keeps computation time linear in sequence length, so the method can be readily applied to the new generation of higher-throughput arrays. The proposed method is illustrated through simulation studies and an application to a real data set.

68 pages
