Stony Brook Theses and Dissertations Collection, 2006-2020 (closed to submissions)

Application of Double Sampling to Combine Measured and Imputed Genotype Data in Genetic Association Studies

Qilong Yuan

Type

Text

Type

Dissertation

Advisor

Stephen J. Finch | Nancy R. Mendell | Wei Zhu | Derek Gordon

Date

2010-12-01

Keywords

Genetics -- Statistics | Double Sampling Method | Genome-wide Association Studies | Genotype Imputation | Likelihood Ratio Test Allowing for Errors

Department

Department of Applied Mathematics and Statistics

Language

en_US

Source

This work is sponsored by the Stony Brook University Graduate School in compliance with the requirements for completion of degree.

Identifier

http://hdl.handle.net/11401/72722

Publisher

The Graduate School, Stony Brook University: Stony Brook, NY.

Format

application/pdf

Abstract

Genotype imputation is an essential technique for genome-wide association studies (GWAS) involving hundreds of thousands of SNPs. Because genotype misclassification can significantly decrease the statistical power to detect association, understanding how imputation inconsistencies affect the power to detect association at imputed markers, or at disease genes close to them, is important for the optimal design of imputation-based GWAS. Double sampling of genotypes is a statistical procedure in which a portion of subjects receive a second, more precise genotyping. This dissertation applies the likelihood ratio test allowing for errors (LRT-AE), which incorporates double-sample genotype information from a sub-sample of cases and controls, to correct for imputation inconsistencies. The parameters that enter the log likelihoods are estimated with the Expectation-Maximization (EM) algorithm. To compare the performance of the LRT-AE with that of the likelihood ratio test (LRT), which makes no adjustment for imputation inconsistencies, I perform simulation studies using a factorial design with high and low settings of: disease minor allele frequency (MAF), heterozygote relative risk, mode of inheritance (MOI), disease prevalence, and proportion of double-sampled subjects. The LRT-AE method maintains correct type I error rates in all null simulations at both significance levels considered (5% and 1%). The improvement in power, however, is not significant unless more than 50% of subjects are in the double-sampled group. The LRT-AE method also yields unbiased estimates of imputation inconsistency rates.
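
The general idea of a likelihood ratio test that uses double-sample data can be illustrated with a minimal Python sketch. It is not the dissertation's implementation, and it rests on several assumptions that the abstract does not state: three genotype classes (0, 1, 2), a single misclassification matrix theta[i][j] = P(imputed j | true i) shared by cases and controls, precise genotypes treated as the truth, and a chi-square reference distribution with 2 degrees of freedom. The function names (loglik, em_fit, lrt_ae) and the counts in the usage example are hypothetical and chosen only for illustration.

```python
import math

K = 3  # genotype classes 0, 1, 2 (assumption of this sketch)

def loglik(double, single, p, theta):
    """Observed-data log likelihood.
    double[g][i][j]: subjects in group g with precise genotype i and imputed j.
    single[g][j]:    subjects in group g with imputed genotype j only."""
    ll = 0.0
    for g in range(len(double)):
        for j in range(K):
            for i in range(K):
                if double[g][i][j] > 0:
                    ll += double[g][i][j] * math.log(p[g][i] * theta[i][j])
            if single[g][j] > 0:
                mix = sum(p[g][i] * theta[i][j] for i in range(K))
                ll += single[g][j] * math.log(mix)
    return ll

def em_fit(double, single, shared_freqs, n_iter=2000, tol=1e-10):
    """EM estimates of genotype frequencies p[g] and error matrix theta.
    shared_freqs=True forces equal frequencies in all groups (null model)."""
    G = len(double)
    p = [[1.0 / K] * K for _ in range(G)]
    theta = [[0.8 if i == j else 0.1 for j in range(K)] for i in range(K)]
    prev = -math.inf
    for _ in range(n_iter):
        # E-step: add expected true-genotype counts for imputed-only subjects.
        full = [[[float(double[g][i][j]) for j in range(K)]
                 for i in range(K)] for g in range(G)]
        for g in range(G):
            for j in range(K):
                denom = sum(p[g][k] * theta[k][j] for k in range(K))
                if denom > 0:
                    for i in range(K):
                        full[g][i][j] += (single[g][j] * p[g][i]
                                          * theta[i][j] / denom)
        # M-step: closed-form updates from the completed counts.
        row = [[sum(full[g][i]) for i in range(K)] for g in range(G)]
        if shared_freqs:
            pooled = [sum(row[g][i] for g in range(G)) for i in range(K)]
            p = [[x / sum(pooled) for x in pooled] for _ in range(G)]
        else:
            p = [[x / sum(row[g]) for x in row[g]] for g in range(G)]
        for i in range(K):
            tot = sum(row[g][i] for g in range(G))
            if tot > 0:
                theta[i] = [sum(full[g][i][j] for g in range(G)) / tot
                            for j in range(K)]
        ll = loglik(double, single, p, theta)
        if abs(ll - prev) < tol:
            break
        prev = ll
    return p, theta, ll

def lrt_ae(double, single):
    """Return (statistic, p_value, theta) for the two-group comparison."""
    _, _, ll0 = em_fit(double, single, shared_freqs=True)
    _, theta, ll1 = em_fit(double, single, shared_freqs=False)
    stat = max(0.0, 2.0 * (ll1 - ll0))
    # Chi-square survival function with 2 df has the closed form exp(-x/2).
    return stat, math.exp(-stat / 2.0), theta

if __name__ == "__main__":
    # Invented counts for illustration only: index 0 = cases, 1 = controls.
    double = [[[40, 3, 0], [4, 30, 2], [0, 2, 9]],
              [[55, 3, 0], [3, 25, 1], [0, 1, 5]]]
    single = [[150, 110, 40], [190, 95, 20]]
    stat, pval, theta = lrt_ae(double, single)
    print("statistic:", stat, "p-value:", pval)
    print("estimated error matrix:", theta)
```

In this sketch the estimated matrix theta plays the role of the imputation inconsistency rates, and the double-sampled fraction is controlled by how many subjects are placed in double rather than single. A plain LRT that ignores inconsistencies would correspond to treating every imputed genotype as the true one.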

Recommended Citation

Yuan, Qilong, "Application of Double Sampling to Combine Measured and Imputed Genotype Data in Genetic Association Studies" (2010). Stony Brook Theses and Dissertations Collection, 2006-2020 (closed to submissions). 1925.
https://commons.library.stonybrook.edu/stony-brook-theses-and-dissertations-collection/1925
