Generalized genomic distance-based regression methodology for multilocus association analysis

Cited by: 131
Authors
Wessel, Jennifer
Schork, Nicholas J.
Affiliations
[1] Univ Calif San Diego, Dept Psychiat, Polymorphism Res Lab, La Jolla, CA 92093 USA
[2] Univ Calif San Diego, Div Epidemiol, La Jolla, CA 92093 USA
[3] Univ Calif San Diego, Div Biostat, Dept Family & Prevent Med, Moores Canc Ctr, La Jolla, CA 92093 USA
[4] Univ Calif San Diego, Ctr Human Genet & Genom, Calif Inst Telecommun & Informat Technol, La Jolla, CA 92093 USA
[5] Univ Calif San Diego, San Diego Supercomp Ctr, La Jolla, CA 92093 USA
[6] San Diego State Univ, Grad Program Publ Hlth, San Diego, CA 92182 USA
DOI
10.1086/508346
Chinese Library Classification number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Large-scale, multilocus genetic association studies require powerful and appropriate statistical-analysis tools that are designed to relate genotype and haplotype information to phenotypes of interest. Many analysis approaches consider relating allelic, haplotypic, or genotypic information to a trait through use of extensions of traditional analysis techniques, such as contingency-table analysis, regression methods, and analysis-of-variance techniques. In this work, we consider a complementary approach that involves the characterization and measurement of the similarity and dissimilarity of the allelic composition of a set of individuals' diploid genomes at multiple loci in the regions of interest. We describe a regression method that can be used to relate variation in the measure of genomic dissimilarity (or "distance") among a set of individuals to variation in their trait values. Weighting factors associated with functional or evolutionary conservation information of the loci can be used in the assessment of similarity. The proposed method is very flexible and is easily extended to complex multilocus-analysis settings involving covariates. In addition, the proposed method actually encompasses both single-locus and haplotype-phylogeny analysis methods, which are two of the most widely used approaches in genetic association analysis. We showcase the method with data described in the literature. Ultimately, our method is appropriate for high-dimensional genomic data and anticipates an era when cost-effective exhaustive DNA sequence data can be obtained for a large number of individuals, over and above genotype information focused on a few well-chosen loci.
Pages: 792-806
Page count: 15