A Markov chain Monte Carlo technique for identification of combinations of allelic variants underlying complex diseases in humans

Cited by: 79
Authors
Favorov, AV [1]
Andreewski, TV
Sudomoina, MA
Favorova, OO
Parmigiani, G
Ochs, MF
Affiliations
[1] GosNIIGenet, Bioinformat Lab, Moscow 117545, Russia
[2] Russian State Med Univ, Dept Mol Biol & Med Biotechnol, Moscow 121552, Russia
[3] Cardiol Res Ctr, Moscow 121552, Russia
[4] Johns Hopkins Univ, Dept Oncol, Baltimore, MD 21205 USA
[5] Johns Hopkins Univ, Dept Biostat, Baltimore, MD 21205 USA
[6] Johns Hopkins Univ, Dept Pathol, Baltimore, MD 21205 USA
[7] Fox Chase Canc Ctr, Philadelphia, PA 19111 USA
Funding
US National Science Foundation;
Keywords
DOI
10.1534/genetics.105.048090
Chinese Library Classification (CLC) number
Q3 [Genetics];
Discipline classification code
071007 [Genetics]; 090102 [Crop Genetics and Breeding];
Abstract
In recent years, the number of studies focusing on the genetic basis of common disorders with a complex mode of inheritance, in which multiple genes of small effect are involved, has been steadily increasing. An improved methodology to identify the cumulative contribution of several polymorphous genes would accelerate our understanding of their importance in disease susceptibility and our ability to develop new treatments. A critical bottleneck is the inability of standard statistical approaches, developed for relatively modest predictor sets, to achieve power in the face of the enormous growth in our knowledge of genomics. The inability is due to the combinatorial complexity arising in searches for multiple interacting genes. Similar "curse of dimensionality" problems have arisen in other fields, and Bayesian statistical approaches coupled to Markov chain Monte Carlo (MCMC) techniques have led to significant improvements in understanding. We present here an algorithm, APSampler, for the exploration of potential combinations of allelic variations positively or negatively associated with a disease or with a phenotype. The algorithm relies on the rank comparison of phenotype for individuals with and without specific patterns (i.e., combinations of allelic variants) isolated in genetic backgrounds matched for the remaining significant patterns. It constructs a Markov chain to sample only potentially significant variants, minimizing the potential of large data sets to overwhelm the search. We tested APSampler on a simulated data set and on a case-control MS (multiple sclerosis) study for ethnic Russians. For the simulated data, the algorithm identified all the phenotype-associated allele combinations coded into the data and, for the MS data, it replicated the previously known findings.
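The rank comparison mentioned in the abstract can be illustrated with a small sketch. This is not the APSampler implementation: it only computes a Mann-Whitney U statistic that compares phenotype ranks between carriers and non-carriers of one allelic pattern. It does not perform the background matching or the Markov chain sampling that the abstract describes. The function names (`carries`, `midranks`, `rank_sum_u`) and the genotype layout (a mapping from locus name to a pair of alleles) are assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical data layout: one dict per individual, locus name -> alleles carried.
Genotype = Dict[str, Sequence[str]]
# Hypothetical pattern layout: locus name -> allele that must be present.
Pattern = Dict[str, str]


def carries(genotype: Genotype, pattern: Pattern) -> bool:
    """Return True if the individual carries every allele named in the pattern."""
    return all(allele in genotype.get(locus, ()) for locus, allele in pattern.items())


def midranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum_u(
    genotypes: Sequence[Genotype],
    phenotypes: Sequence[float],
    pattern: Pattern,
) -> Tuple[float, int, int]:
    """Mann-Whitney U for carriers of the pattern, plus the two group sizes.

    U ranges from 0 to n_carriers * n_noncarriers; values near either end
    suggest the phenotype differs between the two groups.
    """
    ranks = midranks(phenotypes)
    carrier_ranks = [r for g, r in zip(genotypes, ranks) if carries(g, pattern)]
    n_carriers = len(carrier_ranks)
    n_noncarriers = len(ranks) - n_carriers
    u = sum(carrier_ranks) - n_carriers * (n_carriers + 1) / 2
    return u, n_carriers, n_noncarriers
```

For a case-control study the phenotype can be coded as 1 for cases and 0 for controls, so ties are common and the midrank handling matters. A full analysis would also need a significance assessment and a multiple-testing correction, which this sketch leaves out.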
Pages: 2113-2121
Page count: 9