A fast algorithm for genome-wide haplotype pattern mining

Cited by: 2
Authors
Besenbacher, Soren [1, 2]
Pedersen, Christian N. S. [1, 2]
Mailund, Thomas [1 ]
Affiliations
[1] Univ Aarhus, Bioinformat Res Ctr, DK-8000 Aarhus C, Denmark
[2] Univ Aarhus, Dept Comp Sci, DK-8000 Aarhus C, Denmark
Source
BMC BIOINFORMATICS | 2009 / Volume 10
Keywords
PROSTATE-CANCER; ASSOCIATION; LOCI; VARIANT; POWER;
DOI
10.1186/1471-2105-10-S1-S74
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Identifying the genetic components of common diseases has long been an important area of research. Recently, genotyping technology has reached the level where it is cost-effective to genotype single nucleotide polymorphism (SNP) markers covering the entire genome in thousands of individuals, and to analyse such data for markers associated with a disease. The statistical power to detect association, however, is limited when markers are analysed one at a time. This can be alleviated by considering multiple markers simultaneously. The Haplotype Pattern Mining (HPM) method is a machine learning approach that does exactly this.
Results: We present a new, faster algorithm for the HPM method. The new approach uses patterns of haplotype diversity in the genome: locally in the genome, the number of observed haplotypes is much smaller than the total number of possible haplotypes. We show that the new approach speeds up the HPM method by a factor of 2 on a genome-wide dataset with 5009 individuals typed at 491208 markers using default parameters, and by more if the pattern length is increased.
Conclusion: The new algorithm speeds up the HPM method, and we show that it is feasible to apply HPM to whole-genome association mapping with thousands of individuals and hundreds of thousands of markers.
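The observation behind the speed-up (locally, far fewer haplotypes are observed than are possible) can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not describe in detail. The toy data, the function name `distinct_haplotypes`, and the encoding of alleles as the characters 0 and 1 are all assumptions made for illustration.

```python
from collections import Counter


def distinct_haplotypes(haplotypes, start, length):
    """Count how often each distinct haplotype occurs in the marker
    window [start, start + length)."""
    return Counter(h[start:start + length] for h in haplotypes)


# Hypothetical toy data: 6 phased haplotypes over 5 biallelic markers.
haplotypes = [
    "00110",
    "00110",
    "01011",
    "01011",
    "00110",
    "11011",
]

window = distinct_haplotypes(haplotypes, start=0, length=4)
observed = len(window)   # distinct haplotypes actually seen in the window
possible = 2 ** 4        # all combinations of 4 biallelic markers

print(f"observed {observed} of {possible} possible haplotypes")
for hap, count in window.most_common():
    print(hap, count)
```

In this toy window only 3 of the 16 possible haplotypes occur. A pattern-based method could plausibly exploit such redundancy by evaluating each distinct local haplotype once and weighting it by its count, rather than scanning every individual separately. Whether the published algorithm works this way is not stated in the abstract.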
Pages: 8
Related papers
22 in total
[1]   A common variant associated with prostate cancer in European and African populations [J].
Amundadottir, Laufey T. ;
Sulem, Patrick ;
Gudmundsson, Julius ;
Helgason, Agnar ;
Baker, Adam ;
Agnarsson, Bjarni A. ;
Sigurdsson, Asgeir ;
Benediktsdottir, Kristrun R. ;
Cazier, Jean-Baptiste ;
Sainz, Jesus ;
Jakobsdottir, Margret ;
Kostic, Jelena ;
Magnusdottir, Droplaug N. ;
Ghosh, Shyamali ;
Agnarsson, Kari ;
Birgisdottir, Birgitta ;
Le Roux, Louise ;
Olafsdottir, Adalheidur ;
Blondal, Thorarinn ;
Andresdottir, Margret ;
Gretarsdottir, Olafia Svandis ;
Bergthorsson, Jon T. ;
Gudbjartsson, Daniel ;
Gylfason, Arnaldur ;
Thorleifsson, Gudmar ;
Manolescu, Andrei ;
Kristjansson, Kristleifur ;
Geirsson, Gudmundur ;
Isaksson, Helgi ;
Douglas, Julie ;
Johansson, Jan-Erik ;
Balter, Katarina ;
Wiklund, Fredrik ;
Montie, James E. ;
Yu, Xiaoying ;
Suarez, Brian K. ;
Ober, Carole ;
Cooney, Kathleen A. ;
Gronberg, Henrik ;
Catalona, William J. ;
Einarsson, Gudmundur V. ;
Barkardottir, Rosa B. ;
Gulcher, Jeffrey R. ;
Kong, Augustine ;
Thorsteinsdottir, Unnur ;
Stefansson, Kari .
NATURE GENETICS, 2006, 38 (06) :652-658
[2]   A common genetic variant in the NOS1 regulator NOS1AP modulates cardiac repolarization [J].
Arking, Dan E. ;
Pfeufer, Arne ;
Post, Wendy ;
Kao, W. H. Linda ;
Newton-Cheh, Christopher ;
Ikeda, Morna ;
West, Kristen ;
Kashuk, Carl ;
Akyol, Mahmut ;
Perz, Siegfried ;
Jalilzadeh, Shapour ;
Illig, Thomas ;
Gieger, Christian ;
Guo, Chao-Yu ;
Larson, Martin G. ;
Wichmann, H. Erich ;
Marban, Eduardo ;
O'Donnell, Christopher J. ;
Hirschhorn, Joel N. ;
Kaeaeb, Stefan ;
Spooner, Peter M. ;
Meitinger, Thomas ;
Chakravarti, Aravinda .
NATURE GENETICS, 2006, 38 (06) :644-651
[3]   Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering [J].
Browning, Sharon R. ;
Browning, Brian L. .
AMERICAN JOURNAL OF HUMAN GENETICS, 2007, 81 (05) :1084-1097
[4]   Multilocus association mapping using variable-length Markov chains [J].
Browning, Sharon R. .
AMERICAN JOURNAL OF HUMAN GENETICS, 2006, 78 (06) :903-913
[5]   Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls [J].
Burton, Paul R. ;
Clayton, David G. ;
Cardon, Lon R. ;
Craddock, Nick ;
Deloukas, Panos ;
Duncanson, Audrey ;
Kwiatkowski, Dominic P. ;
McCarthy, Mark I. ;
Ouwehand, Willem H. ;
Samani, Nilesh J. ;
Todd, John A. ;
Donnelly, Peter ;
Barrett, Jeffrey C. ;
Davison, Dan ;
Easton, Doug ;
Evans, David ;
Leung, Hin-Tak ;
Marchini, Jonathan L. ;
Morris, Andrew P. ;
Spencer, Chris C. A. ;
Tobin, Martin D. ;
Attwood, Antony P. ;
Boorman, James P. ;
Cant, Barbara ;
Everson, Ursula ;
Hussey, Judith M. ;
Jolley, Jennifer D. ;
Knight, Alexandra S. ;
Koch, Kerstin ;
Meech, Elizabeth ;
Nutland, Sarah ;
Prowse, Christopher V. ;
Stevens, Helen E. ;
Taylor, Niall C. ;
Walters, Graham R. ;
Walker, Neil M. ;
Watkins, Nicholas A. ;
Winzer, Thilo ;
Jones, Richard W. ;
McArdle, Wendy L. ;
Ring, Susan M. ;
Strachan, David P. ;
Pembrey, Marcus ;
Breen, Gerome ;
St Clair, David ;
Caesar, Sian ;
Gordon-Smith, Katherine ;
Jones, Lisa ;
Fraser, Christine ;
Green, Elain K. .
NATURE, 2007, 447 (7145) :661-678
[6]   Efficiency and power in genetic association studies [J].
de Bakker, PIW ;
Yelensky, R ;
Pe'er, I ;
Gabriel, SB ;
Daly, MJ ;
Altshuler, D .
NATURE GENETICS, 2005, 37 (11) :1217-1223
[7]   Two variants on chromosome 17 confer prostate cancer risk, and the one in TCF2 protects against type 2 diabetes [J].
Gudmundsson, Julius ;
Sulem, Patrick ;
Steinthorsdottir, Valgerdur ;
Bergthorsson, Jon T. ;
Thorleifsson, Gudmar ;
Manolescu, Andrei ;
Rafnar, Thorunn ;
Gudbjartsson, Daniel ;
Agnarsson, Bjarni A. ;
Baker, Adam ;
Sigurdsson, Asgeir ;
Benediktsdottir, Kristrun R. ;
Jakobsdottir, Margret ;
Blondal, Thorarinn ;
Stacey, Simon N. ;
Helgason, Agnar ;
Gunnarsdottir, Steinunn ;
Olafsdottir, Adalheidur ;
Kristinsson, Kari T. ;
Birgisdottir, Birgitta ;
Ghosh, Shyamali ;
Thorlacius, Steinunn ;
Magnusdottir, Dana ;
Stefansdottir, Gerdur ;
Kristjansson, Kristleifur ;
Bagger, Yu ;
Wilensky, Robert L. ;
Reilly, Muredach P. ;
Morris, Andrew D. ;
Kimber, Charlotte H. ;
Adeyemo, Adebowale ;
Chen, Yuanxiu ;
Zhou, Jie ;
So, Wing-Yee ;
Tong, Peter C. Y. ;
Ng, Maggie C. Y. ;
Hansen, Torben ;
Andersen, Gitte ;
Borch-Johnsen, Knut ;
Jorgensen, Torben ;
Tres, Alejandro ;
Fuertes, Fernando ;
Ruiz-Echarri, Manuel ;
Asin, Laura ;
Saez, Berta ;
van Boven, Erica ;
Klaver, Siem ;
Swinkels, Dorine W. ;
Aben, Katja K. ;
Graif, Theresa .
NATURE GENETICS, 2007, 39 (08) :977-983
[8]   Genome-wide association study identifies a second prostate cancer susceptibility variant at 8q24 [J].
Gudmundsson, Julius ;
Sulem, Patrick ;
Manolescu, Andrei ;
Amundadottir, Laufey T. ;
Gudbjartsson, Daniel ;
Helgason, Agnar ;
Rafnar, Thorunn ;
Bergthorsson, Jon T. ;
Agnarsson, Bjarni A. ;
Baker, Adam ;
Sigurdsson, Asgeir ;
Benediktsdottir, Kristrun R. ;
Jakobsdottir, Margret ;
Xu, Jianfeng ;
Blondal, Thorarinn ;
Kostic, Jelena ;
Sun, Jielin ;
Ghosh, Shyamali ;
Stacey, Simon N. ;
Mouy, Magali ;
Saemundsdottir, Jona ;
Backman, Valgerdur M. ;
Kristjansson, Kristleifur ;
Tres, Alejandro ;
Partin, Alan W. ;
Albers-Akkers, Marjo T. ;
Marcos, Javier Godino-Ivan ;
Walsh, Patrick C. ;
Swinkels, Dorine W. ;
Navarrete, Sebastian ;
Isaacs, Sarah D. ;
Aben, Katja K. ;
Graif, Theresa ;
Cashy, John ;
Ruiz-Echarri, Manuel ;
Wiley, Kathleen E. ;
Suarez, Brian K. ;
Witjes, J. Alfred ;
Frigge, Mike ;
Ober, Carole ;
Jonsson, Eirikur ;
Einarsson, Gudmundur V. ;
Mayordomo, Jose I. ;
Kiemeney, Lambertus A. ;
Isaacs, William B. ;
Catalona, William J. ;
Barkardottir, Rosa B. ;
Gulcher, Jeffrey R. ;
Thorsteinsdottir, Unnur ;
Kong, Augustine .
NATURE GENETICS, 2007, 39 (05) :631-637
[9]   Gene mapping via the ancestral recombination graph [J].
Larribe, F ;
Lessard, S .
THEORETICAL POPULATION BIOLOGY, 2002, 62 (02) :215-229
[10]   Haplotype-based linkage disequilibrium mapping via direct data mining [J].
Li, J ;
Jiang, T .
BIOINFORMATICS, 2005, 21 (24) :4384-4393