The functional spectrum of low-frequency coding variation

Cited by: 155
Authors
Marth, Gabor T. [1]
Yu, Fuli [2]
Indap, Amit R. [1]
Garimella, Kiran [3]
Gravel, Simon [4]
Leong, Wen Fung [1]
Tyler-Smith, Chris [5]
Bainbridge, Matthew [2]
Blackwell, Tom [6]
Zheng-Bradley, Xiangqun [7]
Chen, Yuan [5]
Challis, Danny [2]
Clarke, Laura [7]
Ball, Edward V. [8]
Cibulskis, Kristian [3]
Cooper, David N. [8]
Fulton, Bob [9]
Hartl, Chris [3]
Koboldt, Dan [9]
Muzny, Donna [4]
Smith, Richard [7]
Sougnez, Carrie [3]
Stewart, Chip [1]
Ward, Alistair [1]
Yu, Jin [2]
Xue, Yali [5]
Altshuler, David [3]
Bustamante, Carlos D. [4]
Clark, Andrew G. [10]
Daly, Mark [3]
DePristo, Mark [3]
Flicek, Paul [7]
Gabriel, Stacey [3]
Mardis, Elaine [9]
Palotie, Aarno [5]
Gibbs, Richard [2]
Affiliations
[1] Boston Coll, Dept Biol, Chestnut Hill, MA 02467 USA
[2] Baylor Coll Med, Human Genome Sequencing Ctr, Houston, TX 77030 USA
[3] 7 Cambridge Ctr, Broad Inst, Populat Genom Program, Cambridge, MA 02142 USA
[4] Stanford Univ, Dept Genet, Stanford, CA 94305 USA
[5] Wellcome Trust Sanger Inst, Cambridge CB10 1SA, England
[6] Univ Michigan, Sch Publ Hlth, Ann Arbor, MI 48109 USA
[7] European Bioinformat Inst, Cambridge CB10 1SD, England
[8] Cardiff Univ, Sch Med, Inst Med Genet, Cardiff CF14 4XN, S Glam, Wales
[9] Washington Univ, Sch Med, Genome Inst, St Louis, MO 63108 USA
[10] Cornell Univ, Dept Mol Biol & Genet, Ithaca, NY 14853 USA
Source
GENOME BIOLOGY | 2011 / Volume 12 / Issue 09
Funding
US National Institutes of Health; Wellcome Trust (UK);
Keywords
POLYMORPHISM; GENERATION; DISCOVERY;
DOI
10.1186/gb-2011-12-9-r84
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Rare coding variants constitute an important class of human genetic variation, but are underrepresented in current databases that are based on small population samples. Recent studies show that variants altering amino acid sequence and protein function are enriched at low variant allele frequency, 2 to 5%, but because of insufficient sample size it is not clear if the same trend holds for rare variants below 1% allele frequency.

Results: The 1000 Genomes Exon Pilot Project has collected deep-coverage exon-capture data in roughly 1,000 human genes, for nearly 700 samples. Although medical whole-exome projects are currently afoot, this is still the deepest reported sampling of a large number of human genes with next-generation technologies. According to the goals of the 1000 Genomes Project, we created effective informatics pipelines to process and analyze the data, and discovered 12,758 exonic SNPs, 70% of them novel, and 74% below 1% allele frequency in the seven population samples we examined. Our analysis confirms that coding variants below 1% allele frequency show increased population-specificity and are enriched for functional variants.

Conclusions: This study represents a large step toward detecting and interpreting low frequency coding variation, clearly lays out technical steps for effective analysis of DNA capture data, and articulates functional and population properties of this important class of genetic variation.
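The allele-frequency thresholds mentioned in the abstract (below 1%, and 2 to 5%) can be made concrete with a small sketch. It assumes diploid samples, a round figure of 700 samples (the abstract says "nearly 700"), and invented allele counts. The function names and bin labels are illustrative and are not taken from the paper's pipelines.

```python
from collections import Counter


def allele_frequency(alt_allele_count: int, n_samples: int, ploidy: int = 2) -> float:
    """Alternate-allele frequency among all sampled chromosomes."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return alt_allele_count / (ploidy * n_samples)


def frequency_class(af: float) -> str:
    """Assign a frequency to a bin; the cut-offs and labels are assumptions."""
    if af < 0.01:
        return "rare (<1%)"
    if af <= 0.05:
        return "low-frequency (1-5%)"
    return "common (>5%)"


# Hypothetical inputs: 700 diploid samples give 1,400 sampled chromosomes.
N_SAMPLES = 700
alt_counts = [1, 3, 13, 40, 71, 350]  # invented alternate-allele counts

classes = Counter(
    frequency_class(allele_frequency(count, N_SAMPLES)) for count in alt_counts
)

for label, n_variants in classes.items():
    print(f"{label}: {n_variants}")
```

Under these assumptions a variant seen on a single chromosome has a frequency of 1/1,400, about 0.07%, which illustrates why a sample of this size can resolve variants well below the 1% threshold.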
Pages: 17