Low-coverage sequencing: Implications for design of complex trait association studies

Cited by: 225
Authors
Li, Yun [1 ,2 ]
Sidore, Carlo [1 ,3 ,4 ]
Kang, Hyun Min [1 ]
Boehnke, Michael [1 ]
Abecasis, Goncalo R. [1 ]
Affiliations
[1] Univ Michigan, Sch Publ Hlth, Dept Biostat, Ctr Stat Genet, Ann Arbor, MI 48109 USA
[2] Univ N Carolina, Dept Biostat, Dept Genet, Chapel Hill, NC 27599 USA
[3] CNR, Ist Neurogenet & Neurofarmacol, I-09042 Cagliari, Italy
[4] Univ Sassari, Dipartimento Sci Biomed, I-07100 Sassari, Italy
Keywords
GENOME-WIDE ASSOCIATION; MISSING HERITABILITY; LOCI; HAPLOTYPE; IMPUTATION; EFFICIENT; GENOTYPES; DISEASES; SNPS; MAP;
DOI
10.1101/gr.117259.110
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification code
071010 ; 081704 ;
Abstract
New sequencing technologies allow genomic variation to be surveyed in much greater detail than previously possible. While detailed analysis of a single individual typically requires deep sequencing, when many individuals are sequenced it is possible to combine shallow sequence data across individuals to generate accurate calls in shared stretches of chromosome. Here, we show that, as progressively larger numbers of individuals are sequenced, increasingly accurate genotype calls can be generated for a given sequence depth. We evaluate the implications of low-coverage sequencing for complex trait association studies. We systematically compare study designs based on genotyping of tagSNPs, sequencing of many individuals at depths ranging between 2x and 30x, and imputation of variants discovered by sequencing a subset of individuals into the remainder of the sample. We show that sequencing many individuals at low depth is an attractive strategy for studies of complex trait genetics. For example, for disease-associated variants with frequency >0.2%, sequencing 3000 individuals at 4x depth provides similar power to deep sequencing of >2000 individuals at 30x depth but requires only ~20% of the sequencing effort. We also show that low-coverage sequencing can be used to build a reference panel that can drive imputation into additional samples to increase power further. We provide guidance for investigators wishing to combine results from sequenced, genotyped, and imputed samples.
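The 20% figure in the abstract can be checked with simple arithmetic. The sketch below assumes that sequencing effort scales as the number of individuals times the depth, and it takes the deep-sequencing design at its lower bound of 2000 individuals. The function name `sequencing_effort` is illustrative and does not come from the paper.

```python
def sequencing_effort(n_individuals: int, depth: float) -> float:
    """Total effort in genome-fold units, assumed to be individuals x depth."""
    return n_individuals * depth


# Low-coverage design from the abstract: 3000 individuals at 4x depth.
low_coverage = sequencing_effort(3000, 4)

# Deep design from the abstract, at its lower bound: 2000 individuals at 30x.
deep_coverage = sequencing_effort(2000, 30)

# Fraction of the deep-sequencing effort used by the low-coverage design.
ratio = low_coverage / deep_coverage
print(f"{low_coverage:.0f} vs {deep_coverage:.0f} genome-fold units: {ratio:.0%}")
```

Because the abstract states the deep design as more than 2000 individuals, the ratio computed here is an upper bound on the relative effort under this simple model.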
Pages: 940-951
Page count: 12