Improving the Accuracy and Efficiency of Identity-by-Descent Detection in Population Data

Cited by: 440
Authors
Browning, Brian L. [1]
Browning, Sharon R. [2]
Affiliations
[1] Univ Washington, Dept Med, Div Med Genet, Seattle, WA 98195 USA
[2] Univ Washington, Dept Biostat, Seattle, WA 98195 USA
Source
GENETICS | 2013 / Vol. 194 / Issue 02
Funding
US National Institutes of Health; Wellcome Trust (UK);
Keywords
WHOLE-GENOME ASSOCIATION; GENOTYPE DATA; LINKAGE DISEQUILIBRIUM; POSITIVE SELECTION; BIRTH COHORT; VARIANTS; HERITABILITY; INDIVIDUALS; HAPLOTYPES; IMPUTATION;
DOI
10.1534/genetics.113.150029
Chinese Library Classification (CLC) number
Q3 [Genetics];
Discipline classification codes
071007 ; 090102 ;
Abstract
Segments of identity-by-descent (IBD) detected from high-density genetic data are useful for many applications, including long-range phase determination, phasing family data, imputation, IBD mapping, and heritability analysis in founder populations. We present Refined IBD, a new method for IBD segment detection. Refined IBD achieves both computational efficiency and highly accurate IBD segment reporting by searching for IBD in two steps. The first step (identification) uses the GERMLINE algorithm to find shared haplotypes exceeding a length threshold. The second step (refinement) evaluates candidate segments with a probabilistic approach to assess the evidence for IBD. Like GERMLINE, Refined IBD allows for IBD reporting on a haplotype level, which facilitates determination of multi-individual IBD and allows for haplotype-based downstream analyses. To investigate the properties of Refined IBD, we simulate SNP data from a model with recent superexponential population growth that is designed to match United Kingdom data. The simulation results show that Refined IBD achieves a better power/accuracy profile than fastIBD or GERMLINE. We find that a single run of Refined IBD achieves greater power than 10 runs of fastIBD. We also apply Refined IBD to SNP data for samples from the United Kingdom and from Northern Finland and describe the IBD sharing in these data sets. Refined IBD is powerful, highly accurate, and easy to use and is implemented in Beagle version 4.
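The two-step search described in the abstract can be illustrated with a toy sketch. This is not the Beagle implementation: the function names, the marker-count threshold, and the scoring model are all assumptions made for illustration. In particular, the refinement step here scores each marker independently from allele frequencies, whereas the abstract only says that the method uses a probabilistic approach, and the identification step is a plain scan for identical runs rather than the GERMLINE algorithm.

```python
import math


def candidate_segments(hap_a, hap_b, min_markers):
    """Step 1 (identification): find maximal runs of identical alleles.

    Returns half-open (start, end) marker index pairs for runs that span
    at least `min_markers` markers. A hypothetical stand-in for GERMLINE.
    """
    segments = []
    start = None
    n = min(len(hap_a), len(hap_b))
    for i in range(n):
        if hap_a[i] == hap_b[i]:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_markers:
                segments.append((start, i))
            start = None
    # Close a run that extends to the last marker.
    if start is not None and n - start >= min_markers:
        segments.append((start, n))
    return segments


def lod_score(hap_a, start, end, alt_freq):
    """Step 2 (refinement): log10 likelihood ratio of IBD versus non-IBD.

    Toy model (an assumption): markers are independent, so a shared allele
    with population frequency p has probability p under IBD and p * p
    otherwise, contributing -log10(p) to the score. Frequencies must lie
    strictly between 0 and 1.
    """
    lod = 0.0
    for i in range(start, end):
        p = alt_freq[i] if hap_a[i] == 1 else 1.0 - alt_freq[i]
        lod += -math.log10(p)
    return lod


def refined_segments(hap_a, hap_b, alt_freq, min_markers, min_lod):
    """Keep only candidate segments whose LOD score reaches `min_lod`."""
    kept = []
    for start, end in candidate_segments(hap_a, hap_b, min_markers):
        score = lod_score(hap_a, start, end, alt_freq)
        if score >= min_lod:
            kept.append((start, end, score))
    return kept


# Two toy haplotypes (0 = reference allele, 1 = alternate allele).
hap_a = [0, 1, 1, 0, 1, 0, 0, 1]
hap_b = [0, 1, 1, 0, 0, 0, 0, 1]
alt_freq = [0.5] * 8
result = refined_segments(hap_a, hap_b, alt_freq, min_markers=3, min_lod=1.0)
```

Separating a cheap candidate search from a more expensive scoring pass mirrors the efficiency argument in the abstract: only segments that pass the length filter are evaluated probabilistically.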
Pages: 459 / +
Page count: 16