Efficient identification of identical-by-descent status in pedigrees with many untyped individuals

Cited by: 20
Authors
Li, Xin [1]
Yin, Xiaolin [1]
Li, Jing [1]
Affiliation
[1] Case Western Reserve Univ, Dept Elect Engn & Comp Sci, Cleveland, OH 44106 USA
Funding
National Institutes of Health (USA);
Keywords
LINKAGE; COEFFICIENTS; ALGORITHM; MAPS;
DOI
10.1093/bioinformatics/btq222
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Motivation: Inference of identical-by-descent (IBD) probabilities is key to family-based linkage analysis. Using high-density single nucleotide polymorphism (SNP) markers, one can almost always infer the haplotype configuration of each member of a family, provided that all individuals are typed. Consequently, the IBD status can be obtained directly from haplotype configurations. In reality, however, many family members are not typed for practical reasons. The problem of IBD/haplotype inference is much harder when untyped individuals are treated as missing.
Results: We present a novel hidden Markov model (HMM) approach to infer the IBD status in a pedigree with many untyped members using high-density SNP markers. We introduce the concept of an inheritance-generating function, defined for any pair of alleles in a descent graph based on a pedigree structure. We derive a recursive formula for efficient calculation of the inheritance-generating function. By aggregating all possible inheritance patterns via an explicit representation of the number and lengths of all possible paths between two alleles, the inheritance-generating function provides a convenient way to theoretically derive the transition probabilities of the HMM. We further extend the basic HMM to incorporate population linkage disequilibrium (LD). Pedigree-wise IBD sharing can be constructed from pair-wise IBD relationships. Compared with traditional approaches for linkage analysis, our new model can efficiently infer IBD status without enumerating all possible genotypes and transmission patterns of untyped members in a family. Our approach can be reliably applied to large pedigrees with many untyped members, and the inferred IBD status can be used for non-parametric genome-wide linkage analysis.
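As a rough illustration of the general idea of inferring IBD status along a chromosome with an HMM, the sketch below runs forward-backward on a toy two-state chain (non-IBD / IBD) for one pair of haplotypes. It is not the model described in the abstract: the function name `ibd_posteriors`, the match/mismatch observations, and the parameters `prior_ibd`, `switch`, and `err` are all hypothetical, and the transition probabilities are set by hand rather than derived from an inheritance-generating function. Linkage disequilibrium is ignored.

```python
def ibd_posteriors(matches, match_freq, prior_ibd=0.25, switch=0.01, err=0.01):
    """Posterior P(IBD) per marker for a toy two-state HMM (0 = non-IBD, 1 = IBD).

    matches[t]    -- True if the two haplotypes carry the same allele at marker t
    match_freq[t] -- assumed chance that two unrelated haplotypes match at marker t
    prior_ibd, switch, err -- hypothetical parameters chosen for illustration only
    """
    n = len(matches)
    if n == 0:
        return []

    # Hand-set transitions whose stationary distribution equals the prior.
    to_ibd = switch * prior_ibd
    to_non = switch * (1.0 - prior_ibd)
    trans = ((1.0 - to_ibd, to_ibd), (to_non, 1.0 - to_non))

    def emit(t):
        # Probability of the observation at marker t under each state.
        p_non = match_freq[t] if matches[t] else 1.0 - match_freq[t]
        p_ibd = 1.0 - err if matches[t] else err
        return (p_non, p_ibd)

    # Forward pass, normalised at every marker to avoid underflow.
    fwd = []
    prev = (1.0 - prior_ibd, prior_ibd)
    for t in range(n):
        e = emit(t)
        if t == 0:
            a = [prev[s] * e[s] for s in (0, 1)]
        else:
            a = [e[s] * sum(prev[r] * trans[r][s] for r in (0, 1)) for s in (0, 1)]
        z = a[0] + a[1]
        prev = (a[0] / z, a[1] / z)
        fwd.append(prev)

    # Backward pass, also normalised.
    bwd = [(1.0, 1.0)] * n
    nxt = (1.0, 1.0)
    for t in range(n - 2, -1, -1):
        e = emit(t + 1)
        b = [sum(trans[s][r] * e[r] * nxt[r] for r in (0, 1)) for s in (0, 1)]
        z = b[0] + b[1]
        nxt = (b[0] / z, b[1] / z)
        bwd[t] = nxt

    # Combine the two passes into per-marker posteriors of the IBD state.
    post = []
    for t in range(n):
        u_non = fwd[t][0] * bwd[t][0]
        u_ibd = fwd[t][1] * bwd[t][1]
        post.append(u_ibd / (u_non + u_ibd))
    return post
```

A long run of matching alleles pushes the posterior toward IBD, while a run of mismatches pushes it toward non-IBD. A pedigree-aware model would replace the hand-set `trans` matrix with probabilities derived from the pedigree structure.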
Pages: i191-i198
Page count: 8