Minimal haplotype tagging

Citations: 121
Authors
Sebastiani, P
Lazarus, R
Weiss, ST
Kunkel, LM
Kohane, IS
Ramoni, MF
Affiliations
[1] Harvard Univ, Sch Med, Childrens Hosp, Informat Program, Boston, MA 02115 USA
[2] Boston Univ, Sch Publ Hlth, Dept Biostat, Boston, MA 02118 USA
[3] Brigham & Womens Hosp, Channing Lab, Boston, MA 02115 USA
[4] Harvard Univ, Partners Ctr Genet & Genom, Boston, MA 02115 USA
[5] Howard Hughes Med Inst, Div Genet, Boston, MA 02115 USA
[6] Childrens Hosp, Informat Program, Boston, MA 02115 USA
DOI
10.1073/pnas.1633613100
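Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code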
07 ; 0710 ; 09 ;
Abstract
The high frequency of single-nucleotide polymorphisms (SNPs) in the human genome presents an unparalleled opportunity to track down the genetic basis of common diseases. At the same time, the sheer number of SNPs makes genomewide disease association studies infeasible. The haplotypic nature of the human genome, however, lends itself to the selection of a parsimonious set of SNPs, called haplotype tagging SNPs (htSNPs), able to distinguish the haplotypic variations in a population. Current approaches rely on statistical analysis of transmission rates to identify htSNPs. In contrast to these approximate methods, this contribution describes an exact, analytical, and lossless method, called BEST (Best Enumeration of SNP Tags), able to identify the minimum set of SNPs tagging an arbitrary set of haplotypes from either pedigree or independent samples. Our results confirm that a small proportion of SNPs is sufficient to capture the haplotypic variations in a population and that this proportion decreases exponentially as the haplotype length increases. We used BEST to tag the haplotypes of 105 genes in an African-American and a European-American sample. An interesting finding of this analysis is that the vast majority (95%) of the htSNPs in the European-American sample are a subset of the htSNPs of the African-American sample. This result seems to provide further evidence that a severe bottleneck occurred during the founding of Europe and the conjectured "Out of Africa" event.
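To make the idea of a minimum tagging set concrete, the following is a minimal sketch and not the BEST algorithm itself, whose details are not given in this record. It assumes that a set of SNPs "tags" the haplotypes when every pair of distinct haplotypes still differs once restricted to those SNP positions, and it finds a smallest such set by brute-force enumeration. The function name `minimal_tag_snps` and the encoding of haplotypes as equal-length allele strings are illustrative assumptions.

```python
from itertools import combinations


def minimal_tag_snps(haplotypes):
    """Return a smallest tuple of SNP positions that keeps all distinct
    haplotypes distinguishable (hypothetical helper, brute-force search).

    `haplotypes` is a list of equal-length strings, one allele per SNP.
    """
    distinct = sorted(set(haplotypes))
    if not distinct:
        return ()
    n_snps = len(distinct[0])
    if any(len(h) != n_snps for h in distinct):
        raise ValueError("all haplotypes must have the same length")

    # Try subsets in order of increasing size, so the first hit is minimal.
    for size in range(n_snps + 1):
        for cols in combinations(range(n_snps), size):
            # Project every haplotype onto the candidate SNP positions.
            projected = {tuple(h[c] for c in cols) for h in distinct}
            if len(projected) == len(distinct):
                return cols
    # Not reached: the full SNP set always separates distinct haplotypes.
    return tuple(range(n_snps))


# Toy data: SNPs 0 and 1 are perfectly correlated, as are SNPs 2 and 3,
# so one SNP from each pair is enough to tell the four haplotypes apart.
example = ["0000", "0011", "1100", "1111"]
tags = minimal_tag_snps(example)
print(tags)
```

The exhaustive search is exponential in the number of SNPs, so this sketch is only practical for short haplotypes; it is meant to illustrate the lossless "minimum set" criterion described in the abstract rather than an efficient method.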
Pages: 9900-9905
Number of pages: 6