GenoSNP: a variational Bayes within-sample SNP genotyping algorithm that does not require a reference population

Cited by: 41
Authors
Giannoulatou, Eleni [1, 2]
Yau, Christopher [1, 2]
Colella, Stefano [3]
Ragoussis, Jiannis [3]
Holmes, Christopher C. [1, 4]
Affiliations
[1] Univ Oxford, Dept Stat, Oxford OX1 3TG, England
[2] Univ Oxford, Life Sci Interface Doctoral Training Ctr, Oxford OX1 3QD, England
[3] Wellcome Trust Ctr Human Genet, Genom Grp, Oxford OX2 7BN, England
[4] MRC, MRC Mammalian Genet Unit, Harwell OX11 0RD, Berks, England
Funding
Engineering and Physical Sciences Research Council (UK); Medical Research Council (UK)
DOI
10.1093/bioinformatics/btn386
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Current genotyping algorithms typically call genotypes by clustering allele-specific intensity data on a single nucleotide polymorphism (SNP) by SNP basis. This approach assumes the availability of a large number of control samples that have been sampled on the same array and platform. We have developed a SNP genotyping algorithm for the Illumina Infinium SNP genotyping assay that is entirely within-sample and requires neither a population of control samples nor parameters derived from such a population. Our algorithm exhibits high concordance with current methods and >99% call accuracy on HapMap samples. The ability to call genotypes using only within-sample information makes the method computationally light and practical for studies involving small sample sizes, and provides a valuable independent quality control metric for other population-based approaches.
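The within-sample idea in the abstract can be illustrated with a minimal sketch. It is not the GenoSNP model: the title names a variational Bayes method, whereas the code below swaps in plain EM on a three-component one-dimensional Gaussian mixture, fitted across all SNPs of a single sample. The function names, the polar contrast transform `theta`, the starting values and the 0.95 posterior threshold are all illustrative assumptions, not taken from the paper.

```python
import math

GENOTYPES = ("AA", "AB", "BB")


def theta(x, y):
    """Allelic contrast: near 0 for allele A only, near 1 for allele B only."""
    return (2.0 / math.pi) * math.atan2(y, x)


def normal_pdf(v, mean, var):
    return math.exp(-0.5 * (v - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def posteriors(v, weights, means, variances):
    """Posterior probability of each genotype cluster for one contrast value."""
    dens = [w * normal_pdf(v, m, s) for w, m, s in zip(weights, means, variances)]
    total = sum(dens)
    if total == 0.0:  # value is far from every cluster
        return [0.0, 0.0, 0.0]
    return [d / total for d in dens]


def call_within_sample(intensities, n_iter=50, min_posterior=0.95):
    """Call genotypes for all SNPs of ONE sample; no reference population used.

    intensities: list of (x, y) allele A / allele B intensity pairs, one per SNP.
    Returns one of "AA", "AB", "BB" or "NC" (no call) per SNP.
    """
    values = [theta(x, y) for x, y in intensities]
    if not values:
        return []
    # Assumed starting point: clusters near contrast 0, 0.5 and 1.
    weights = [1.0 / 3.0] * 3
    means = [0.1, 0.5, 0.9]
    variances = [0.01] * 3
    for _ in range(n_iter):
        # E-step: soft assignment of every SNP to the three clusters.
        resp = [posteriors(v, weights, means, variances) for v in values]
        # M-step: re-estimate each cluster from its weighted members.
        for k in range(3):
            nk = sum(r[k] for r in resp)
            if nk < 1e-8:
                continue  # empty cluster: keep previous parameters
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            variances[k] = max(var, 1e-4)  # floor avoids a collapsing cluster
    calls = []
    for v in values:
        r = posteriors(v, weights, means, variances)
        best = max(range(3), key=lambda k: r[k])
        calls.append(GENOTYPES[best] if r[best] >= min_posterior else "NC")
    return calls
```

Because the clusters are estimated from the many SNPs measured on one array rather than from many samples at one SNP, a sketch like this can run on a single sample, which is the property the abstract highlights for small studies.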
Pages: 2209-2214
Number of pages: 6