ALCHEMY: a reliable method for automated SNP genotype calling for small batch sizes and highly homozygous populations

Cited by: 37
Authors
Wright, Mark H. [1 ]
Tung, Chih-Wei [2 ]
Zhao, Keyan [1 ]
Reynolds, Andy [1 ]
McCouch, Susan R. [2 ]
Bustamante, Carlos D. [1 ]
Affiliations
[1] Cornell Univ, Dept Biol Stat & Computat Biol, Ithaca, NY 14853 USA
[2] Cornell Univ, Dept Genet & Plant Breeding, Ithaca, NY 14853 USA
Funding
National Science Foundation (USA);
Keywords
GENOME-WIDE ASSOCIATION; DRAFT SEQUENCE; COMPLEX TRAITS; HAPLOTYPE MAP; RICE; ALGORITHM; ARRAYS; MAIZE;
DOI
10.1093/bioinformatics/btq533
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Motivation: Developing a new high-throughput genotyping product requires a significant investment in testing and training samples to evaluate and optimize the product before it can be used reliably on new samples. One reason is that current methods for automated genotype calling are based on clustering approaches, which require either a large number of samples to be analyzed simultaneously or an extensive training dataset to seed the clusters. In systems where inbred samples are of primary interest, current clustering approaches perform poorly because a heterozygote cluster cannot be clearly identified.
Results: As part of the development of two custom single nucleotide polymorphism (SNP) genotyping products for Oryza sativa (domestic rice), we have developed a new genotype calling algorithm called 'ALCHEMY', based on statistical modeling of the raw intensity data rather than modelless clustering. A novel feature of the model is its ability to estimate and incorporate inbreeding information on a per-sample basis, allowing accurate genotyping of both inbred and heterozygous samples even when they are analyzed simultaneously. Because clustering is not used explicitly, ALCHEMY performs well on small sample sizes, with accuracy exceeding 99% with as few as 18 samples.
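The abstract does not spell out how per-sample inbreeding information enters the model. As an illustration only, the sketch below uses the standard population-genetics genotype frequencies under an inbreeding coefficient F, which may differ from what ALCHEMY actually implements. The function name genotype_priors and the parameter names p and f are hypothetical.

```python
def genotype_priors(p, f):
    """Prior genotype probabilities at a biallelic SNP.

    p: frequency of allele A (assumed known here, for illustration)
    f: inbreeding coefficient of the sample (0 = outbred, 1 = fully inbred)

    Textbook formula, not taken from the ALCHEMY paper:
      P(AA) = p^2 + f*p*q,  P(AB) = 2*p*q*(1 - f),  P(BB) = q^2 + f*p*q
    """
    if not (0.0 <= p <= 1.0 and 0.0 <= f <= 1.0):
        raise ValueError("p and f must both lie in [0, 1]")
    q = 1.0 - p
    return {
        "AA": p * p + f * p * q,
        "AB": 2.0 * p * q * (1.0 - f),
        "BB": q * q + f * p * q,
    }


# An outbred sample versus a highly inbred one at the same SNP.
outbred = genotype_priors(0.5, 0.0)
inbred = genotype_priors(0.5, 0.95)
print(outbred)
print(inbred)
```

With a high inbreeding coefficient the heterozygote prior shrinks toward zero, which is one plausible way a model-based caller could avoid forcing a heterozygote cluster onto inbred samples.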
Pages: 2952-2960
Number of pages: 9