Calling SNPs without a reference sequence

Cited by: 32
Authors
Ratan, Aakrosh [1 ]
Zhang, Yu [1 ]
Hayes, Vanessa M. [2 ]
Schuster, Stephan C. [1 ]
Miller, Webb [1 ]
Affiliations
[1] Penn State Univ, Ctr Comparat Genom & Bioinformat, University Pk, PA 16802 USA
[2] Univ New S Wales, Childrens Canc Inst Australia Med Res, Randwick, NSW, Australia
Source
BMC BIOINFORMATICS | 2010 / Vol. 11
Keywords
REDUCED REPRESENTATION; WHOLE-GENOME; DISCOVERY; SELECTION; HETEROZYGOSITY;
DOI
10.1186/1471-2105-11-130
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: The most common application for the next-generation sequencing technologies is resequencing, where short reads from the genome of an individual are aligned to a reference genome sequence for the same species. These mappings can then be used to identify genetic differences among individuals in a population, and perhaps ultimately to explain phenotypic variation. Many algorithms capable of aligning short reads to the reference and determining differences between them have been reported. Much less has been reported on how to use these technologies to determine genetic differences among individuals of a species for which a reference sequence is not available, which drastically limits the number of species that can easily benefit from these new technologies.
Results: We describe a computational pipeline, called DIAL (De novo Identification of Alleles), for identifying single-base substitutions between two closely related genomes without the help of a reference genome. The method works even when the depth of coverage is insufficient for de novo assembly, and it can be extended to determine small insertions/deletions. We evaluate the software's effectiveness using published Roche/454 sequence data from the genome of Dr. James Watson (to detect heterozygous positions) and recent Illumina data from orangutan, in each case comparing our results to those from computational analysis that uses a reference genome assembly. We also illustrate the use of DIAL to identify nucleotide differences among transcriptome sequences.
Conclusions: DIAL can be used for identification of nucleotide differences in species for which no reference sequence is available. Our main motivation is to use this tool to survey the genetic diversity of endangered species, as the identified sequence differences can be used to design genotyping arrays to assist in the species' management. The DIAL source code is freely available at http://www.bx.psu.edu/miller_lab/.
Pages: 13