Alignment of whole genomes

Cited by: 538
Authors
Delcher, AL
Kasif, S
Fleischmann, RD
Peterson, J
White, O
Salzberg, SL
Affiliations
[1] Inst Genom Res, Rockville, MD 20850 USA
[2] Loyola Coll, Dept Comp Sci, Baltimore, MD 21210 USA
[3] Celera Genom, Rockville, MD 20850 USA
[4] Univ Illinois, Dept Comp Sci & Elect Engn, Chicago, IL 60607 USA
DOI
10.1093/nar/27.11.2369
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
A new system for aligning whole genome sequences is described. Using an efficient data structure called a suffix tree, the system is able to rapidly align sequences containing millions of nucleotides. Its use is demonstrated on two strains of Mycobacterium tuberculosis, on two less similar species of Mycoplasma bacteria and on two syntenic sequences from human chromosome 12 and mouse chromosome 6. In each case it found an alignment of the input sequences, using between 30 s and 2 min of computation time. From the system output, information on single nucleotide changes, translocations and homologous genes can easily be extracted. Use of the algorithm should facilitate analysis of syntenic chromosomal regions, strain-to-strain comparisons, evolutionary comparisons and genomic duplications.
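As a rough illustration of why an index over suffixes makes shared exact matches between two sequences easy to locate, the sketch below finds one longest exact match common to two short strings. It is not the system described in the abstract: it substitutes a naively sorted list of suffixes for a suffix tree, so it runs in roughly quadratic time and is only suitable for toy inputs. The function name `longest_shared_match` and the separator characters `#` and `$` are assumptions made for this example.

```python
from typing import Tuple


def longest_shared_match(a: str, b: str) -> Tuple[int, int, int]:
    """Return (start_in_a, start_in_b, length) of one longest exact match.

    Assumes neither input contains the separator characters '#' or '$'.
    """
    text = a + "#" + b + "$"
    boundary = len(a)  # index of '#'; suffixes starting before it belong to a

    # Naive suffix ordering: sort suffix start positions by the suffix text.
    # A suffix tree would give the same grouping in linear time.
    order = sorted(range(len(text)), key=lambda i: text[i:])

    best = (0, 0, 0)
    # The longest match between a and b always shows up as the common
    # prefix of two suffixes that are adjacent in sorted order and come
    # from different sequences.
    for i, j in zip(order, order[1:]):
        if (i < boundary) == (j < boundary):
            continue  # both suffixes come from the same sequence
        n = 0
        while i + n < len(text) and j + n < len(text) and text[i + n] == text[j + n]:
            n += 1
        if n > best[2]:
            pos_a, pos_b = (i, j) if i < boundary else (j, i)
            best = (pos_a, pos_b - boundary - 1, n)
    return best


if __name__ == "__main__":
    print(longest_shared_match("ACGTTGCA", "TTACGTTG"))
```

A whole-genome aligner would need many such matches, chained in a consistent order, rather than a single longest one; this sketch only shows the match-finding idea on a small scale.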
Pages: 2369-2376
Page count: 8
Related papers
32 in total
[1]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[2]   BASIC LOCAL ALIGNMENT SEARCH TOOL [J].
ALTSCHUL, SF ;
GISH, W ;
MILLER, W ;
MYERS, EW ;
LIPMAN, DJ .
JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) :403-410
[3]  
Ansari-Lari MA, 1998, GENOME RES, V8, P29
[4]   Tandem repeats finder: a program to analyze DNA sequences [J].
Benson, G .
NUCLEIC ACIDS RESEARCH, 1999, 27 (02) :573-580
[5]   Predicting gene regulatory elements in silico on a genomic scale [J].
Brazma, A ;
Jonassen, I ;
Vilo, J ;
Ukkonen, E .
GENOME RESEARCH, 1998, 8 (11) :1202-1215
[6]  
Chao KM, 1997, COMPUT APPL BIOSCI, V13, P75
[7]  
CHAO KM, 1995, COMPUT APPL BIOSCI, V11, P147
[8]  
CHEN MT, 1985, COMBINATORIAL ALGORI, P97
[9]   Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence [J].
Cole, ST ;
Brosch, R ;
Parkhill, J ;
Garnier, T ;
Churcher, C ;
Harris, D ;
Gordon, SV ;
Eiglmeier, K ;
Gas, S ;
Barry, CE ;
Tekaia, F ;
Badcock, K ;
Basham, D ;
Brown, D ;
Chillingworth, T ;
Connor, R ;
Davies, R ;
Devlin, K ;
Feltwell, T ;
Gentles, S ;
Hamlin, N ;
Holroyd, S ;
Hornby, T ;
Jagels, K ;
Krogh, A ;
McLean, J ;
Moule, S ;
Murphy, L ;
Oliver, K ;
Osborne, J ;
Quail, MA ;
Rajandream, MA ;
Rogers, J ;
Rutter, S ;
Seeger, K ;
Skelton, J ;
Squares, R ;
Squares, S ;
Sulston, JE ;
Taylor, K ;
Whitehead, S ;
Barrell, BG .
NATURE, 1998, 393 (6685) :537-+
[10]   WHOLE-GENOME RANDOM SEQUENCING AND ASSEMBLY OF HAEMOPHILUS-INFLUENZAE RD [J].
FLEISCHMANN, RD ;
ADAMS, MD ;
WHITE, O ;
CLAYTON, RA ;
KIRKNESS, EF ;
KERLAVAGE, AR ;
BULT, CJ ;
TOMB, JF ;
DOUGHERTY, BA ;
MERRICK, JM ;
MCKENNEY, K ;
SUTTON, G ;
FITZHUGH, W ;
FIELDS, C ;
GOCAYNE, JD ;
SCOTT, J ;
SHIRLEY, R ;
LIU, LI ;
GLODEK, A ;
KELLEY, JM ;
WEIDMAN, JF ;
PHILLIPS, CA ;
SPRIGGS, T ;
HEDBLOM, E ;
COTTON, MD ;
UTTERBACK, TR ;
HANNA, MC ;
NGUYEN, DT ;
SAUDEK, DM ;
BRANDON, RC ;
FINE, LD ;
FRITCHMAN, JL ;
FUHRMANN, JL ;
GEOGHAGEN, NSM ;
GNEHM, CL ;
MCDONALD, LA ;
SMALL, KV ;
FRASER, CM ;
SMITH, HO ;
VENTER, JC .
SCIENCE, 1995, 269 (5223) :496-512