Chaos game representation for comparison of whole genomes

Times cited: 65
Authors
Joseph, Jijoy [1 ]
Sasikumar, Roschen [1 ]
Affiliation
[1] CSIR, Reg Res Lab, Thiruvananthapuram 695019, Kerala, India
Keywords
DOI
10.1186/1471-2105-7-243
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Chaos game representation (CGR) of genome sequences has been used for visual representation of genome sequence patterns as well as for alignment-free comparison of sequences based on oligonucleotide frequencies. However, the potential of this representation for making alignment-based comparisons of whole genome sequences has not been exploited. Results: We present here a fast algorithm for identifying all local alignments between two long DNA sequences using the sequence information contained in CGR points. The local alignments can be depicted graphically in a dot-matrix plot or in text form, and the significant similarities and differences between the two sequences can be identified. We demonstrate the method through comparison of whole genomes of several microbial species. Given two closely related genomes, we generate information on the mismatches, insertions, deletions and shuffles that differentiate the two genomes. Conclusion: Adding large-scale sequence alignment to the repertoire of alignment-free sequence analysis applications of chaos game representation positions CGR as a powerful sequence analysis tool.
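The idea that a CGR point carries sequence information can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not detail. It only shows the standard chaos game iteration and how the number of trailing bases shared by two positions can be read back from their coordinates. The corner assignment, the starting point (0.5, 0.5), and the function names `cgr_points` and `shared_suffix_length` are assumptions made for illustration.

```python
# Assumed corner convention (one commonly used layout); other layouts work the same way.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq, start=(0.5, 0.5)):
    """Return one CGR point per base: each step moves halfway to the base's corner."""
    x, y = start
    points = []
    for base in seq.upper():
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def shared_suffix_length(p, q, max_k):
    """Count how many trailing bases two CGR points have in common.

    Doubling a coordinate exposes the corner of the most recent base as the
    integer part; the fractional part is the previous CGR point. max_k should
    not exceed the number of bases that precede either point (inclusive).
    """
    (x1, y1), (x2, y2) = p, q
    k = 0
    while k < max_k:
        x1, y1, x2, y2 = 2 * x1, 2 * y1, 2 * x2, 2 * y2
        c1 = (int(x1), int(y1))
        c2 = (int(x2), int(y2))
        if c1 != c2:
            break  # the bases at this offset differ
        # Strip the corner to step one base further back in both sequences.
        x1, y1 = x1 - c1[0], y1 - c1[1]
        x2, y2 = x2 - c2[0], y2 - c2[1]
        k += 1
    return k


if __name__ == "__main__":
    a = cgr_points("ACGT")
    b = cgr_points("GCGT")
    # The two sequences differ only in their first base.
    print(shared_suffix_length(a[-1], b[-1], max_k=4))
```

In double-precision arithmetic this read-back is only reliable for a few dozen bases, so a practical implementation for whole genomes would need a different encoding of the same information.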
Pages: 10
Related papers
12 in total
[1]   Analysis of genomic sequences by Chaos Game Representation [J].
Almeida, JS ;
Carriço, JA ;
Maretzek, A ;
Noble, PA ;
Fletcher, M .
BIOINFORMATICS, 2001, 17 (05) :429-437
[2]  
BASU S, 1992, J MOL BIOL, V228, P715
[3]   Strategies and tools for whole-genome alignments [J].
Couronne, O ;
Poliakov, A ;
Bray, N ;
Ishkhanov, T ;
Ryaboy, D ;
Rubin, E ;
Pachter, L ;
Dubchak, I .
GENOME RESEARCH, 2003, 13 (01) :73-80
[4]   Genomic signature: Characterization and classification of species assessed by chaos game representation of sequences [J].
Deschavanne, PJ ;
Giron, A ;
Vilain, J ;
Fagot, G ;
Fertil, B .
MOLECULAR BIOLOGY AND EVOLUTION, 1999, 16 (10) :1391-1399
[5]   NUCLEOTIDE, DINUCLEOTIDE AND TRINUCLEOTIDE FREQUENCIES EXPLAIN PATTERNS OBSERVED IN CHAOS GAME REPRESENTATIONS OF DNA-SEQUENCES [J].
GOLDMAN, N .
NUCLEIC ACIDS RESEARCH, 1993, 21 (10) :2487-2491
[6]  
HILL KA, 1992, J MOL EVOL, V35, P261
[7]   CHAOS GAME REPRESENTATION OF GENE STRUCTURE [J].
JEFFREY, HJ .
NUCLEIC ACIDS RESEARCH, 1990, 18 (08) :2163-2170
[8]   Versatile and open software for comparing large genomes [J].
Kurtz, S ;
Phillippy, A ;
Delcher, AL ;
Smoot, M ;
Shumway, M ;
Antonescu, C ;
Salzberg, SL .
GENOME BIOLOGY, 2004, 5 (02)
[9]   SSAHA: A fast search method for large DNA databases [J].
Ning, ZM ;
Cox, AJ ;
Mullikin, JC .
GENOME RESEARCH, 2001, 11 (10) :1725-1729
[10]   ENTROPIC PROFILES OF DNA-SEQUENCES THROUGH CHAOS-GAME-DERIVED IMAGES [J].
OLIVER, JL ;
BERNAOLAGALVAN, P ;
GUERREROGARCIA, J ;
ROMANROLDAN, R .
JOURNAL OF THEORETICAL BIOLOGY, 1993, 160 (04) :457-470