Alignment-free comparison;
Graphical representation;
DNA sequence;
Numerical characterization;
Phylogenetic tree;
2D GRAPHICAL REPRESENTATION;
FEATURE FREQUENCY PROFILES;
DNA-SEQUENCES;
PROTEIN SEQUENCES;
SIMILARITY;
PHYLOGENY;
DISSIMILARITY;
DOI:
10.1016/j.jtbi.2011.04.003
CLC number:
Q [Biological Sciences];
Discipline classification code:
090105 [Crop Production Systems and Ecological Engineering];
Abstract:
In order to compare different genome sequences, an alignment-free method is proposed. First, we present a new graphical representation of DNA sequences without degeneracy, which is conducive to intuitive comparison of sequences. Then, a new numerical characterization based on this representation is introduced to quantitatively depict the intrinsic nature of genome sequences; it is treated as a 10-dimensional vector. Alignment-free comparison of sequences is performed by computing the distances between the vectors of the corresponding numerical characterizations, which define the evolutionary relationship among the sequences. Two data sets of DNA sequences were constructed to assess the performance of the method on sequence comparison. The results demonstrate the validity of the method. The new numerical characterization provides a powerful tool for genome comparison. (C) 2011 Elsevier Ltd. All rights reserved.
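
The vector-distance step described in the abstract can be illustrated with a minimal sketch. The abstract does not define the 10-dimensional characterization or name the distance measure, so the code below rests on assumptions: `placeholder_vector` is a hypothetical stand-in descriptor (relative base frequencies, 4 dimensions) and is not the paper's characterization, Euclidean distance is an assumed choice, and the toy sequences are invented for illustration only.

```python
import math
from collections import Counter


def placeholder_vector(seq):
    """Hypothetical stand-in descriptor: relative frequencies of A, C, G, T.

    This is NOT the 10-dimensional characterization from the paper; it only
    gives the distance step below something concrete to operate on.
    """
    counts = Counter(seq.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T characters")
    return [counts[base] / total for base in "ACGT"]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_matrix(seqs):
    """Pairwise distances between the descriptor vectors of named sequences."""
    vectors = {name: placeholder_vector(s) for name, s in seqs.items()}
    return {
        (a, b): euclidean(vectors[a], vectors[b])
        for a in vectors
        for b in vectors
    }


# Toy sequences, invented for illustration; not data from the paper.
toy = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "GGGGCCCC"}
dist = distance_matrix(toy)
```

A matrix of this kind is the usual input to a distance-based tree-building procedure. Swapping in a different descriptor only requires replacing `placeholder_vector`, since the comparison itself never aligns the sequences.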