A new distance measure for comparing sequence profiles based on path lengths along an entropy surface

Cited by: 12
Authors
Benson, G [1]
Affiliations
[1] CUNY Mt Sinai Sch Med, Dept Biomath Sci, New York, NY 10029 USA
Keywords
DOI
10.1093/bioinformatics/18.suppl_2.S44
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
We describe a new distance measure for comparing DNA sequence profiles. For this measure, columns in a multiple alignment are treated as character frequency vectors (sum of the frequencies equal to one). The distance between two vectors is based on minimum path length along an entropy surface. Path length is estimated using a random graph generated on the entropy surface and Dijkstra's algorithm for all shortest paths to a source. We use the new distance measure to analyze similarities within families of tandem repeats in the C. elegans genome and show that this new measure gives more accurate refinement of family relationships than a method based on comparing consensus sequences.
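The estimation procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the abstract does not say how the random graph is sampled or connected, so the uniform sampling of frequency vectors, the k-nearest-neighbor edges, the straight-line edge weights between lifted points, and all function names below are assumptions made for illustration only.

```python
import heapq
import math
import random


def entropy(p):
    """Shannon entropy in bits of a frequency vector."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def lift(p):
    """Place a frequency vector on the entropy surface as (p, H(p))."""
    return tuple(p) + (entropy(p),)


def random_frequency_vector(rng, k=4):
    """Sample uniformly from the simplex by normalizing exponential draws."""
    draws = [rng.expovariate(1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def build_graph(points, neighbors=8):
    """Assumed construction: join each point to its nearest lifted neighbors."""
    lifted = [lift(p) for p in points]
    graph = {i: {} for i in range(len(points))}
    for i, a in enumerate(lifted):
        near = sorted((math.dist(a, b), j) for j, b in enumerate(lifted) if j != i)
        for d, j in near[:neighbors]:
            graph[i][j] = d  # edge weight: chord length between surface points
            graph[j][i] = d
    return graph


def dijkstra(graph, source):
    """Shortest path lengths from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def surface_distances(source_vec, target_vecs, n_random=300, seed=0):
    """Estimate path lengths from one column vector to several others."""
    rng = random.Random(seed)
    points = [source_vec] + list(target_vecs)
    points += [random_frequency_vector(rng) for _ in range(n_random)]
    dist = dijkstra(build_graph(points), 0)
    # Unreachable targets are reported as infinity.
    return [dist.get(i + 1, math.inf) for i in range(len(target_vecs))]
```

Calling `surface_distances([0.7, 0.1, 0.1, 0.1], [[0.25, 0.25, 0.25, 0.25]])` returns one estimated path length per target column. Because every edge is a straight chord between two surface points, the estimate can never be shorter than the direct chord between the endpoints, and under these assumptions it should tighten toward the true surface path as `n_random` grows.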
Pages: S44-S53
Page count: 10
Related papers
30 items in total
[1]  
ALTSCHUL SF, 1990, J MOL BIOL, V215, P403, DOI 10.1006/jmbi.1990.9999
[2]   SUSCEPTIBILITY TO HUMAN TYPE-1 DIABETES AT IDDM2 IS DETERMINED BY TANDEM REPEAT VARIATION AT THE INSULIN GENE MINISATELLITE LOCUS [J].
BENNETT, ST ;
LUCASSEN, AM ;
GOUGH, SCL ;
POWELL, EE ;
UNDLIEN, DE ;
PRITCHARD, LE ;
MERRIMAN, ME ;
KAWAGUCHI, Y ;
DRONSFIELD, MJ ;
POCIOT, F ;
NERUP, J ;
BOUZEKRI, N ;
CAMBONTHOMSEN, A ;
RONNINGEN, KS ;
BARNETT, AH ;
BAIN, SC ;
TODD, JA .
NATURE GENETICS, 1995, 9 (03) :284-292
[3]   Tandem repeats finder: a program to analyze DNA sequences [J].
Benson, G .
NUCLEIC ACIDS RESEARCH, 1999, 27 (02) :573-580
[4]   Sequence alignment with tandem duplication [J].
Benson, G .
JOURNAL OF COMPUTATIONAL BIOLOGY, 1997, 4 (03) :351-367
[5]  
Cormen T. H., 1990, INTRO ALGORITHMS
[6]  
Dijkstra E.W., 1959, Numerische mathematik, V1, P269, DOI 10.1007/BF01386390
[7]  
FLECHE PL, 2001, BMC MICROBIOL, V1, P2
[8]   AN UNSTABLE TRIPLET REPEAT IN A GENE RELATED TO MYOTONIC MUSCULAR-DYSTROPHY [J].
FU, YH ;
PIZZUTI, A ;
FENWICK, RG ;
KING, J ;
RAJNARAYAN, S ;
DUNNE, PW ;
DUBEL, J ;
NASSER, GA ;
ASHIZAWA, T ;
DEJONG, P ;
WIERINGA, B ;
KORNELUK, R ;
PERRYMAN, MB ;
EPSTEIN, HF ;
CASKEY, CT .
SCIENCE, 1992, 255 (5049) :1256-1258
[9]  
GRIBSKOV M, 1990, METHOD ENZYMOL, V183, P146
[10]   Multiple-locus variable-number tandem repeat analysis reveals genetic relationships within Bacillus anthracis [J].
Keim, P ;
Price, LB ;
Klevytska, AM ;
Smith, KL ;
Schupp, JM ;
Okinaka, R ;
Jackson, PJ ;
Hugh-Jones, ME .
JOURNAL OF BACTERIOLOGY, 2000, 182 (10) :2928-2936