Determination of bias in the relative abundance of oligonucleotides in DNA sequences

Cited by: 16
Author
Elhai, J [1]
Affiliation
[1] Univ Richmond, Dept Biol, Richmond, VA 23173 USA
Keywords
compositional bias; genome; GC composition; palindrome; restriction/modification
DOI
10.1089/106652701300312922
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Different statistical measures of bias of oligonucleotide sequences in DNA sequences were compared, both by theoretical analysis and according to their abilities to predict the relative abundances of oligonucleotides in the genome of Escherichia coli. The expected frequency of an oligonucleotide calculated from a maximal order Markov model was shown to be a degenerate case of the expected frequency calculated from biases of all subwords, arising when noncontiguous subwords exhibit no bias. Since (at least in E. coli) noncontiguous sequences exhibit significant bias, the total compositional bias approach is expected to represent biases in genomic sequences more faithfully than Markov approaches. In fact, statistics based on Markov analysis, even at the highest order, were inferior to methods that factored out biases of internal subwords with gaps in predicting actual frequencies of oligonucleotides. Using total compositional bias as a measure of relative abundance, tetranucleotide and hexanucleotide palindromes were found to be distributed differently from nonpalindromic sequences, with their means shifted somewhat towards underrepresentation. A subpopulation of palindromic hexanucleotides, however, was highly underrepresented, and this group consisted almost entirely of targets for Type II restriction enzymes found within strains of E. coli. Sites recognized by Type I endonucleases from related strains were not markedly biased, and with pentanucleotides, palindromic and nonpalindromic sequences had nearly identical distributions. The loss of restriction sites may be explained by the free transfer of plasmids encoding restriction enzymes and episodic selection for the presence of the enzymes.
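For readers unfamiliar with the Markov baseline mentioned in the abstract, the sketch below shows the commonly used maximal-order Markov estimate of a word's expected count, N(prefix) × N(suffix) / N(core), where prefix and suffix are the word minus its last or first base and core is the word minus both. It is only an illustration of that baseline, not the total compositional bias method the article develops. The function names are hypothetical, and the code assumes a single linear strand with overlapping occurrences counted (no reverse complement, no circular genome).

```python
from collections import Counter


def count_words(seq, k):
    """Count overlapping words of length k on one linear strand."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def markov_expected(seq, word):
    """Expected count of `word` under a maximal-order Markov model.

    Uses the standard estimator N(prefix) * N(suffix) / N(core).
    Assumption: this is the textbook form, not code from the article.
    """
    k = len(word)
    if k < 3:
        raise ValueError("word must have at least 3 bases")
    km1 = count_words(seq, k - 1)
    km2 = count_words(seq, k - 2)
    core = km2[word[1:-1]]
    if core == 0:
        return 0.0  # the core never occurs, so the word cannot occur
    return km1[word[:-1]] * km1[word[1:]] / core


def observed_over_expected(seq, word):
    """Ratio of observed to Markov-expected count; None if expected is 0."""
    expected = markov_expected(seq, word)
    if expected == 0:
        return None
    return count_words(seq, len(word))[word] / expected


if __name__ == "__main__":
    toy = "ACGTACGTACGT"  # toy sequence, not real genomic data
    print(markov_expected(toy, "ACGT"), observed_over_expected(toy, "ACGT"))
```

A ratio well below 1 would suggest underrepresentation relative to this baseline. According to the abstract, such Markov estimates ignore bias in noncontiguous (gapped) subwords, which is the limitation the total compositional bias approach is meant to address.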
Pages: 151-175
Page count: 25