OVER-REPRESENTATION AND UNDER-REPRESENTATION OF SHORT OLIGONUCLEOTIDES IN DNA-SEQUENCES

Cited by: 306
Authors
BURGE, C [1]
CAMPBELL, AM [1]
KARLIN, S [1]
Affiliation
[1] STANFORD UNIV,DEPT BIOL SCI,STANFORD,CA 94305
DOI
10.1073/pnas.89.4.1358
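Chinese Library Classification number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code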
07 ; 0710 ; 09 ;
Abstract
Strand-symmetric relative abundance functionals for di-, tri-, and tetranucleotides are introduced and applied to sequences encompassing a broad phylogenetic range to discern tendencies and anomalies in the occurrences of these short oligonucleotides within and between genomic sequences. For dinucleotides, TA is almost universally under-represented, with the exception of vertebrate mitochondrial genomes, and CG is strongly under-represented in vertebrates and in mitochondrial genomes. The traditional methylation/deamination/mutation hypothesis for the rarity of CG does not adequately account for the observed deficiencies in certain sequences, notably the mitochondrial genomes, yeast, and Neurospora crassa, which lack the standard CpG methylase. Homodinucleotides (AA.TT, CC.GG) and larger homooligonucleotides are over-represented in many organisms, perhaps due to polymerase slippage events. For trinucleotides, GCA.TGC tends to be under-represented in phage, human viral, and eukaryotic sequences, and CTA.TAG is strongly under-represented in many prokaryotic, eukaryotic, and viral sequences. The CCA.TGG triplet is ubiquitously over-represented in human viral and eukaryotic sequences. Among the tetranucleotides, several four-base-pair palindromes tend to be under-represented in phage sequences, probably as a means of restriction avoidance. The tetranucleotide CTAG is observed to be rare in virtually all bacterial genomes and some phage genomes. Explanations for these over- and under-representations in terms of DNA/RNA structures and regulatory mechanisms are considered.
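The abstract names strand-symmetric relative abundance functionals but does not state their formulas. As a minimal sketch for the dinucleotide case only, the Python below assumes the commonly used odds-ratio form rho*(XY) = f*(XY) / (f*(X) f*(Y)), where the starred frequencies are counted over the sequence together with its reverse complement. The function names (reverse_complement, symmetric_dinucleotide_odds) are illustrative and not taken from the paper, and the tri- and tetranucleotide functionals are not covered.

```python
from collections import Counter

BASES = "ACGT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def symmetric_dinucleotide_odds(seq):
    """Assumed odds-ratio form: rho*(XY) = f*(XY) / (f*(X) * f*(Y)).

    Frequencies are pooled over the sequence and its reverse complement,
    which makes the result strand-symmetric. Dinucleotides containing a
    base that never occurs are omitted from the result.
    """
    seq = seq.upper()
    mono = Counter()
    di = Counter()
    for strand in (seq, reverse_complement(seq)):
        # Count single bases, ignoring ambiguity codes such as N.
        mono.update(c for c in strand if c in BASES)
        # Count overlapping dinucleotides made only of A, C, G, T.
        for i in range(len(strand) - 1):
            pair = strand[i:i + 2]
            if pair[0] in BASES and pair[1] in BASES:
                di[pair] += 1

    mono_total = sum(mono.values())
    di_total = sum(di.values())
    if mono_total == 0 or di_total == 0:
        return {}

    odds = {}
    for x in BASES:
        for y in BASES:
            fx = mono[x] / mono_total
            fy = mono[y] / mono_total
            if fx == 0 or fy == 0:
                continue  # ratio undefined when a base is absent
            odds[x + y] = (di[x + y] / di_total) / (fx * fy)
    return odds


if __name__ == "__main__":
    rho = symmetric_dinucleotide_odds("AACGTTAGCCTAGGATCCA")
    for pair in sorted(rho):
        print(pair, round(rho[pair], 3))
```

Under this assumed definition, a value well below 1 marks a dinucleotide as under-represented relative to its base composition and a value well above 1 marks it as over-represented. Pooling both strands guarantees that a dinucleotide and its reverse complement (for example AA and TT, or CA and TG) receive the same value, which matches the paired notation used in the abstract.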
Pages: 1358 - 1362
Number of pages: 5
Related papers
24 in total
  • [1] COMPOSITIONAL PATTERNS IN VERTEBRATE GENOMES - CONSERVATION AND CHANGE IN EVOLUTION
    BERNARDI, G
    MOUCHIROUD, D
    GAUTIER, C
    BERNARDI, G
    [J]. JOURNAL OF MOLECULAR EVOLUTION, 1988, 28 (1-2) : 7 - 18
  • [2] EVOLUTION OF THE GENOME AND THE GENETIC-CODE - SELECTION AT THE DINUCLEOTIDE LEVEL BY METHYLATION AND POLYRIBONUCLEOTIDE CLEAVAGE
    BEUTLER, E
    GELBART, T
    HAN, JH
    KOZIOL, JA
    BEUTLER, B
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1989, 86 (01) : 192 - 196
  • [3] CPG-RICH ISLANDS AND THE FUNCTION OF DNA METHYLATION
    BIRD, AP
    [J]. NATURE, 1986, 321 (6067) : 209 - 213
  • [4] DNA METHYLATION AND DEVELOPMENT
    CEDAR, H
    RAZIN, A
    [J]. BIOCHIMICA ET BIOPHYSICA ACTA, 1990, 1049 (01) : 1 - 8
  • [5] PALINDROMIC UNITS ARE PART OF A NEW BACTERIAL INTERSPERSED MOSAIC ELEMENT (BIME)
    GILSON, E
    SAURIN, W
    PERRIN, D
    BACHELLIER, S
    HOFNUNG, M
    [J]. NUCLEIC ACIDS RESEARCH, 1991, 19 (07) : 1375 - 1383
  • [6] NUCLEOTIDE-SEQUENCE AND EXPRESSION OF ESCHERICHIA-COLI TRPR, THE STRUCTURAL GENE FOR THE TRP APOREPRESSOR
    GUNSALUS, RP
    YANOFSKY, C
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA-BIOLOGICAL SCIENCES, 1980, 77 (12) : 7117 - 7121
  • [7] HOLLANDER M, 1973, NONPARAMETRIC STATIS
  • [8] DEVIATIONS FROM EXPECTED FREQUENCIES OF CPG DINUCLEOTIDES IN HERPESVIRUS DNAS MAY BE DIAGNOSTIC OF DIFFERENCES IN THE STATES OF THEIR LATENT GENOMES
    HONESS, RW
    GOMPELS, UA
    BARRELL, BG
    CRAXTON, M
    CAMERON, KR
    STADEN, R
    CHANG, YN
    HAYWARD, GS
    [J]. JOURNAL OF GENERAL VIROLOGY, 1989, 70 : 837 - 855
  • [10] JOSSE J, 1961, J BIOL CHEM, V236, P864