Natural selection and algorithmic design of mRNA

Cited by: 22
Authors
Cohen, B [1]
Skiena, S
Affiliations
[1] New Jersey Inst Technol, Dept Comp Sci, Univ Heights, Newark, NJ 07102 USA
[2] SUNY Stony Brook, Dept Comp Sci, Stony Brook, NY 11794 USA
Keywords
RNA secondary structure; natural selection; synonymous coding sequence; RNA stability; molecular design
DOI
10.1089/10665270360688101
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Messenger RNA (mRNA) sequences serve as templates for proteins according to the triplet code, in which each of the 4^3 = 64 different codons (sequences of three consecutive nucleotide bases) in RNA either terminates translation or maps to one of the 20 different amino acids (or residues) that build up proteins. Because there are more codons than residues, there is inherent redundancy in the coding. Certain residues (e.g., tryptophan) have only a single corresponding codon, while other residues (e.g., arginine) have as many as six corresponding codons. This freedom implies that the number of possible RNA sequences coding for a given protein grows exponentially in the length of the protein. Thus nature has wide latitude to select among mRNA sequences that are informationally equivalent but structurally and energetically divergent. In this paper, we explore how nature takes advantage of this freedom and how to algorithmically design structures more energetically favorable than those built through natural selection. In particular: (1) Natural selection: we perform the first large-scale computational experiment comparing the stability of mRNA sequences from a variety of organisms to random synonymous sequences that respect the codon preferences of the organism. This experiment was conducted on over 27,000 sequences from 34 microbial species with 36 genomic structures. We provide evidence that in all genomic structures highly stable sequences are disproportionately abundant, and that in 19 of 36 cases highly unstable sequences are disproportionately abundant. This suggests that the stability of mRNA sequences is subject to natural selection. (2) Artificial selection: motivated by these biological results, we examine the algorithmic problem of designing the most stable and least stable mRNA sequences that code for a target protein.
We give a polynomial-time dynamic programming solution to the most stable sequence problem (MSSP), which is asymptotically no more complex than secondary structure prediction. We show that the corresponding least stable sequence problem (LSSP) is NP-complete, and develop two heuristics for the construction of such sequences. We have implemented these algorithms, and present experimental results placing the high/low stability sequences in context with both wildtype and random encodings. Our implementation has already been applied to the design of RNA "code-words" creating little or no secondary structure in RNA computing (Brenneman and Condon, 2001; Marathe et al., 2001), and we anticipate a variety of other applications of this work to sequence design problems (Skiena, 2001).
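The redundancy argument in the abstract can be made concrete with a minimal Python sketch. It is not the authors' implementation: the function names (`count_synonymous`, `random_synonymous`, `translate`) and the `usage` parameter are hypothetical, and the sketch assumes the standard genetic code. Codons are drawn uniformly unless a codon-weight table is supplied, whereas the experiment described above uses each organism's own codon preferences.

```python
import random
from itertools import product
from math import prod

# Standard genetic code, codons enumerated with bases in the order U, C, A, G.
# '*' marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TO_AA = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# Invert the table: residue -> list of synonymous codons.
AA_TO_CODONS = {}
for codon, aa in CODON_TO_AA.items():
    AA_TO_CODONS.setdefault(aa, []).append(codon)


def count_synonymous(protein):
    """Number of distinct mRNA sequences that encode `protein`."""
    return prod(len(AA_TO_CODONS[aa]) for aa in protein)


def random_synonymous(protein, rng, usage=None):
    """Sample one synonymous mRNA sequence for `protein`.

    `usage` is an optional, hypothetical codon -> weight mapping standing in
    for an organism's codon preferences; codons missing from it get weight 1.
    Without `usage`, synonymous codons are chosen uniformly (an assumption).
    """
    chosen = []
    for aa in protein:
        codons = AA_TO_CODONS[aa]
        weights = [usage.get(c, 1.0) for c in codons] if usage else None
        chosen.append(rng.choices(codons, weights=weights)[0])
    return "".join(chosen)


def translate(mrna):
    """Translate an mRNA string whose length is a multiple of three."""
    return "".join(CODON_TO_AA[mrna[i:i + 3]] for i in range(0, len(mrna), 3))


if __name__ == "__main__":
    protein = "MRWL"  # Met, Arg, Trp, Leu
    print(count_synonymous(protein))
    sample = random_synonymous(protein, random.Random(0))
    print(sample, translate(sample) == protein)
```

Because the count is a product of per-residue codon counts, it grows exponentially with protein length whenever residues with more than one codon keep appearing, which is the freedom the abstract refers to.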
Pages: 419-432
Page count: 14
Related papers
26 in total
  • [1] Atkins J. F., 1999, RNA WORLD
  • [2] Brenneman A, 2001, STRAND DESIGN BIOMOL
  • [3] De novo protein design: Fully automated sequence selection
    Dahiyat, BI
    Mayo, SL
    [J]. SCIENCE, 1997, 278 (5335) : 82 - 87
  • [4] de Bruijn NG, 1946, KONINKLIJKE NEDERLAN, V49, P758
  • [5] Design of multistable RNA molecules
    Flamm, C
    Hofacker, IL
    Maurer-Stroh, S
    Stadler, PF
    Zehl, M
    [J]. RNA, 2001, 7 (02) : 254 - 265
  • [6] Chargaff's legacy
    Forsdyke, DR
    Mortimer, JR
    [J]. GENE, 2000, 261 (01) : 127 - 137
  • [7] Good IJ, 1946, J LOND MATH SOC, V21, P167
  • [8] Fast folding and comparison of RNA secondary structures
    Hofacker, IL
    Fontana, W
    Stadler, PF
    Bonhoeffer, LS
    Tacker, M
    Schuster, P
    [J]. MONATSHEFTE FUR CHEMIE, 1994, 125 (02) : 167 - 188
  • [9] Gradients in nucleotide and codon usage along Escherichia coli genes
    Hooper, SD
    Berg, OG
    [J]. NUCLEIC ACIDS RESEARCH, 2000, 28 (18) : 3517 - 3523
  • [10] Improved predictions of secondary structures for RNA
    Jaeger, JA
    Turner, DH
    Zuker, M
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1989, 86 (20) : 7706 - 7710