PREDICTION OF HUMAN MESSENGER-RNA DONOR AND ACCEPTOR SITES FROM THE DNA-SEQUENCE

Cited by: 631
Authors
BRUNAK, S [1 ]
ENGELBRECHT, J [1 ]
KNUDSEN, S [1 ]
Affiliations
[1] HARVARD UNIV, SCH PUBL HLTH, DANA FARBER CANC INST, BOSTON, MA 02115 USA
Keywords
INTRON-SPLICING; HUMAN GENES; EXON SELECTION; NEURAL NETWORK; COMPUTER-PREDICTION;
DOI
10.1016/0022-2836(91)90380-O
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification code
071010 ; 081704 ;
Abstract
Artificial neural networks have been applied to the prediction of splice site location in human pre-mRNA. A joint prediction scheme, in which the prediction of transition regions between introns and exons regulates a cutoff level for splice site assignment, was able to predict splice site locations with confidence levels far better than previously reported in the literature. The problem of predicting donor and acceptor sites in human genes is hampered by the presence of large numbers of false positives; here the distribution of these false splice sites is examined and linked to a possible scenario for the splicing mechanism in vivo. When the presented method detects 95% of the true donor and acceptor sites, it makes less than 0.1% false donor site assignments and less than 0.4% false acceptor site assignments. For the large data set used in this study, this means that on average there are one and a half false donor sites per true donor site and six false acceptor sites per true acceptor site. With the joint assignment method, more than a fifth of the true donor sites and around one fourth of the true acceptor sites could be detected without any accompanying false positive predictions. Highly confident splice sites could not be isolated with a widely used weight matrix method or by separate splice site networks. A complementary relation between the confidence levels of the coding/non-coding and the separate splice site networks was observed, with many weak splice sites having sharp transitions in the coding/non-coding signal and many stronger splice sites having more ill-defined transitions between coding and non-coding. © 1991.
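The relation between the percentage rates and the per-site counts quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the paper: it assumes that the false assignment rate is measured over all non-site positions, that "per true site" counts all true sites in the data set, and that the quoted "less than" rates can be treated as approximate values. The function names are hypothetical.

```python
def false_per_true_site(false_positive_rate: float,
                        negatives_per_true_site: float) -> float:
    """Expected false assignments per true site, given the fraction of
    non-site positions wrongly flagged and the number of non-site
    positions per true site (both assumptions, see lead-in)."""
    return false_positive_rate * negatives_per_true_site


def implied_negatives_per_true_site(false_positive_rate: float,
                                    false_per_true: float) -> float:
    """Invert the relation above: how many non-site positions per true
    site would make the two quoted figures consistent."""
    return false_per_true / false_positive_rate


# Figures quoted in the abstract, with "less than" read as approximate.
donor_ratio = implied_negatives_per_true_site(0.001, 1.5)     # 0.1%, 1.5 per site
acceptor_ratio = implied_negatives_per_true_site(0.004, 6.0)  # 0.4%, 6 per site

print(f"non-site positions per true donor site:    about {donor_ratio:.0f}")
print(f"non-site positions per true acceptor site: about {acceptor_ratio:.0f}")
```

Under these assumptions both the donor and the acceptor figures imply roughly the same ratio of non-site positions to true sites, which illustrates why even a very small false assignment rate yields more false sites than true ones.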
Pages: 49-65
Number of pages: 17