Method of predicting Splice Sites based on signal interactions

Cited by: 22
Authors
Churbanov, Alexander [1]
Rogozin, Igor B.
Deogun, Jitender S.
Ali, Hesham
Affiliations
[1] Univ Nebraska, Coll Informat Sci & Technol, Dept Comp Sci, Omaha, NE 68182 USA
[2] Natl Lib Med, Natl Ctr Biotechnol Informat, NIH, Bethesda, MD 20894 USA
[3] Univ Nebraska, Dept Comp Sci & Engn, Lincoln, NE 68588 USA
DOI
10.1186/1745-6150-1-10
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Discipline classification code
07; 0710; 09
Abstract
Background: Predicting and properly ranking canonical splice sites (SSs) is a challenging problem in the bioinformatics and machine learning communities. Any progress in SS recognition will lead to a better understanding of the splicing mechanism. We introduce several new approaches for combining a priori knowledge to improve SS detection. First, we design a new Bayesian SS sensor based on oligonucleotide counting. To further enhance prediction quality, we apply our new de novo motif detection tool, MHMMotif, to intronic ends and exons. We combine the elements found with sensor information using a Naive Bayesian Network, as implemented in our new tool SpliceScan.
Results: According to our tests, the Bayesian sensor outperforms the contemporary Maximum Entropy sensor for 5' SS detection. We report a number of putative Exonic (ESE) and Intronic (ISE) Splicing Enhancers found by the MHMMotif tool. T-test statistics on mouse/rat intronic alignments indicate that the detected elements are, on average, more conserved than other oligos, which supports our assumption of their functional importance. SpliceScan has been shown to outperform the SpliceView, GeneSplicer, NNSplice, Genio and NetUTR tools on the test set of human genes. It also outperforms all contemporary ab initio gene structure prediction tools on the set of 5' UTR gene fragments.
Conclusion: The designed methods have many attractive properties compared to existing approaches. The Bayesian sensor, MHMMotif program and SpliceScan tool are freely available on our web site.
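The idea of a sensor built on oligonucleotide counting can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `kmer_counts` and `train_log_odds`, the choice of k = 3, the pseudocount, and the toy sequences are all assumptions made for illustration. The sketch counts overlapping k-mers in known sites and in decoys, then scores a candidate window by summing per-k-mer log-odds, which treats the k-mers as conditionally independent in the naive Bayes sense.

```python
import math
from collections import Counter


def kmer_counts(seqs, k):
    """Count every overlapping k-mer across a list of sequences."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts


def train_log_odds(true_sites, decoys, k=3, pseudocount=1.0):
    """Return a scorer that sums k-mer log-odds (site vs. decoy) over a window.

    Hypothetical helper: position-independent counts with additive smoothing,
    chosen for brevity rather than taken from the paper.
    """
    pos = kmer_counts(true_sites, k)
    neg = kmer_counts(decoys, k)
    n_kmers = 4 ** k  # number of possible DNA k-mers, used for smoothing
    pos_total = sum(pos.values()) + pseudocount * n_kmers
    neg_total = sum(neg.values()) + pseudocount * n_kmers

    def score(window):
        total = 0.0
        for i in range(len(window) - k + 1):
            kmer = window[i:i + k]
            p = (pos[kmer] + pseudocount) / pos_total
            q = (neg[kmer] + pseudocount) / neg_total
            total += math.log(p / q)
        return total

    return score


# Toy sequences invented for illustration only (not data from the paper).
true_sites = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA", "GAGGTAAGT"]
decoys = ["CCTGTCCTA", "ATCGTTCAG", "TTAGTCCGA", "GCAGTTTCC"]
score = train_log_odds(true_sites, decoys)
```

A window whose k-mers are enriched in the known sites receives a positive score, and one that resembles the decoys receives a negative score. Candidate sites can then be ranked by this value.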
Pages: 23