Classification of real and pseudo microRNA precursors using local structure-sequence features and support vector machine

Cited by: 330
Authors
Xue, CH
Li, F
He, T
Liu, GP
Li, YD
Zhang, XG [1]
Affiliations
[1] Tsinghua Univ, Dept Automat, MOE Key Lab Bioinformat, Beijing 100084, Peoples R China
[2] Chinese Acad Sci, Inst Automat, Lab Complex Syst & Intelligence Sci, Beijing 100080, Peoples R China
[3] Univ Glamorgan, Sch Elect, Pontypridd CF37 1DL, M Glam, Wales
DOI
10.1186/1471-2105-6-310
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Background: MicroRNAs (miRNAs) are a group of short (~22 nt) non-coding RNAs that play important regulatory roles. MiRNA precursors (pre-miRNAs) are characterized by their hairpin structures. However, a large number of similar hairpins can be folded in many genomes. Almost all current methods for computational prediction of miRNAs use comparative genomic approaches to identify putative pre-miRNAs from candidate hairpins. An ab initio method for distinguishing pre-miRNAs from sequence segments with pre-miRNA-like hairpin structures is lacking. Being able to classify real vs. pseudo pre-miRNAs is important both for understanding the nature of miRNAs and for developing ab initio prediction methods that can discover new miRNAs without known homology.
Results: A set of novel features of local contiguous structure-sequence information is proposed for distinguishing the hairpins of real pre-miRNAs from those of pseudo pre-miRNAs. A support vector machine (SVM) is applied to these features to classify real vs. pseudo pre-miRNAs, achieving about 90% accuracy on human data. Remarkably, the SVM classifier built on human data can correctly identify up to 90% of the pre-miRNAs from other species, including plants and viruses, without utilizing any comparative genomics information.
Conclusion: The local structure-sequence features reflect discriminative and conserved characteristics of miRNAs, and the successful ab initio classification of real and pseudo pre-miRNAs opens a new approach for discovering new miRNAs.
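The abstract does not spell out how the local structure-sequence features are computed. The following is a minimal Python sketch of one plausible reading, in which each position contributes a "triplet" made of its nucleotide and the paired/unpaired status of itself and its two neighbours, giving 4 x 8 = 32 possible features. The function name `triplet_features`, the dot-bracket input format, the collapsing of both bracket directions into one "paired" symbol, and the normalization by total count are all assumptions made for illustration, not details confirmed by this record.

```python
from itertools import product

NUCLEOTIDES = "ACGU"
# 32 labels such as "A(((" or "U.(.": middle nucleotide + 3-position pairing pattern.
TRIPLET_LABELS = [n + "".join(s) for n in NUCLEOTIDES for s in product("(.", repeat=3)]


def triplet_features(sequence, structure):
    """Return normalized frequencies of the 32 structure-sequence triplets.

    `structure` is assumed to be a dot-bracket string of the same length as
    `sequence`, e.g. as produced by an RNA secondary-structure folding tool.
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    seq = sequence.upper().replace("T", "U")
    # Assumption: only paired vs. unpaired matters, so ")" is treated as "(".
    paired = structure.replace(")", "(")
    counts = dict.fromkeys(TRIPLET_LABELS, 0)
    for i in range(1, len(seq) - 1):
        label = seq[i] + paired[i - 1:i + 2]
        if label in counts:  # skip ambiguous bases such as N
            counts[label] += 1
    total = sum(counts.values())
    return [counts[k] / total if total else 0.0 for k in TRIPLET_LABELS]


# Toy hairpin (hypothetical data, not taken from the paper).
features = triplet_features("GGGAAACCC", "(((...)))")
```

A 32-dimensional vector of this kind could then be passed to any SVM implementation as the input for real vs. pseudo pre-miRNA classification; kernel and parameter choices are left open here.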
Pages: 7