microPred: effective classification of pre-miRNAs for human miRNA gene prediction

Cited by: 188
Authors
Batuwita, Rukshan [1]
Palade, Vasile [1]
Affiliations
[1] Univ Oxford, Oxford Univ Comp Lab, Oxford OX1 3QD, England
Keywords
SUPPORT VECTOR MACHINES; MICRORNAS; IDENTIFICATION; RNA; GENOMICS
DOI
10.1093/bioinformatics/btp107
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: In this article, we show that the classification of human precursor microRNA (pre-miRNA) hairpins from both genome pseudo hairpins and other non-coding RNAs (ncRNAs) is a common and essential requirement for both comparative and non-comparative computational recognition of human miRNA genes. However, the existing computational methods do not address this issue completely or successfully. Here we present the development of an effective classifier system (named microPred) for this classification problem by using appropriate machine learning techniques. Our approach includes the introduction of more representative datasets, extraction of new biologically relevant features, feature selection, handling of the class imbalance problem in the datasets and extensive classifier performance evaluation via systematic cross-validation methods.

Results: Our microPred classifier yielded higher and, especially, much more reliable classification results in terms of both sensitivity (90.02%) and specificity (97.28%) than the existing pre-miRNA classification methods. When validated with 6095 non-human animal pre-miRNAs and 139 virus pre-miRNAs from miRBase, microPred resulted in 92.71% (5651/6095) and 94.24% (131/139) recognition rates, respectively.
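
The recognition rates quoted in the Results are simple ratios of recognized sequences to sequences tested. The sketch below reproduces that arithmetic and also gives the standard textbook definitions of sensitivity and specificity. The function names are illustrative only and are not taken from microPred. The abstract does not report the confusion-matrix counts behind the 90.02% and 97.28% figures, so those are not recomputed here. The first ratio evaluates to about 92.715%, which matches the reported 92.71% when truncated to two decimals.

```python
def recognition_rate(recognized: int, total: int) -> float:
    """Percentage of tested sequences that the classifier recognized."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * recognized / total


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Standard definition: TP / (TP + FN), as a percentage."""
    return 100.0 * true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Standard definition: TN / (TN + FP), as a percentage."""
    return 100.0 * true_neg / (true_neg + false_pos)


# Counts quoted in the abstract (recognized / tested).
animal_rate = recognition_rate(5651, 6095)  # non-human animal pre-miRNAs
virus_rate = recognition_rate(131, 139)     # virus pre-miRNAs

print(f"animal: {animal_rate:.3f}%  virus: {virus_rate:.3f}%")
```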
Pages: 989-995
Number of pages: 7