MBSTAR: multiple instance learning for predicting specific functional binding sites in microRNA targets

Cited by: 41
Authors
Bandyopadhyay, Sanghamitra [1]
Ghosh, Dip [1]
Mitra, Ramkrishna [2]
Zhao, Zhongming [2,3,4,5]
Affiliations
[1] Indian Stat Inst, Machine Intelligence Unit, Kolkata, India
[2] Vanderbilt Univ, Sch Med, Dept Biomed Informat, Nashville, TN 37203 USA
[3] Vanderbilt Univ, Sch Med, Dept Canc Biol, Nashville, TN 37232 USA
[4] Vanderbilt Univ, Sch Med, Dept Psychiat, Nashville, TN 37212 USA
[5] Vanderbilt Univ, Ctr Quantitat Sci, Nashville, TN 37232 USA
Source
SCIENTIFIC REPORTS | 2015 / Vol. 5
Funding
National Institutes of Health (USA);
Keywords
MESSENGER-RNAS; IDENTIFICATION; DATABASE;
DOI
10.1038/srep08004
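Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07; 0710; 09;
Abstract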
MicroRNA (miRNA) regulates gene expression by binding to specific sites in the 3' untranslated regions of its target genes. Machine learning based miRNA target prediction algorithms first extract a set of features from potential binding sites (PBSs) in the mRNA and then train a classifier to distinguish targets from non-targets. However, they do not consider whether the PBSs are functional, and consequently produce high false positive rates. This substantially affects follow-up functional validation by experiments. We present a novel machine learning based approach, MBSTAR (Multiple instance learning of Binding Sites of miRNA TARgets), for accurate prediction of true or functional miRNA binding sites. A multiple instance learning framework is adopted to handle the lack of information about the actual binding sites in the target mRNAs. A total of 9531 interacting and 973 non-interacting biologically validated miRNA-mRNA pairs are identified from Tarbase 6.0 and confirmed with a PAR-CLIP dataset. MBSTAR achieves the highest number of binding sites overlapping with PAR-CLIP, with a maximum F-score of 0.337. Compared to the other methods, MBSTAR also predicts target mRNAs with the highest accuracy.
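The multiple instance learning idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard MIL convention, in which a bag (here, one miRNA-mRNA pair) is labeled positive if at least one of its instances (candidate binding sites) is positive. The abstract does not state which MIL algorithm or features MBSTAR uses, so the per-site scores, the 0.5 threshold, and the function names `bag_score`, `predict_bags`, and `f_score` are all hypothetical and serve only to show the bag/instance relationship and how an F-score is computed.

```python
from typing import Dict, List


def bag_score(instance_scores: List[float]) -> float:
    """Score a bag by its most confident instance (standard MIL assumption).

    An empty bag (no candidate binding sites) gets a score of 0.0.
    """
    if not instance_scores:
        return 0.0
    return max(instance_scores)


def predict_bags(bags: Dict[str, List[float]], threshold: float = 0.5) -> Dict[str, bool]:
    """Label each bag positive when its best instance reaches the threshold.

    The threshold value is an illustrative assumption, not taken from the paper.
    """
    return {name: bag_score(scores) >= threshold for name, scores in bags.items()}


def f_score(tp: int, fp: int, fn: int) -> float:
    """F-score as the harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical per-site scores for three miRNA-mRNA pairs (made-up values).
example_bags = {
    "pair_a": [0.1, 0.9],   # one strong candidate site -> bag is positive
    "pair_b": [0.2, 0.3],   # no site reaches the threshold -> bag is negative
    "pair_c": [],           # no candidate sites at all -> bag is negative
}
predictions = predict_bags(example_bags)
```

Under this convention, a pair is called a target as soon as one candidate site is scored as functional, which is why the approach does not need to know in advance which site in a validated pair is the real one.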
Pages: 12