TargetMiner: microRNA target prediction with systematic identification of tissue-specific negative examples

Cited by: 178
Authors
Bandyopadhyay, Sanghamitra [1 ]
Mitra, Ramkrishna [1 ]
Affiliations
[1] Indian Stat Inst, Machine Intelligence Unit, Kolkata, India
Keywords
GENES;
DOI
10.1093/bioinformatics/btp503
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Motivation: Prediction of microRNA (miRNA) target mRNAs using machine learning approaches is an important area of research. However, most methods suffer from either high false positive or high false negative rates. One reason is the marked deficiency of negative examples, i.e. miRNA non-target pairs. Systematic identification of non-target mRNAs has still not been addressed properly, so current machine learning approaches are compelled to rely on artificially generated negative examples for training.
Results: In this article, we have identified ~300 tissue-specific negative examples using a novel approach that involves expression profiling of both miRNAs and mRNAs, miRNA-mRNA structural interactions and seed-site conservation. The newly generated negative examples are validated with a pSILAC dataset, which confirms that the identified non-targets are indeed non-targets. These high-throughput tissue-specific negative examples and a set of experimentally verified positive examples are then used to build a system called TargetMiner, a support vector machine (SVM)-based classifier. In addition to assessing the prediction accuracy in cross-validation experiments, TargetMiner has been validated with a completely independent experimental test dataset. Our method outperforms 10 existing target prediction algorithms and provides a good balance between sensitivity and specificity that is not reflected in the existing methods. On the completely independent test dataset, we achieve a significantly higher sensitivity and specificity of 69% and 67.8% based on a pool of 90 features, and 76.5% and 66.1% using a set of 30 selected features. To establish the effectiveness of the systematically generated negative examples, the SVM is also trained using a different set of negative data generated using the method in Yousef et al. A significantly higher false positive rate (70.6%) is observed when this classifier is tested on the independent set, while all other factors are kept the same. Conversely, when an existing method (NBmiRTar) is executed with our proposed negative data, we observe an improvement in its performance. These results clearly establish the effectiveness of the proposed approach of selecting the negative examples systematically.
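The abstract reports sensitivity, specificity and false positive rate side by side, so it may help to recall how the three relate. The following is a minimal sketch, not part of TargetMiner: the function name `classification_rates` and the confusion-matrix counts used in the example are hypothetical and chosen only for illustration. It uses the standard definitions, under which the false positive rate equals 1 minus the specificity.

```python
from typing import NamedTuple


class Rates(NamedTuple):
    sensitivity: float
    specificity: float
    false_positive_rate: float


def classification_rates(tp: int, fn: int, tn: int, fp: int) -> Rates:
    """Compute standard rates from confusion-matrix counts.

    tp/fn: true targets predicted as target / non-target.
    tn/fp: true non-targets predicted as non-target / target.
    (Hypothetical helper for illustration only.)
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    sensitivity = tp / (tp + fn)          # fraction of true targets recovered
    specificity = tn / (tn + fp)          # fraction of non-targets rejected
    false_positive_rate = fp / (fp + tn)  # equals 1 - specificity
    return Rates(sensitivity, specificity, false_positive_rate)


if __name__ == "__main__":
    # Made-up counts, NOT taken from the paper.
    rates = classification_rates(tp=8, fn=2, tn=6, fp=4)
    print(rates)
```

Read this way, the reported specificity of 67.8% with the 90-feature model corresponds to a false positive rate of about 32.2%, which is the figure to compare against the 70.6% observed with the alternative negative set.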
Pages: 2625-2631
Page count: 7
Related papers
26 items in total
  • [1] Microarray profiling of microRNAs reveals frequent coexpression with neighboring miRNAs and host genes
    Baskerville, S
    Bartel, DP
    [J]. RNA, 2005, 11 (03) : 241 - 247
  • [2] The microRNA.org resource: targets and expression
    Betel, Doron
    Wilson, Manda
    Gabow, Aaron
    Marks, Debora S.
    Sander, Chris
    [J]. NUCLEIC ACIDS RESEARCH, 2008, 36 : D149 - D153
  • [3] Enright AJ, 2004, GENOME BIOL, V5
  • [4] Most mammalian mRNAs are conserved targets of microRNAs
    Friedman, Robin C.
    Farh, Kyle Kai-How
    Burge, Christopher B.
    Bartel, David P.
    [J]. GENOME RESEARCH, 2009, 19 (01) : 92 - 105
  • [5] MicroRNA targeting specificity in mammals: Determinants beyond seed pairing
    Grimson, Andrew
    Farh, Kyle Kai-How
    Johnston, Wendy K.
    Garrett-Engele, Philip
    Lim, Lee P.
    Bartel, David P.
    [J]. MOLECULAR CELL, 2007, 27 (01) : 91 - 105
  • [6] Human MicroRNA targets
    John, B
    Enright, AJ
    Aravin, A
    Tuschl, T
    Sander, C
    Marks, DS
    [J]. PLOS BIOLOGY, 2004, 2 (11) : 1862 - 1879
  • [7] Genome-wide survey of human alternative pre-mRNA splicing with exon junction microarrays
    Johnson, JM
    Castle, J
    Garrett-Engele, P
    Kan, ZY
    Loerch, PM
    Armour, CD
    Santos, R
    Schadt, EE
    Stoughton, R
    Shoemaker, DD
    [J]. SCIENCE, 2003, 302 (5653) : 2141 - 2144
  • [8] Detection of genes with tissue-specific expression patterns using Akaike's information criterion procedure
    Kadota, K
    Nishimura, SI
    Bono, H
    Nakamura, S
    Hayashizaki, Y
    Okazaki, Y
    Takahashi, K
    [J]. PHYSIOLOGICAL GENOMICS, 2003, 12 (03) : 251 - 259
  • [9] The role of site accessibility in microRNA target recognition
    Kertesz, Michael
    Iovino, Nicola
    Unnerstall, Ulrich
    Gaul, Ulrike
    Segal, Eran
    [J]. NATURE GENETICS, 2007, 39 (10) : 1278 - 1284
  • [10] A combined computational-experimental approach predicts human microRNA targets
    Kiriakidou, M
    Nelson, PT
    Kouranov, A
    Fitziev, P
    Bouyioukos, C
    Mourelatos, Z
    Hatzigeorgiou, A
    [J]. GENES & DEVELOPMENT, 2004, 18 (10) : 1165 - 1178