POODLE-L: a two-level SVM prediction system for reliably predicting long disordered regions

Cited by: 107
Authors
Hirose, Shuichi [1]
Shimizu, Kana
Kanai, Satoru
Kuroda, Yutaka
Noguchi, Tamotsu
Affiliations
[1] PharmaDesign Inc, Tokyo 1040032, Japan
[2] Natl Inst Adv Ind Sci & Technol, Tokyo 1350064, Japan
[3] Tokyo Univ Agr & Technol, Grad Sch Engn, Dept Biotechnol & Life Sci, Koganei, Tokyo 1848588, Japan
DOI
10.1093/bioinformatics/btm302
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: Recent experimental and theoretical studies have revealed several proteins containing sequence segments that are unfolded under physiological conditions. These segments are called disordered regions. They are actively investigated because of their possible involvement in various biological processes, such as cell signaling and transcriptional and translational regulation. Additionally, disordered regions can represent a major obstacle to high-throughput proteome analysis and often need to be removed from experimental targets. The accurate prediction of long disordered regions is thus expected to provide annotations that are useful for a wide range of applications.
Results: We developed Prediction Of Order and Disorder by machine LEarning (POODLE-L; L stands for long), a Support Vector Machine (SVM)-based method for predicting long disordered regions using 10 kinds of simple physico-chemical properties of amino acids. POODLE-L assembles the output of 10 two-level SVM predictors into a final prediction of disordered regions. In predicting long disordered regions, POODLE-L achieved a Matthews correlation coefficient of 0.658, the highest among the methods compared, which included eight well-established, publicly available disordered region predictors.
Availability: POODLE-L is freely available at http://mbs.cbrc.jp/poodle/poodle-l.html
Contact: hirose-shuichi@aist.go.jp
Supplementary information: Supplementary data are available at Bioinformatics online.
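Note on the metric: the abstract reports performance as a Matthews correlation coefficient (MCC). As a minimal sketch of how that metric is computed from per-residue binary labels, the following may help. The function name `mcc_binary` and the toy labels are illustrative assumptions, not code or data from the paper.

```python
import math


def mcc_binary(y_true, y_pred):
    """Matthews correlation coefficient for binary labels.

    Labels are assumed to be 1 (disordered) or 0 (ordered); this
    encoding is an assumption made for the example.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventionally reported as 0 when any marginal count is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy, made-up per-residue labels (not taken from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 0, 1, 1]
mcc = mcc_binary(y_true, y_pred)
print(mcc)  # 0.5 (TP=3, TN=3, FP=1, FN=1)
```

MCC ranges from -1 (complete disagreement) through 0 (no better than chance) to 1 (perfect prediction), and it remains informative when ordered and disordered residues are unevenly represented.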
Pages: 2046-2053
Page count: 8