Protein disorder prediction at multiple levels of sensitivity and specificity

Cited by: 39
Authors
Hecker, Joshua [3]
Yang, Jack Y. [4]
Cheng, Jianlin [1, 2]
Affiliations
[1] Univ Missouri, Dept Comp Sci, Columbia, MO 65211 USA
[2] Univ Missouri, Inst Informat, Columbia, MO 65211 USA
[3] Univ Cent Florida, Sch Elect Engn & Comp Sci, Orlando, FL 32816 USA
[4] Harvard Univ, Cambridge, MA 02140 USA
Keywords
Protein Data Bank; Decision Threshold; Protein Structure Prediction; Protein Disorder; Disorder Region
DOI
10.1186/1471-2164-9-S1-S9
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Background: Many protein regions and some entire proteins have no definite tertiary structure, existing instead as dynamic, disordered ensembles under different physicochemical circumstances. Identification of these protein disorder regions is important for protein production, protein structure prediction and determination, and protein function annotation. A number of disorder prediction software packages and web services have been developed since the first predictor was designed by Dunker's lab in 1997. However, most of these packages use a pre-defined threshold to classify residues as ordered or disordered. In many situations, users need to select ordered or disordered residues at different sensitivity and specificity levels. Results: Here we benchmark a state-of-the-art disorder predictor, DISpro, on a large protein disorder dataset created from the Protein Data Bank and systematically evaluate the relationship between sensitivity and specificity. We also extend its functionality to allow users to trade off specificity and sensitivity by setting different decision thresholds. Moreover, we compare DISpro with seven other automated disorder predictors on the 95 protein targets used in the seventh edition of the Critical Assessment of Techniques for Protein Structure Prediction (CASP7). DISpro is ranked as one of the best predictors. Conclusion: The evaluation and extension of DISpro make it a more valuable and useful tool for structural and functional genomics.
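The threshold trade-off described in the abstract can be illustrated with a minimal sketch. It assumes a predictor that outputs one disorder probability per residue and a reference labelling of each residue as disordered (1) or ordered (0). The function names, the input format, and the toy values are hypothetical and chosen for illustration only; they are not DISpro's interface or data.

```python
from typing import List, Sequence, Tuple


def sensitivity_specificity(
    probs: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at one decision threshold.

    probs  : hypothetical per-residue disorder probabilities in [0, 1]
    labels : 1 for a truly disordered residue, 0 for an ordered one
    A residue is called disordered when its probability >= threshold.
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")

    tp = fn = tn = fp = 0
    for p, y in zip(probs, labels):
        predicted_disordered = p >= threshold
        if y == 1:
            if predicted_disordered:
                tp += 1  # disordered residue correctly called disordered
            else:
                fn += 1  # disordered residue missed
        else:
            if predicted_disordered:
                fp += 1  # ordered residue wrongly called disordered
            else:
                tn += 1  # ordered residue correctly called ordered

    # Sensitivity: fraction of disordered residues recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of ordered residues correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


def sweep_thresholds(
    probs: Sequence[float], labels: Sequence[int], thresholds: Sequence[float]
) -> List[Tuple[float, float, float]]:
    """Return (threshold, sensitivity, specificity) for each threshold."""
    return [(t, *sensitivity_specificity(probs, labels, t)) for t in thresholds]


if __name__ == "__main__":
    # Toy values, not real predictions.
    toy_probs = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1]
    toy_labels = [1, 1, 1, 0, 0, 0]
    for t, sens, spec in sweep_thresholds(toy_probs, toy_labels, [0.25, 0.5, 0.85]):
        print(f"threshold={t:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

On these toy values, lowering the threshold recovers more disordered residues but mislabels more ordered ones, and raising it does the reverse. That is the choice a fixed, pre-defined threshold takes away from the user.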
Pages: 7