DNdisorder: predicting protein disorder using boosting and deep networks

Cited by: 62
Authors
Eickholt, Jesse [1]
Cheng, Jianlin [1, 2, 3]
Affiliations
[1] Univ Missouri, Dept Comp Sci, Columbia, MO 65211 USA
[2] Univ Missouri, Inst Informat, Columbia, MO 65211 USA
[3] Univ Missouri, C Bond Life Sci Ctr, Columbia, MO 65211 USA
Source
BMC BIOINFORMATICS | 2013 / Vol. 14
Keywords
Protein disorder prediction; Disordered regions; Deep networks; Deep learning; NATIVELY UNFOLDED PROTEINS; INTRINSIC DISORDER; UNSTRUCTURED REGIONS; ACCURATE PREDICTION; WEB SERVER; DEFINITION
DOI
10.1186/1471-2105-14-88
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: A number of proteins contain regions that do not adopt a stable tertiary structure in their native state. Such regions, known as disordered regions, have been shown to participate in many vital cell functions and are increasingly being examined as drug targets. Results: This work presents a new sequence-based approach for the prediction of protein disorder. The method uses boosted ensembles of deep networks to make predictions and participated in the CASP10 experiment. In a 10-fold cross-validation procedure on a dataset of 723 proteins, the method achieved an average balanced accuracy of 0.82 and an area under the ROC curve of 0.90. These results are achieved in part by a boosting procedure that steadily increases balanced accuracy and the area under the ROC curve over several rounds. The method also performed competitively when evaluated against a number of state-of-the-art disorder predictors on the CASP9 and CASP10 benchmark datasets. Conclusions: DNdisorder is available as a web service at http://iris.rnet.missouri.edu/dndisorder/.
Pages: 10