Reduced amino acid alphabet is sufficient to accurately recognize intrinsically disordered protein

Times cited: 109
Authors
Weathers, EA
Paulaitis, ME
Woolf, TB
Hoh, JH
Affiliations
[1] Johns Hopkins Univ, Dept Chem & Biomol Engn, Baltimore, MD 21218 USA
[2] Johns Hopkins Sch Med, Dept Physiol, Baltimore, MD 21205 USA
[3] Johns Hopkins Univ, Dept Biophys, Baltimore, MD 21205 USA
[4] Johns Hopkins Sch Med, Dept Biophys & Biophys Chem, Baltimore, MD 21218 USA
Source
FEBS LETTERS | 2004 / Vol. 576 / Issue 03
Keywords
unstructured protein; support vector machine; amino acid composition; protein classification; sequence complexity;
DOI
10.1016/j.febslet.2004.09.036
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Intrinsically disordered proteins are an important class of proteins with unique functions and properties. Here, we have applied a support vector machine (SVM) trained on naturally occurring disordered and ordered proteins to examine the contribution of various parameters (vectors) to recognizing proteins that contain disordered regions. We find that an SVM that incorporates only amino acid composition has a recognition accuracy of 87 ± 2%. This result suggests that composition alone is sufficient to accurately recognize disorder. Interestingly, SVMs using reduced sets of amino acids based on chemical similarity preserve high recognition accuracy. A set as small as four retains an accuracy of 84 ± 2%; this suggests that general physicochemical properties rather than specific amino acids are important factors contributing to protein disorder. © 2004 Published by Elsevier B.V. on behalf of the Federation of European Biochemical Societies.
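As a rough illustration of the feature vectors the abstract describes, the following Python sketch computes a 20-component amino acid composition and a reduced four-group composition for a sequence. The four groups in `GROUPS` are a hypothetical partition by broad chemical similarity, chosen only for illustration; the abstract does not list the groupings used in the paper, and the function names are likewise assumptions. Vectors of this kind could then be passed to any SVM implementation for training.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order for the feature vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical four-group reduction by broad chemical similarity.
# This is an assumption for illustration, not the grouping from the paper.
GROUPS = {
    "hydrophobic": "AVLIMC",
    "aromatic": "FWYH",
    "polar": "STNQGP",
    "charged": "DEKR",
}
RESIDUE_TO_GROUP = {aa: name for name, members in GROUPS.items() for aa in members}


def _standard_residues(sequence):
    """Return the standard residues of a sequence, ignoring other symbols."""
    residues = [aa for aa in sequence.upper() if aa in RESIDUE_TO_GROUP]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    return residues


def composition(sequence):
    """Fraction of each of the 20 standard amino acids (20 components)."""
    residues = _standard_residues(sequence)
    counts = Counter(residues)
    return [counts[aa] / len(residues) for aa in AMINO_ACIDS]


def reduced_composition(sequence):
    """Fraction of residues in each of the four hypothetical groups."""
    residues = _standard_residues(sequence)
    counts = Counter(RESIDUE_TO_GROUP[aa] for aa in residues)
    return [counts[name] / len(residues) for name in GROUPS]


if __name__ == "__main__":
    example = "MSEEKKRRDDSSPPGG"  # made-up sequence, not from any dataset
    print(composition(example))
    print(reduced_composition(example))
```

Both functions return fractions that sum to 1, so sequences of different lengths map to comparable vectors; the reduced version simply pools the counts of chemically similar residues before normalizing.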
Pages: 348-352
Page count: 5
Related papers
34 in total
[1]  
ANDORF CM, 2003, IN PRESS INFORM SCI
[2]   Predicting properties of intrinsically unstructured proteins [J].
Bright, JN ;
Woolf, TB ;
Hoh, JH .
PROGRESS IN BIOPHYSICS & MOLECULAR BIOLOGY, 2001, 76 (03) :131-173
[3]   Entropic exclusion by neurofilament sidearms: A mechanism for maintaining interfilament spacing [J].
Brown, HG ;
Hoh, JH .
BIOCHEMISTRY, 1997, 36 (49) :15035-15040
[4]   Support vector machines for prediction of protein subcellular location by incorporating quasi-sequence-order effect [J].
Cai, YD ;
Liu, XJ ;
Xu, XB ;
Chou, KC .
JOURNAL OF CELLULAR BIOCHEMISTRY, 2002, 84 (02) :343-348
[5]  
De Gennes PG., 1979, SCALING CONCEPTS POL
[6]   DOMINANT FORCES IN PROTEIN FOLDING [J].
DILL, KA .
BIOCHEMISTRY, 1990, 29 (31) :7133-7155
[7]   Intrinsic disorder and protein function [J].
Dunker, AK ;
Brown, CJ ;
Lawson, JD ;
Iakoucheva, LM ;
Obradovic, Z .
BIOCHEMISTRY, 2002, 41 (21) :6573-6582
[8]   Intrinsically disordered protein [J].
Dunker, AK ;
Lawson, JD ;
Brown, CJ ;
Williams, RM ;
Romero, P ;
Oh, JS ;
Oldfield, CJ ;
Campen, AM ;
Ratliff, CR ;
Hipps, KW ;
Ausio, J ;
Nissen, MS ;
Reeves, R ;
Kang, CH ;
Kissinger, CR ;
Bailey, RW ;
Griswold, MD ;
Chiu, M ;
Garner, EC ;
Obradovic, Z .
JOURNAL OF MOLECULAR GRAPHICS & MODELLING, 2001, 19 (01) :26-59
[9]   Coupling of folding and binding for unstructured proteins [J].
Dyson, HJ ;
Wright, PE .
CURRENT OPINION IN STRUCTURAL BIOLOGY, 2002, 12 (01) :54-60
[10]  
Flory P J., PRINCIPLES POLYM CHE