Flavors of protein disorder

Cited by: 294
Authors
Vucetic, S
Brown, CJ
Dunker, AK
Obradovic, Z
Affiliations
[1] Temple Univ, Ctr Informat Sci & Technol, Philadelphia, PA 19122 USA
[2] Washington State Univ, Sch Mol Biosci, Pullman, WA 99164 USA
Source
PROTEINS-STRUCTURE FUNCTION AND GENETICS | 2003 / Vol. 52 / Issue 04
Keywords
secondary structure; structure prediction; genomics; sequence composition; signaling and regulatory proteins; unfolded proteins
DOI
10.1002/prot.10437
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Intrinsically disordered proteins are characterized by long regions lacking 3-D structure in their native states, yet they have been so far associated with 28 distinguishable functions. Previous studies showed that protein predictors trained on disorder from one type of protein often achieve poor accuracy on disorder of proteins of a different type, thus indicating significant differences in sequence properties among disordered proteins. Important biological problems are identifying different types, or flavors, of disorder and examining their relationships with protein function. Innovative use of computational methods is needed in addressing these problems due to relative scarcity of experimental data and background knowledge related to protein disorder. We developed an algorithm that partitions protein disorder into flavors based on competition among increasing numbers of predictors, with prediction accuracy determining both the number of distinct predictors and the partitioning of the individual proteins. Using 145 variously characterized proteins with long (>30 amino acids) disordered regions, 3 flavors, called V, C, and S, were identified by this approach, with the V subset containing 52 segments and 7743 residues, C containing 39 segments and 3402 residues, and S containing 54 segments and 5752 residues. The V, C, and S flavors were distinguishable by amino acid compositions, sequence locations, and biological function. For the sequences in SwissProt and 28 genomes, their protein functions exhibit correlations with the commonness and usage of different disorder flavors, suggesting different flavor-function sets across these protein groups. Overall, the results herein support the flavor-function approach as a useful complement to structural genomics as a means for automatically assigning possible functions to sequences. (C) 2003 Wiley-Liss, Inc.
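The competition idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything concrete in it is an assumption made for illustration: the predictor is a toy nearest-centroid classifier, each residue is a made-up one-dimensional feature vector, the number of flavors `k` is fixed in advance rather than chosen by prediction accuracy, and the starting assignment is round-robin. The names `train`, `accuracy`, and `partition` are hypothetical. The loop trains one predictor per flavor, scores every disordered segment with every predictor, and moves each segment to the predictor that recognizes it best, until the assignment stops changing.

```python
def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def sqdist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train(segments, ordered):
    """Toy predictor (assumption): centroids of disordered and ordered residues."""
    disordered = [res for seg in segments for res in seg]
    return mean(disordered), mean(ordered)


def accuracy(model, segment):
    """Fraction of a disordered segment's residues the model calls disordered."""
    dis_c, ord_c = model
    hits = sum(sqdist(r, dis_c) < sqdist(r, ord_c) for r in segment)
    return hits / len(segment)


def partition(segments, ordered, k, max_iters=20):
    """Assign each segment to the predictor that predicts it most accurately."""
    labels = [i % k for i in range(len(segments))]  # round-robin start (assumption)
    for _ in range(max_iters):
        models = []
        for flavor in range(k):
            subset = [s for s, lab in zip(segments, labels) if lab == flavor]
            models.append(train(subset, ordered) if subset else None)
        new_labels = []
        for seg, old in zip(segments, labels):
            scores = [accuracy(m, seg) if m else -1.0 for m in models]
            best = max(scores)
            # Keep the current flavor on ties so the assignment can settle.
            new_labels.append(old if scores[old] == best else scores.index(best))
        if new_labels == labels:
            break
        labels = new_labels
    return labels


# Made-up data: ordered residues near 0, two kinds of disorder near +3 and -3.
ordered = [[-0.2], [0.0], [0.2], [0.1], [-0.1]]
segments = [
    [[3.0], [3.2], [2.8]],
    [[2.9], [3.1]],
    [[3.3], [2.7], [3.0]],
    [[-3.0], [-3.1]],
    [[-2.8], [-3.2], [-3.0]],
    [[-2.9], [-3.1]],
]
labels = partition(segments, ordered, k=2)
print(labels)
```

On real sequence data, a predictor rarely scores a perfect 0 or 1 on a segment, so the outcome depends much more on the starting assignment and on the predictor family than this toy case suggests.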
Pages: 573-584
Page count: 12