Inferring function using patterns of native disorder in proteins

Cited by: 101
Authors
Lobley, Anna
Swindells, Mark B.
Orengo, Christine A.
Jones, David T. [1]
Affiliations
[1] UCL, Dept Comp Sci, Bioinformat Unit, London, England
[2] Inpharm, London, England
[3] UCL, Dept Biochem, Biocomp Grp, London, England
DOI
10.1371/journal.pcbi.0030162
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Natively unstructured regions are a common feature of eukaryotic proteomes. Between 30% and 60% of proteins are predicted to contain long stretches of disordered residues, and not only have many of these regions been confirmed experimentally, but they have also been found to be essential for protein function. In this study, we directly address the potential contribution of protein disorder in predicting protein function using standard Gene Ontology (GO) categories. Initially we analyse the occurrence of protein disorder in the human proteome and report ontology categories that are enriched in disordered proteins. Pattern analysis of the distributions of disordered regions in human sequences demonstrated that the functions of intrinsically disordered proteins are both length- and position-dependent. These dependencies were then encoded in feature vectors to quantify the contribution of disorder in human protein function prediction using Support Vector Machine classifiers. The prediction accuracies of 26 GO categories relating to signalling and molecular recognition are improved using the disorder features. The most significant improvements were observed for kinase, phosphorylation, growth factor, and helicase categories. Furthermore, we provide predicted GO term assignments using these classifiers for a set of unannotated and orphan human proteins. In this study, the importance of capturing protein disorder information and its value in function prediction is demonstrated. The GO category classifiers generated can be used to provide more reliable predictions and further insights into the behaviour of orphan and unannotated proteins.
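The abstract says that length- and position-dependent disorder patterns were encoded in feature vectors for classification. As a rough illustration of what such an encoding could look like, the sketch below turns per-residue disorder flags into a small numeric vector. The length bins, the number of position bins, and the function names are all hypothetical choices made for this example; the record does not state the encoding the authors actually used.

```python
from typing import List, Sequence, Tuple

# Hypothetical length bins in residues (half-open); the last bin is open-ended.
# These boundaries are illustrative assumptions, not values from the paper.
LENGTH_BINS = ((1, 30), (30, 50), (50, None))
# Hypothetical number of equal-width sequence segments (N- to C-terminus).
N_POSITION_BINS = 4


def disordered_segments(flags: Sequence[bool]) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index pairs for each run of disordered residues."""
    segments = []
    start = None
    for i, is_disordered in enumerate(flags):
        if is_disordered and start is None:
            start = i
        elif not is_disordered and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(flags)))
    return segments


def disorder_features(flags: Sequence[bool]) -> List[float]:
    """Encode disorder as [count per length bin..., disordered fraction per position bin...]."""
    n = len(flags)
    if n == 0:
        raise ValueError("sequence must contain at least one residue")

    # Length dependence: count disordered regions falling in each length bin.
    length_counts = [0] * len(LENGTH_BINS)
    for start, end in disordered_segments(flags):
        length = end - start
        for k, (low, high) in enumerate(LENGTH_BINS):
            if length >= low and (high is None or length < high):
                length_counts[k] += 1
                break

    # Position dependence: fraction of disordered residues in each sequence segment.
    position_fractions = []
    for b in range(N_POSITION_BINS):
        low = b * n // N_POSITION_BINS
        high = (b + 1) * n // N_POSITION_BINS
        window = flags[low:high]
        position_fractions.append(sum(window) / len(window) if window else 0.0)

    return [float(c) for c in length_counts] + position_fractions


# Toy 100-residue protein: a 40-residue disordered N-terminus and a short 5-residue region.
toy_flags = [True] * 40 + [False] * 40 + [True] * 5 + [False] * 15
toy_vector = disorder_features(toy_flags)
```

A vector of this kind could be combined with other sequence-derived features and passed to any Support Vector Machine implementation, training one binary classifier per GO category. The per-residue flags themselves would come from a disorder predictor, which is outside the scope of this sketch.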
Pages: 1567-1579
Page count: 13