Supra-domains: Evolutionary units larger than single protein domains

Cited by: 127
Authors
Vogel, C
Berzuini, C
Bashton, M
Gough, J
Teichmann, SA
Affiliations
[1] MRC, Mol Biol Lab, Cambridge CB2 2QH, England
[2] MRC, Inst Publ Hlth, Biostat Unit, Cambridge CB2 2SR, England
[3] Univ Pavia, Dipartimento Informat & Sistemist, I-27100 Pavia, Italy
[4] RIKEN, Genome Explorat Res Grp, Genom Sci Ctr, Tsurumi Ku, Yokohama, Kanagawa 2300045, Japan
[5] Dept Biol Struct, Stanford, CA 94305 USA
Keywords
domain combination; protein family; domain architecture; multi-domain protein; functional annotation;
DOI
10.1016/j.jmb.2003.12.026
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Domains are the evolutionary units that comprise proteins, and most proteins are built from more than one domain. Domains can be shuffled by recombination to create proteins with new arrangements of domains. Using structural domain assignments, we examined the combinations of domains in the proteins of 131 completely sequenced organisms. We found two-domain and three-domain combinations that recur in different protein contexts with different partner domains. The domains within these combinations have a particular functional and spatial relationship. These units are larger than individual domains and we term them "supra-domains". Amongst the supra-domains, we identified some 1400 (1203 two-domain and 166 three-domain) combinations that are statistically significantly over-represented relative to the occurrence and versatility of the individual component domains. Over one-third of all structurally assigned multi-domain proteins contain these over-represented supra-domains. This means that investigation of the structural and functional relationships of the domains forming these popular combinations would be particularly useful for an understanding of multi-domain protein function and evolution as well as for genome annotation. These and other supra-domains were analysed for their versatility, duplication, their distribution across the three kingdoms of life and their functional classes. By examining the three-dimensional structures of several examples of supra-domains in different biological processes, we identify two basic types of spatial relationships between the component domains: the combined function of the two domains is such that either the geometry of the two domains is crucial and there is a tight constraint on the interface, or the precise orientation of the domains is less important and they are spatially separate. Frequently, the role of the supra-domain becomes clear only once the three-dimensional structure is known. Since this is the case for only a quarter of the supra-domains, we provide a list of the most important unknown supra-domains as potential targets for structural genomics projects. © 2004 Elsevier Ltd. All rights reserved.
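The abstract's working definition of a supra-domain (a domain combination that recurs in different protein contexts with different partner domains) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the domain labels, the `min_partners` threshold and the choice to treat combinations as ordered, adjacent pairs are all assumptions made for illustration, and the statistical over-representation test mentioned in the abstract is not implemented here.

```python
from collections import defaultdict


def recurring_pairs(architectures, min_partners=2):
    """Find adjacent domain pairs that occur next to several distinct partner domains.

    `architectures` is a list of proteins, each given as a list of domain
    labels in N-to-C order (hypothetical input format).
    """
    partners = defaultdict(set)
    for arch in architectures:
        for i in range(len(arch) - 1):
            pair = (arch[i], arch[i + 1])
            partners[pair]  # register the pair even if it has no neighbours
            if i > 0:
                # domain immediately N-terminal to the pair
                partners[pair].add(("N-term", arch[i - 1]))
            if i + 2 < len(arch):
                # domain immediately C-terminal to the pair
                partners[pair].add(("C-term", arch[i + 2]))
    return {p: s for p, s in partners.items() if len(s) >= min_partners}


# Hypothetical domain architectures; the labels A to E are placeholders.
architectures = [
    ["A", "B", "C"],
    ["D", "A", "B"],
    ["A", "B"],
    ["C", "E"],
]
print(recurring_pairs(architectures))
```

In this toy input only the pair `("A", "B")` appears beside two different partner domains, so it is the only candidate returned. A real analysis would additionally have to test whether such a pair is more frequent than expected from the abundance and versatility of its component domains.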
Pages: 809-823
Page count: 15
Related papers
36 in total
[1]   The structure of elongation factor G in complex with GDP: Conformational flexibility and nucleotide exchange [J].
Al-Karadaghi, S ;
Ævarsson, A ;
Garber, M ;
Zheltonosova, J ;
Liljas, A .
STRUCTURE, 1996, 4 (05) :555-565
[2]   Domain combinations in archaeal, eubacterial and eukaryotic proteomes [J].
Apic, G ;
Gough, J ;
Teichmann, SA .
JOURNAL OF MOLECULAR BIOLOGY, 2001, 310 (02) :311-325
[3]   Apic, G, 2003, Journal of Structural and Functional Genomics, V4, P67, DOI 10.1023/A:1026113408773
[4]   The geometry of domain combination in proteins [J].
Bashton, M ;
Chothia, C .
JOURNAL OF MOLECULAR BIOLOGY, 2002, 315 (04) :927-939
[5]   Bateman, A, 2004, NUCLEIC ACIDS RESEARCH, V32, pD138, DOI 10.1093/nar/gkh121
[6]   Crystal structure of rat short chain acyl-CoA dehydrogenase complexed with acetoacetyl-CoA - Comparison with other acyl-CoA dehydrogenases [J].
Battaile, KP ;
Molin-Case, J ;
Paschke, R ;
Wang, M ;
Bennett, D ;
Vockley, J ;
Kim, JJP .
JOURNAL OF BIOLOGICAL CHEMISTRY, 2002, 277 (14) :12200-12207
[7]   Controlling the false discovery rate: A practical and powerful approach to multiple testing [J].
Benjamini, Y ;
Hochberg, Y .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 1995, 57 (01) :289-300
[8]   Structure of CheA, a signal-transducing histidine kinase [J].
Bilwes, AM ;
Alex, LA ;
Crane, BR ;
Simon, MI .
CELL, 1999, 96 (01) :131-141
[9]   Enhanced protein domain discovery by using language modeling techniques from speech recognition [J].
Coin, L ;
Bateman, A ;
Durbin, R .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2003, 100 (08) :4516-4520
[10]   X-ray structure of human beta(3)beta(3) alcohol dehydrogenase - The contribution of ionic interactions to coenzyme binding [J].
Davis, GJ ;
Bosron, WF ;
Stone, CL ;
Owusu-Dekyi, K ;
Hurley, TD .
JOURNAL OF BIOLOGICAL CHEMISTRY, 1996, 271 (29) :17057-17061