Evolution of protein structural classes and protein sequence families

Cited by: 61
Authors
Choi, In-Geol
Kim, Sung-Hou [1]
Affiliations
[1] Univ Calif Berkeley, Phys Biosci Div, Lawrence Berkeley Natl Lab, Berkeley, CA 94720 USA
[2] Univ Calif Berkeley, Dept Chem, Berkeley, CA 94720 USA
Keywords
protein fold classes; common structural ancestor; evolutionary age; protein structure universe; CLASSIFICATION; FOLDS; MODEL; BETA; OLD;
DOI
10.1073/pnas.0606239103
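Chinese Library Classification codes
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract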
In protein structure space, protein structures cluster into four elongated regions when mapped based solely on similarity among the 3D structures. These four regions correspond to the four major classes of present-day proteins defined by the contents of secondary structure types and their topological arrangement. Evolution of and restriction to these four classes suggest that, in most cases, the evolution of genes may have been constrained or selected to those genetic changes that result in structurally stable proteins occupying one of the four "allowed" regions of the protein structure space, "structural selection," an important component of natural selection in gene evolution. Our studies on tracing the "common structural ancestor" for each protein sequence family of known structure suggest that: (i) recently emerged proteins belong mostly to three classes; (ii) the proteins that emerged earlier evolved to gain a new class; and (iii) the proteins that emerged earliest evolved to become the present-day proteins in the four major classes, with the fourth-class proteins becoming the most dominant population. Furthermore, our studies also show that not all present-day proteins evolved from one single set of proteins in the last common ancestral organism, but new common ancestral proteins were "born" at different evolutionary times, not traceable to one or two ancestral proteins: "the multiple birth model" for the evolution of protein sequence families.
Pages: 14056-14061
Page count: 6