domain architecture;
protein sequence;
protein structure;
structural genomics;
STRUCTURE PREDICTION;
ALPHA-LACTALBUMIN;
FAMILIES;
SEQUENCES;
DOMAINS;
BIOLOGY;
TASSER;
PFAM;
DOI:
10.1073/pnas.0905029106
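Chinese Library Classification (CLC):
O [Mathematical sciences and chemistry];
P [Astronomy and earth sciences];
Q [Biological sciences];
N [General natural sciences];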
Subject classification codes:
07 ;
0710 ;
09 ;
Abstract:
The protein universe is the set of all proteins of all organisms. Here, all currently known sequences are analyzed in terms of families that have single-domain or multidomain architectures and whether they have a known three-dimensional structure. Growth of new single-domain families is very slow: Almost all growth comes from new multidomain architectures that are combinations of domains characterized by ≈15,000 sequence profiles. Single-domain families are mostly shared by the major groups of organisms, whereas multidomain architectures are specific and account for species diversity. There are known structures for a quarter of the single-domain families, and >70% of all sequences can be partially modeled thanks to their membership in these families.
Affiliation: Univ Calif Berkeley, Lawrence Berkeley Lab, Berkeley Struct Genom Ctr, Phys Biosci Div, Berkeley, CA 94720 USA
Chandonia, JM
;
Brenner, SE