Clustering of protein structural fragments reveals modular building block approach of nature

Cited by: 28
Authors
Tendulkar, AV
Joshi, AA
Sohoni, MA
Wangikar, PP [1]
Affiliations
[1] Indian Inst Technol, Kanwal Rekhi Sch Informat Technol, Bombay 400076, Maharashtra, India
[2] Indian Inst Technol, Dept Chem Engn, Bombay 400076, Maharashtra, India
[3] Indian Inst Technol, Dept Comp Sci & Engn, Bombay 400076, Maharashtra, India
Keywords
geometric invariants; protein structure comparison; secondary structure; loop
DOI
10.1016/j.jmb.2004.02.047
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Structures of peptide fragments drawn from a protein can potentially occupy a vast conformational continuum. We coordinatize this conformational space with the help of geometric invariants and demonstrate that the peptide conformations of the currently available protein structures are heavily biased in favor of a finite number of conformational types or structural building blocks. This is achieved by representing a peptide's backbone structure with geometric invariants and then clustering peptides based on the closeness of those invariants. This results in 12,903 clusters, of which 2,207 are made up of peptides drawn from functionally and/or structurally related proteins. These are termed "functional" clusters and provide clues about potential functional sites. The rest of the clusters, including the largest few, are made up of peptides drawn from unrelated proteins and are termed "structural" clusters. The largest clusters are of regular secondary structures such as helices and beta strands, as well as of beta hairpins. Several categories of helices and strands are discovered based on geometric differences. In addition to the known classes of loops, we discover several new classes, which will be useful in protein structure modeling. Our algorithm does not require assignment of secondary structure and therefore overcomes the limitations in loop classification caused by ambiguity in secondary structure assignment at loop boundaries. © 2004 Elsevier Ltd. All rights reserved.
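
The general idea in the abstract (describe each backbone fragment by quantities that do not change under rotation or translation, then group fragments whose descriptors are close) can be illustrated with a small sketch. The abstract does not state which invariants or which clustering procedure the authors use, so everything below is an assumption made for illustration: pairwise distances between residue positions stand in for the invariants, a greedy "leader" clustering stands in for the grouping step, and the function names `invariants`, `fragments`, and `cluster` as well as the tolerance value are hypothetical.

```python
import math
from itertools import combinations


def invariants(coords):
    """Pairwise distances between points.

    Assumed stand-in for the paper's geometric invariants: distances are
    unchanged by rotation and translation of the fragment.
    """
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def fragments(chain, length):
    """All sliding windows of `length` consecutive residues in a chain."""
    return [chain[i:i + length] for i in range(len(chain) - length + 1)]


def cluster(vectors, tol):
    """Greedy leader clustering (an assumed, simplified grouping step).

    Each vector joins the first cluster whose representative differs from it
    by at most `tol` in every component; otherwise it starts a new cluster.
    """
    reps, labels = [], []
    for v in vectors:
        for k, r in enumerate(reps):
            if max(abs(x - y) for x, y in zip(v, r)) <= tol:
                labels.append(k)
                break
        else:
            reps.append(v)
            labels.append(len(reps) - 1)
    return labels


# Toy four-residue fragments (coordinates in angstroms, made up for the demo).
straight = [(i * 3.8, 0.0, 0.0) for i in range(4)]
# Same shape as `straight`, rotated onto the y axis and translated.
moved = [(1.0, i * 3.8 + 2.0, 5.0) for i in range(4)]
# A different shape: four corners of a square.
bent = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]

labels = cluster([invariants(f) for f in (straight, moved, bent)], tol=0.5)
print(labels)  # [0, 0, 1]
```

The two copies of the straight fragment fall into one cluster even though their coordinates differ, while the bent fragment starts its own. Note that plain pairwise distances cannot tell a fragment from its mirror image, so a real implementation would likely need additional, handedness-sensitive invariants.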
Pages: 611-629
Number of pages: 19