Determining the minimum number of types necessary to represent the sizes of protein atoms

Cited by: 14
Authors
Tsai, J
Voss, N
Gerstein, M
Affiliations
[1] Yale Univ, Bass Ctr, Dept Biochem & Mol Biophys, New Haven, CT 06520 USA
[2] Texas A&M Univ, Dept Biochem & Biophys, College Stn, TX 77843 USA
DOI
10.1093/bioinformatics/17.10.949
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Traditionally, for packing calculations people have collected atoms together into a number of distinct 'types'. These, in fact, often represent a heavy atom and its associated hydrogens (i.e. a united atom). Also, atom typing is usually done according to basic chemistry, giving rise to 20-30 protein atom types, such as carbonyl carbons, methyl groups, and hydroxyl groups. No one has yet investigated how similar in packing these chemically derived types are. Here we address this question in detail, using Voronoi volume calculations on a set of high-resolution crystal structures.
Results: We perform a rigorous clustering analysis with cross-validation on tens of thousands of atom volumes and attempt to compile them into types based purely on packing. From our analysis, we are able to determine a 'minimal' set of 18 atom types that most efficiently represent the spectrum of packing in proteins. Furthermore, we are able to uncover a number of inconsistencies in traditional chemical typing schemes, where differently typed atoms have almost the same effective size. In particular, we find that tetrahedral carbons with two hydrogens are almost identical in size to many aromatic carbons with a single hydrogen.
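The idea of compiling atom types "based purely on packing" can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a simple greedy centroid-linkage merge on mean volume, omits the cross-validation step, and uses a hypothetical function name (`merge_types_by_volume`) with toy numbers that are not measurements from the paper.

```python
from statistics import mean


def merge_types_by_volume(volumes_by_type, n_clusters):
    """Greedily merge atom types whose mean volumes are closest.

    volumes_by_type: dict mapping a type name to a list of atom volumes.
    n_clusters: number of merged types to stop at.
    Returns a list of clusters, each a sorted list of type names,
    ordered from smallest to largest mean volume.
    """
    # Each cluster holds (type names, pooled volumes).
    clusters = [([name], list(vols)) for name, vols in volumes_by_type.items()]
    # In one dimension the closest pair of centroids is always adjacent
    # once the clusters are sorted by mean volume.
    clusters.sort(key=lambda c: mean(c[1]))
    while len(clusters) > n_clusters:
        gaps = [
            mean(clusters[i + 1][1]) - mean(clusters[i][1])
            for i in range(len(clusters) - 1)
        ]
        i = gaps.index(min(gaps))
        merged = (
            clusters[i][0] + clusters[i + 1][0],
            clusters[i][1] + clusters[i + 1][1],
        )
        # The pooled mean lies between the two old means, so order is kept.
        clusters[i:i + 2] = [merged]
    return [sorted(names) for names, _ in clusters]


# Toy volumes (arbitrary illustrative values, NOT data from the paper).
toy_volumes = {
    "type_A": [10.0, 10.2, 9.8],
    "type_B": [10.1, 10.3, 10.2],
    "type_C": [15.0, 15.4, 15.2],
    "type_D": [30.0, 29.0, 31.0],
}
merged_types = merge_types_by_volume(toy_volumes, 3)
```

With these toy values, the two types with nearly identical mean volumes (`type_A` and `type_B`) are merged first, which mirrors the kind of inconsistency the abstract reports for chemically distinct types of almost the same effective size.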
Pages: 949-956
Number of pages: 8