PASS2: an automated database of protein alignments organised as structural superfamilies

Cited by: 31
Authors
Bhaduri, A [1 ]
Pugalenthi, G [1 ]
Sowdhamini, R [1 ]
Affiliations
[1] Tata Inst Fundamental Res, Natl Ctr Biol Sci, Bangalore 560065, Karnataka, India
Keywords
DOI
10.1186/1471-2105-5-35
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
Background: The functional selection and three-dimensional structural constraints of proteins in nature often relate to the retention of significant sequence similarity between proteins of similar fold and function despite poor sequence identity. Organization of structure-based sequence alignments for distantly related proteins provides a map of the conserved and critical regions of the protein universe that is useful for the analysis of folding principles, for the evolutionary unification of protein families, and for maximizing the information return from experimental structure determination. The Protein Alignment organised as Structural Superfamily (PASS2) database represents continuously updated structural alignments for evolutionarily related, sequentially distant proteins.
Description: An automated and updated version of PASS2 is in direct correspondence with SCOP 1.63 and consists of sequences having identity below 40% among themselves. Protein domains have been grouped into 628 multi-member superfamilies and 566 single-member superfamilies. Structure-based sequence alignments for the superfamilies have been obtained using COMPARER, while initial equivalences have been derived from a preliminary superposition using LSQMAN or STAMP 4.0. The final sequence alignments have been annotated for structural features using JOY4.0. The database is supplemented with sequence relatives belonging to different genomes, conserved spatially interacting and structural motifs, probabilistic hidden Markov models of superfamilies based on the alignments, and useful links to other databases. Probabilistic models and sensitive position-specific profiles obtained from reliable superfamily alignments aid annotation of remote homologues and are useful tools in structural and functional genomics. PASS2 presents the phylogeny of its members based on both sequence and structural dissimilarities. Clustering of members allows us to understand diversification of the family members. The search engine has been improved for simpler browsing of the database.
Conclusions: The database resolves alignments among the structural domains consisting of evolutionarily diverged sets of sequences. The availability of reliable sequence alignments of distantly related proteins, despite poor sequence identity, and of single-member superfamilies permits better sampling of structures in libraries for fold recognition of new sequences and for the understanding of protein structure-function relationships of individual superfamilies. PASS2 is accessible at http://www.ncbs.res.in/~faculty/mini/campass/pass2.html.
Pages: 7
Related papers
33 in total
[1]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[2]   PROTEIN DATA BANK - COMPUTER-BASED ARCHIVAL FILE FOR MACROMOLECULAR STRUCTURES [J].
BERNSTEIN, FC ;
KOETZLE, TF ;
WILLIAMS, GJB ;
MEYER, EF ;
BRICE, MD ;
RODGERS, JR ;
KENNARD, O ;
SHIMANOUCHI, T ;
TASUMI, M .
JOURNAL OF MOLECULAR BIOLOGY, 1977, 112 (03) :535-542
[3]   Conserved spatially interacting motifs of protein superfamilies: Application to fold recognition and function annotation of genome data [J].
Bhaduri, A ;
Ravishankar, R ;
Sowdhamini, R .
PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2004, 54 (04) :657-670
[4]   INSULIN-LIKE GROWTH-FACTOR - MODEL FOR TERTIARY STRUCTURE ACCOUNTING FOR IMMUNOREACTIVITY AND RECEPTOR-BINDING [J].
BLUNDELL, TL ;
BEDARKAR, S ;
RINDERKNECHT, E ;
HUMBEL, RE .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1978, 75 (01) :180-184
[5]   SMoS: a database of structural motifs of protein superfamilies [J].
Chakrabarti, S ;
Venkatramanan, K ;
Sowdhamini, R .
PROTEIN ENGINEERING, 2003, 16 (11) :791-793
[6]   Multiple sequence alignment with the Clustal series of programs [J].
Chenna, R ;
Sugawara, H ;
Koike, T ;
Lopez, R ;
Gibson, TJ ;
Higgins, DG ;
Thompson, JD .
NUCLEIC ACIDS RESEARCH, 2003, 31 (13) :3497-3500
[7]  
CHOTHIA C, 1984, ANNU REV BIOCHEM, V53, P537
[8]   Profile hidden Markov models [J].
Eddy, SR .
BIOINFORMATICS, 1998, 14 (09) :755-763
[9]   Assignment of homology to genome sequences using a library of hidden Markov models that represent all proteins of known structure [J].
Gough, J ;
Karplus, K ;
Hughey, R ;
Chothia, C .
JOURNAL OF MOLECULAR BIOLOGY, 2001, 313 (04) :903-919
[10]  
HOLM L, 1994, NUCLEIC ACIDS RES, V22, P3600