MODBASE, a database of annotated comparative protein structure models

Cited by: 81
Authors
Pieper, U [1]
Eswar, N [1]
Stuart, AC [1]
Ilyin, VA [1]
Sali, A [1]
Affiliation
[1] Rockefeller Univ, Pels Family Ctr Biochem & Struct Biol, Labs Mol Biophys, New York, NY 10021 USA
Keywords
DOI
10.1093/nar/30.1.255
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
MODBASE (http://guitar.rockefeller.edu/modbase) is a relational database of annotated comparative protein structure models for all available protein sequences matched to at least one known protein structure. The models are calculated by MODPIPE, an automated modeling pipeline that relies on PSI-BLAST, IMPALA and MODELLER. MODBASE uses the MySQL relational database management system for flexible and efficient querying, and the MODVIEW Netscape plugin for viewing and manipulating multiple sequences and structures. It is updated regularly to reflect the growth of the protein sequence and structure databases, as well as improvements in the software for calculating the models. For ease of access, MODBASE is organized into different datasets. The largest dataset contains models for domains in 304 517 out of 539 171 unique protein sequences in the complete TrEMBL database (23 March 2001); only models based on significant alignments (PSI-BLAST E-value < 10^-4) and models assessed to have the correct fold are included. Other datasets include models for target selection and structure-based annotation by the New York Structural Genomics Research Consortium, models for prediction of genes in the Drosophila melanogaster genome, models for structure determination of several ribosomal particles and models calculated by the MODWEB comparative modeling web server.
Pages: 255-259
Page count: 5