iProClass: an integrated database of protein family, function and structure information

Cited by: 40
Authors
Huang, HZ
Barker, WC
Chen, YX
Wu, CH
Affiliations
[1] Georgetown Univ, Med Ctr, Dept Biochem & Mol Biol, Washington, DC 20057 USA
[2] Georgetown Univ, Med Ctr, Natl Biomed Res Fdn, Washington, DC 20057 USA
Funding
National Science Foundation (USA)
DOI
10.1093/nar/gkg044
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
The iProClass database provides comprehensive, value-added descriptions of proteins and serves as a framework for data integration in a distributed networking environment. The protein information in iProClass includes family relationships as well as structural and functional classifications and features. The current version consists of about 830 000 non-redundant PIR-PSD, SWISS-PROT, and TrEMBL proteins organized with more than 36 000 PIR superfamilies, 145 000 families, 4000 domains, 1300 motifs and 550 000 FASTA similarity clusters. It provides rich links to over 50 databases of protein sequences, families, functions and pathways, protein-protein interactions, post-translational modifications, protein expressions, structures and structural classifications, genes and genomes, ontologies, literature and taxonomy. Protein and superfamily summary reports present extensive annotation information and include membership statistics and graphical display of domains and motifs. iProClass employs an open and modular architecture for interoperability and scalability. It is implemented in the Oracle object-relational database system and is updated biweekly. The database is freely accessible from the web site at http://pir.georgetown.edu/iproclass/ and searchable by sequence or text string. The data integration in iProClass supports exploration of protein relationships. Such knowledge is fundamental to the understanding of protein evolution, structure and function and crucial to functional genomic and proteomic research.
Pages: 390-392
Page count: 3