InterPro - an integrated documentation resource for protein families, domains and functional sites

Cited by: 204
Authors
Apweiler, R
Attwood, TK
Bairoch, A
Bateman, A
Birney, E
Biswas, M
Bucher, P
Cerutti, L
Corpet, F
Croning, MDR
Durbin, R
Falquet, L
Fleischmann, W
Gouzy, J
Hermjakob, H
Hulo, N
Jonassen, I
Kahn, D
Kanapin, A
Karavidopoulou, Y
Lopez, R
Marx, B
Mulder, NJ
Oinn, TM
Pagni, M
Servant, F
Sigrist, CJA
Zdobnov, EM
Affiliations
[1] European Bioinformat Inst, EMBL Outstn, Cambridge, England
[2] Univ Manchester, Sch Biol Sci, Manchester, Lancs, England
[3] Swiss Inst Bioinformat, Geneva, Switzerland
[4] Sanger Ctr, Cambridge, England
[5] Swiss Inst Expt Canc Res, Lausanne, Switzerland
[6] INRA, CNRS, F-31931 Toulouse, France
[7] Univ Bergen, Dept Informat, HIB, N-5008 Bergen, Norway
DOI
10.1093/bioinformatics/16.12.1145
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Motivation: InterPro is a new integrated documentation resource for protein families, domains and functional sites, developed initially as a means of rationalising the complementary efforts of the PROSITE, PRINTS, Pfam and ProDom database projects. Results: Merged annotations from PRINTS, PROSITE and Pfam form the InterPro core. Each combined InterPro entry includes functional descriptions and literature references, and links are made back to the relevant parent database(s), allowing users to see at a glance whether a particular family or domain has associated patterns, profiles, fingerprints, etc. Merged and individual entries (i.e. those that have no counterpart in the companion resources) are assigned unique accession numbers. Release 1.2 of InterPro (June 2000) contains over 3000 entries, representing families, domains, repeats and sites of post-translational modification (PTMs) encoded by 6581 different regular expressions, profiles, fingerprints and Hidden Markov Models (HMMs). Each InterPro entry lists all the matches against SWISS-PROT and TrEMBL (more than 1 000 000 hits from 264 333 different proteins out of 384 572 in SWISS-PROT and TrEMBL).
Pages: 1145-1150
Page count: 6
Related papers
14 in total
[1]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[2]   PRINTS-S: the database formerly known as PRINTS [J].
Attwood, TK ;
Croning, MDR ;
Flower, DR ;
Lewis, AP ;
Mabey, JE ;
Scordis, P ;
Selley, JN ;
Wright, W .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :225-227
[3]   The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000 [J].
Bairoch, A ;
Apweiler, R .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :45-48
[4]  
Bateman A, 2004, NUCLEIC ACIDS RES, V32, pD138, DOI [10.1093/nar/gkp985, 10.1093/nar/gkh121, 10.1093/nar/gkr1065]
[5]   ProDom and ProDom-CG: tools for protein domain analysis and whole genome comparisons [J].
Corpet, F ;
Servant, F ;
Gouzy, J ;
Kahn, D .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :267-269
[6]  
Etzold T, 1996, METHOD ENZYMOL, V266, P114
[7]   A novel method for automatic functional annotation of proteins [J].
Fleischmann, W ;
Möller, S ;
Gateau, A ;
Apweiler, R .
BIOINFORMATICS, 1999, 15 (03) :228-233
[8]   Increased coverage of protein families with the Blocks Database servers [J].
Henikoff, JG ;
Greene, EA ;
Pietrokovski, S ;
Henikoff, S .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :228-230
[9]   The PROSITE database, its status in 1999 [J].
Hofmann, K ;
Bucher, P ;
Falquet, L ;
Bairoch, A .
NUCLEIC ACIDS RESEARCH, 1999, 27 (01) :215-219
[10]  
Jonassen I, 1997, COMPUT APPL BIOSCI, V13, P509