The SUPERFAMILY database in 2007: families and functions

Citations: 175
Authors
Wilson, Derek
Madera, Martin
Vogel, Christine
Chothia, Cyrus
Gough, Julian
Affiliations
[1] MRC, Mol Biol Lab, Cambridge CB2 2QH, England
[2] Univ Calif Santa Cruz, Santa Cruz, CA 95064 USA
[3] Univ Texas, Inst Cellular & Mol Biol, Austin, TX 78712 USA
[4] Inst Pasteur, Unite Bioinformat Struct, F-75724 Paris, France
Funding
UK Medical Research Council
DOI
10.1093/nar/gkl910
Chinese Library Classification (CLC) codes
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
The SUPERFAMILY database provides protein domain assignments, at the SCOP 'superfamily' level, for the predicted protein sequences in over 400 completed genomes. A superfamily groups together domains of different families which have a common evolutionary ancestor based on structural, functional and sequence data. SUPERFAMILY domain assignments are generated using an expert-curated set of profile hidden Markov models. All models and structural assignments are available for browsing and download from http://supfam.org. The web interface includes services such as domain architectures and alignment details for all protein assignments, searchable domain combinations, domain occurrence network visualization, detection of over- or under-represented superfamilies for a given genome by comparison with other genomes, assignment of manually submitted sequences and keyword searches. In this update we describe the SUPERFAMILY database and outline two major developments: (i) incorporation of family-level assignments and (ii) a superfamily-level functional annotation. The SUPERFAMILY database can be used for general protein evolution and superfamily-specific studies, genomic annotation, and structural genomics target suggestion and assessment.
Pages: D308-D313
Page count: 6