The SUPERFAMILY database in 2007: families and functions

Cited by: 175
Authors
Wilson, Derek
Madera, Martin
Vogel, Christine
Chothia, Cyrus
Gough, Julian
Affiliations
[1] MRC, Mol Biol Lab, Cambridge CB2 2QH, England
[2] Univ Calif Santa Cruz, Santa Cruz, CA 95064 USA
[3] Univ Texas, Inst Cellular & Mol Biol, Austin, TX 78712 USA
[4] Inst Pasteur, Unite Bioinformat Struct, F-75724 Paris, France
Funding
UK Medical Research Council
DOI
10.1093/nar/gkl910
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification code
071010; 081704
Abstract
The SUPERFAMILY database provides protein domain assignments, at the SCOP 'superfamily' level, for the predicted protein sequences in over 400 completed genomes. A superfamily groups together domains from different families that share a common evolutionary ancestor, as inferred from structural, functional and sequence data. SUPERFAMILY domain assignments are generated using an expert-curated set of profile hidden Markov models. All models and structural assignments are available for browsing and download from http://supfam.org. The web interface includes services such as domain architectures and alignment details for all protein assignments, searchable domain combinations, domain occurrence network visualization, detection of over- or under-represented superfamilies for a given genome by comparison with other genomes, assignment of manually submitted sequences, and keyword searches. In this update we describe the SUPERFAMILY database and outline two major developments: (i) incorporation of family-level assignments and (ii) a superfamily-level functional annotation. The SUPERFAMILY database can be used for general protein evolution and superfamily-specific studies, genomic annotation, and structural genomics target suggestion and assessment.
Pages: D308-D313
Number of pages: 6