EVEREST: a collection of evolutionary conserved protein domains

Cited by: 14
Authors
Portugaly, Elon [1]
Linial, Nathan
Linial, Michal
Affiliations
[1] Hebrew Univ Jerusalem, Sch Comp Sci & Engn, IL-91905 Jerusalem, Israel
[2] Hebrew Univ Jerusalem, Dept Biol Chem, Inst Life Sci, IL-91905 Jerusalem, Israel
DOI
10.1093/nar/gkl850
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Protein domains are subunits of proteins that recur throughout the protein world. Many definitions attempt to capture the essence of a protein domain, and several systems identify protein domains and classify them into families. EVEREST, recently described in Portugaly et al. (2006) BMC Bioinformatics, 7, 277, is one such system; it performs the task automatically, using protein sequence alone. Here we describe EVEREST release 2.0, which consists of 20 029 families, each defined by one or more HMMs. The current EVEREST database was constructed by scanning UniProt 8.1 and all PDB sequences (over 3 000 000 sequences in total) with each of the EVEREST families. EVEREST annotates 64% of all sequences and covers 59% of all residues. EVEREST is available at http://www.everest.cs.huji.ac.il. The website provides annotations given by SCOP, CATH, Pfam A and EVEREST. It supports browsing the families of each of these sources and graphically visualizes the domain organization of the proteins in each family. The website also provides access to analyses of relationships between domain families, within and across domain definition systems. Users can upload sequences for analysis against the set of EVEREST families. Finally, an advanced search form allows querying for families that match criteria regarding novelty, phylogenetic composition and more.
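The two coverage figures in the abstract (share of sequences annotated, share of residues covered) are different measures, and a small sketch may make the distinction concrete. The code below is an illustration only, not the EVEREST pipeline: the function names, the toy data, and the choice to count a residue once when domain hits overlap are all assumptions, since the abstract does not state how overlaps are handled.

```python
from typing import Dict, List, Tuple

# A domain hit as a 1-based, inclusive (start, end) residue range.
# Hypothetical representation, not an EVEREST data format.
Hit = Tuple[int, int]


def covered_residues(hits: List[Hit]) -> int:
    """Count residues covered by at least one hit (overlaps counted once)."""
    total = 0
    current_end = 0  # last residue position already counted
    for start, end in sorted(hits):
        start = max(start, current_end + 1)  # skip residues already counted
        if end >= start:
            total += end - start + 1
            current_end = end
    return total


def coverage(
    lengths: Dict[str, int], hits: Dict[str, List[Hit]]
) -> Tuple[float, float]:
    """Return (fraction of sequences with a hit, fraction of residues covered)."""
    annotated = sum(1 for seq_id in lengths if hits.get(seq_id))
    residues = sum(covered_residues(hits.get(seq_id, [])) for seq_id in lengths)
    return annotated / len(lengths), residues / sum(lengths.values())


# Toy input (made up for illustration): three sequences, two of them annotated.
lengths = {"P1": 100, "P2": 200, "P3": 100}
hits = {"P1": [(1, 50), (40, 80)], "P2": [(101, 200)]}
seq_cov, res_cov = coverage(lengths, hits)
print(f"sequences annotated: {seq_cov:.1%}, residues covered: {res_cov:.1%}")
```

In this toy case two of three sequences carry at least one hit, while fewer than half of the residues fall inside a hit, which shows why the sequence-level figure can exceed the residue-level one.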
Pages: D241-D246
Number of pages: 6