TIGRFAMs and Genome Properties in 2013

Cited by: 396
Authors
Haft, Daniel H. [1 ]
Selengut, Jeremy D. [1 ]
Richter, Roland A. [2 ]
Harkins, Derek [1 ]
Basu, Malay K. [1 ]
Beck, Erin [1 ]
Affiliations
[1] J Craig Venter Inst, Rockville, MD 20850 USA
[2] J Craig Venter Inst, La Jolla, CA 92121 USA
Funding
National Institutes of Health (USA)
Keywords
PROTEIN; DATABASE; FAMILY;
DOI
10.1093/nar/gks1234
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
TIGRFAMs, available online at http://www.jcvi.org/tigrfams, is a database of protein family definitions. Each entry features a seed alignment of trusted representative sequences, a hidden Markov model (HMM) built from that alignment, cutoff scores that let automated annotation pipelines decide which proteins are members, and annotations for transfer onto member proteins. Most TIGRFAMs models are designated equivalog, meaning they assign a specific name to proteins conserved in function from a common ancestral sequence. Models describing more functionally heterogeneous families are designated subfamily or domain, and assign less specific but more widely applicable annotations. The Genome Properties database, available at http://www.jcvi.org/genome-properties, specifies how computed evidence, including TIGRFAMs HMM results, should be used to judge whether an enzymatic pathway, a protein complex or another type of molecular subsystem is encoded in a genome. TIGRFAMs and Genome Properties content are developed in concert because subsystems reconstruction for large numbers of genomes guides selection of seed alignment sequences and cutoff values during protein family construction. Both databases specialize heavily in bacterial and archaeal subsystems. At present, 4284 models appear in TIGRFAMs, while 628 systems are described by Genome Properties. Content derives both from subsystem discovery work and from biocuration of the scientific literature.
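The two decisions described in the abstract (a cutoff score deciding family membership, then model hits serving as evidence for a subsystem) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the databases: the accessions `MODEL_A` to `MODEL_C`, the scores and cutoffs, the `Step` layout, and the simplified YES / PARTIAL / NO rule. It is not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    accession: str         # hypothetical identifier, not a real TIGRFAMs entry
    trusted_cutoff: float  # assumed bit score at or above which a hit is a member


@dataclass(frozen=True)
class Step:
    name: str
    required: bool
    evidence: frozenset  # model accessions; any one of them satisfies the step


def member_models(hits, models):
    """Return accessions of models with at least one hit meeting the cutoff."""
    found = set()
    for accession, score in hits:
        model = models.get(accession)
        if model is not None and score >= model.trusted_cutoff:
            found.add(accession)
    return found


def evaluate_property(steps, found):
    """Simplified rule (an assumption): judge only the required steps."""
    required = [s for s in steps if s.required]
    satisfied = [s for s in required if s.evidence & found]
    if required and len(satisfied) == len(required):
        return "YES"
    if satisfied:
        return "PARTIAL"
    return "NO"


# Illustrative inputs only: (model accession, bit score) pairs for one genome.
models = {
    "MODEL_A": Model("MODEL_A", 250.0),
    "MODEL_B": Model("MODEL_B", 120.0),
    "MODEL_C": Model("MODEL_C", 80.0),
}
hits = [("MODEL_A", 310.5), ("MODEL_B", 95.0), ("MODEL_C", 80.0)]

steps = [
    Step("step 1", True, frozenset({"MODEL_A"})),
    Step("step 2", True, frozenset({"MODEL_B"})),
    Step("step 3", False, frozenset({"MODEL_C"})),
]

found = member_models(hits, models)
state = evaluate_property(steps, found)
print(sorted(found), state)
```

In this made-up genome the hit to `MODEL_B` falls below its cutoff, so one required step lacks evidence and the sketch reports the subsystem as only partially supported.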
Pages: D387-D395
Page count: 9