Genome Annotation Transfer Utility (GATU): rapid annotation of viral genomes using a closely related reference genome

Cited by: 219
Authors
Tcherepanov, Vasily [1]
Ehlers, Angelika [1]
Upton, Chris [1]
Affiliations
[1] Univ Victoria, Dept Microbiol & Biochem, Victoria, BC V8W 3P6, Canada
DOI
10.1186/1471-2164-7-150
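Chinese Library Classification (CLC) codes
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005 [Microbiology]; 0836 [Bioengineering]; 090102 [Crop Genetics and Breeding]; 100705 [Microbial and Biochemical Pharmacy]
Abstract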
Background: Since DNA sequencing has become easier and cheaper, an increasing number of closely related viral genomes have been sequenced. However, many of these have been deposited in GenBank without annotations, severely limiting their value to researchers. While maintaining comprehensive genomic databases for a set of virus families at the Viral Bioinformatics Resource Center (http://www.biovirus.org) and Viral Bioinformatics - Canada (http://www.virology.ca), we found that researchers were unnecessarily spending time annotating viral genomes that were close relatives of already annotated viruses. We have therefore designed and implemented a novel tool, Genome Annotation Transfer Utility (GATU), to transfer annotations from a previously annotated reference genome to a new target genome, thereby greatly reducing this laborious task.
Results: GATU transfers annotations from a reference genome to a closely related target genome, while still giving the user final control over which annotations should be included. GATU also detects open reading frames (ORFs) present in the target but not the reference genome and provides the user with a variety of bioinformatics tools to quickly determine if these ORFs should also be included in the annotation. After this process is complete, GATU saves the newly annotated genome as a GenBank, EMBL or XML-format file. The software is coded in Java and runs on a variety of computer platforms. Its user-friendly graphical user interface is specifically designed for users trained in the biological sciences.
Conclusion: GATU greatly simplifies the initial stages of genome annotation by using a closely related genome as a reference. It is not intended to be a gene prediction tool or a "complete" annotation system, but we have found that it significantly reduces the time required for annotation of genes and mature peptides as well as helping to standardize gene names between related organisms by transferring reference genome annotations to the target genome.
Pages: 10