GeSeq - versatile and accurate annotation of organelle genomes

Cited by: 2614
Authors
Tillich, Michael [1]
Lehwark, Pascal [2]
Pellizzer, Tommaso [1]
Ulbricht-Jones, Elena S. [1]
Fischer, Axel [1]
Bock, Ralph [1]
Greiner, Stephan [1]
Affiliations
[1] Max Planck Inst Mol Pflanzenphysiol, Muhlenberg 1, D-14476 Potsdam, Germany
[2] Glogauer Str 31, D-10999 Berlin, Germany
Keywords
MULTIPLE SEQUENCE ALIGNMENT; TRANSFER-RNA GENES; MITOCHONDRIAL GENOMES; NUCLEOTIDE-SEQUENCES; PLANTS; PROGRAM; MUSCLE; MAPS; TOOL;
DOI
10.1093/nar/gkx391
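Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
070307 [Chemical Biology]; 071010 [Biochemistry and Molecular Biology];
Abstract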
We have developed the web application GeSeq (https://chlorobox.mpimp-golm.mpg.de/geseq.html) for the rapid and accurate annotation of organellar genome sequences, in particular chloroplast genomes. In contrast to existing tools, GeSeq combines batch processing with a fully customizable selection of reference sequences, drawn from organellar genome records at NCBI and/or references uploaded by the user. For the annotation of chloroplast genomes, the application additionally provides an integrated database of manually curated reference sequences. GeSeq identifies genes or other feature-encoding regions by BLAT-based homology searches and, additionally, by profile HMM searches for protein and rRNA coding genes and two de novo predictors for tRNA genes. These unique features enable the user to conveniently compare the annotations of different state-of-the-art methods, thus supporting high-quality annotations. The main output of GeSeq is a GenBank file that usually requires only minimal curation and is instantly visualized by OGDRAW. GeSeq also offers a variety of optional additional outputs that facilitate downstream analyses, for example comparative genomic or phylogenetic studies.
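The idea of comparing annotations from different methods can be illustrated with a minimal Python sketch. It is not GeSeq's implementation: the tuple layout, the method labels, the coordinates, and the function name `compare_calls` are all assumptions made for illustration. The sketch groups feature calls by gene and flags genes for which the methods report different coordinates, which is where manual curation would most likely be needed.

```python
from collections import defaultdict

# Hypothetical feature calls: (gene, method, start, end, strand).
# Coordinates and method labels are invented for illustration only.
calls = [
    ("trnH-GUG", "BLAT", 10, 83, "-"),
    ("trnH-GUG", "tRNA predictor A", 10, 83, "-"),
    ("trnH-GUG", "tRNA predictor B", 12, 83, "-"),
    ("psbA", "BLAT", 500, 1561, "-"),
    ("psbA", "profile HMM", 500, 1561, "-"),
]


def compare_calls(calls):
    """Group calls by gene and report whether all methods agree."""
    by_gene = defaultdict(dict)
    for gene, method, start, end, strand in calls:
        by_gene[gene][method] = (start, end, strand)

    report = {}
    for gene, per_method in by_gene.items():
        # A gene is "consistent" when every method reports the same
        # start, end, and strand.
        distinct = set(per_method.values())
        report[gene] = {
            "methods": sorted(per_method),
            "consistent": len(distinct) == 1,
            "coordinates": per_method,
        }
    return report


report = compare_calls(calls)
for gene, info in report.items():
    status = "agree" if info["consistent"] else "DISAGREE"
    print(f"{gene}: {status} across {', '.join(info['methods'])}")
```

In this toy input, the two methods for `psbA` agree, while one tRNA predictor places the start of `trnH-GUG` two bases later than the others, so only the latter is flagged.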
Pages: W6-W11
Number of pages: 6