Annotation, submission and screening of repetitive elements in Repbase: RepbaseSubmitter and Censor

Cited by: 671
Authors
Kohany, Oleksiy
Gentles, Andrew J.
Hankus, Lukasz
Jurka, Jerzy
Affiliations
[1] Genet Informat Res Inst, Mountain View, CA 94043 USA
[2] Stanford Univ, Sch Med, Stanford, CA 94301 USA
DOI
10.1186/1471-2105-7-474
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: Repbase is a reference database of eukaryotic repetitive DNA, which includes prototypic sequences of repeats and basic information described in annotations. Updating and maintenance of the database require specialized tools, which we have created and made available for use with Repbase, and which may be useful as a template for other curated databases. Results: We describe the software tools RepbaseSubmitter and Censor, which are designed to facilitate updating and screening the content of Repbase. RepbaseSubmitter is a Java-based interface for formatting and annotating Repbase entries. It eliminates many common formatting errors and automates actions such as calculation of sequence lengths and composition, thus facilitating curation of Repbase sequences. In addition, it has several features for predicting protein coding regions in sequences; searching and including PubMed references in Repbase entries; and searching the NCBI taxonomy database for correct inclusion of species information and taxonomic position. Censor is a tool to rapidly identify repetitive elements by comparison to known repeats. It uses WU-BLAST for speed and sensitivity, and can conduct DNA-DNA, DNA-protein, or translated DNA-translated DNA searches of genomic sequence. Defragmented output includes a map of repeats present in the query sequence, with the options to report masked query sequence(s), repeat sequences found in the query, and alignments. Conclusion: Censor and RepbaseSubmitter are available as both web-based services and downloadable versions. They can be found at http://www.girinst.org/repbase/submission.html (RepbaseSubmitter) and http://www.girinst.org/censor/index.php (Censor).
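The abstract mentions two mechanical operations: calculating sequence length and composition (RepbaseSubmitter) and reporting a masked query sequence from a map of repeats (Censor). The sketch below illustrates both in Python under stated assumptions. It is not the code of either tool; the function names `sequence_summary` and `mask_repeats`, the 0-based half-open interval convention, and the `N` mask character are all hypothetical choices made for illustration.

```python
from collections import Counter


def sequence_summary(seq: str) -> dict:
    """Return the length and base counts of a DNA sequence.

    Hypothetical helper: whitespace is stripped and letters are upper-cased
    before counting; anything other than A, C, G, T is tallied as "other".
    """
    cleaned = "".join(seq.split()).upper()
    counts = Counter(cleaned)
    bases = {base: counts.get(base, 0) for base in "ACGT"}
    other = len(cleaned) - sum(bases.values())
    return {"length": len(cleaned), **bases, "other": other}


def mask_repeats(seq: str, intervals, mask_char: str = "N") -> str:
    """Replace every repeat interval in seq with mask_char.

    Assumption: intervals are (start, end) pairs, 0-based and half-open,
    as they might be read from a repeat map.
    """
    chars = list(seq)
    for start, end in intervals:
        if not 0 <= start <= end <= len(chars):
            raise ValueError(f"interval out of range: ({start}, {end})")
        # Overwrite the interval with a run of mask characters of equal length.
        chars[start:end] = mask_char * (end - start)
    return "".join(chars)


if __name__ == "__main__":
    print(sequence_summary("acgt nnA\n"))
    print(mask_repeats("ACGTACGT", [(2, 5)]))
```

Masking with a fixed-length run keeps the coordinates of the query unchanged, so positions in a repeat map would still line up with the masked sequence.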
Pages: 7