PASTEC: An Automatic Transposable Element Classification Tool

Cited by: 214
Authors
Hoede, Claire [1 ,2 ]
Arnoux, Sandie [1 ,3 ]
Moisset, Mark [1 ]
Chaumier, Timothee [1 ]
Inizan, Olivier [1 ]
Jamilloux, Veronique [1 ]
Quesneville, Hadi [1 ]
Affiliations
[1] INRA, URGI Res Unit Genom Info UR1164, F-78026 Versailles, France
[2] INRA, Plateforme Bioinformat Genotoul Math & Informat A, UR875, F-31326 Castanet Tolosan, France
[3] INRA, LUNAM Universite, BioEpAR UMR1300, F-44026 Nantes, France
Keywords
SEQUENCES;
DOI
10.1371/journal.pone.0091929
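Chinese Library Classification (CLC) codes
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes
070301 [Inorganic chemistry]; 070403 [Astrophysics]; 070507 [Natural resources and territorial spatial planning]; 090105 [Crop production systems and ecological engineering];
Abstract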
The classification of transposable elements (TEs) is a key step towards deciphering their potential impact on the genome. However, this process is often based on manual sequence inspection by TE experts. With the wealth of genomic sequences now available, this task requires automation, making it accessible to most scientists. We propose a new tool, PASTEC, which classifies TEs by searching for structural features and similarities. This tool outperforms currently available software for TE classification. The main innovation of PASTEC is the search for HMM profiles, which is useful for inferring the classification of unknown TEs on the basis of conserved functional domains of the proteins. In addition, PASTEC is the only tool providing an exhaustive spectrum of possible classifications to the order level of the Wicker hierarchical TE classification system. It can also automatically classify other repeated elements, such as SSRs (simple sequence repeats), rDNA or potential repeated host genes. Finally, the output of this new tool is designed to facilitate manual curation by providing biologists with all the evidence accumulated for each TE consensus.
Pages: 6
Related papers
9 in total
[1]
TEclass-a tool for automated classification of unknown eukaryotic transposable elements [J].
Abrusan, Gyorgy ;
Grundmann, Norbert ;
DeMeester, Luc ;
Makalowski, Wojciech .
BIOINFORMATICS, 2009, 25 (10) :1329-1330
[2]
Tandem repeats finder: a program to analyze DNA sequences [J].
Benson, G .
NUCLEIC ACIDS RESEARCH, 1999, 27 (02) :573-580
[3]
Discovering and detecting transposable elements in genome sequences [J].
Bergman, Casey M. ;
Quesneville, Hadi .
BRIEFINGS IN BIOINFORMATICS, 2007, 8 (06) :382-392
[4]
Accelerated Profile HMM Searches [J].
Eddy, Sean R. .
PLOS COMPUTATIONAL BIOLOGY, 2011, 7 (10)
[5]
Exploring Repetitive DNA Landscapes Using REPCLASS, a Tool That Automates the Classification of Transposable Elements in Eukaryotic Genomes [J].
Feschotte, Cedric ;
Keswani, Umeshkumar ;
Ranganathan, Nirmal ;
Guibotsy, Marcel L. ;
Levine, David .
GENOME BIOLOGY AND EVOLUTION, 2009, 1 :205-220
[6]
Considering Transposable Element Diversification in De Novo Annotation Approaches [J].
Flutre, Timothee ;
Duprat, Elodie ;
Feuillet, Catherine ;
Quesneville, Hadi .
PLOS ONE, 2011, 6 (01)
[7]
Repbase update, a database of eukaryotic repetitive elements [J].
Jurka, J ;
Kapitonov, VV ;
Pavlicek, A ;
Klonowski, P ;
Kohany, O ;
Walichiewicz, J .
CYTOGENETIC AND GENOME RESEARCH, 2005, 110 (1-4) :462-467
[8]
Permal Emmanuelle, 2012, Methods Mol Biol, V859, P53, DOI 10.1007/978-1-61779-603-6_3
[9]
A unified classification system for eukaryotic transposable elements [J].
Wicker, Thomas ;
Sabot, Francois ;
Hua-Van, Aurelie ;
Bennetzen, Jeffrey L. ;
Capy, Pierre ;
Chalhoub, Boulos ;
Flavell, Andrew ;
Leroy, Philippe ;
Morgante, Michele ;
Panaud, Olivier ;
Paux, Etienne ;
SanMiguel, Phillip ;
Schulman, Alan H. .
NATURE REVIEWS GENETICS, 2007, 8 (12) :973-982