PhyloGena - a user-friendly system for automated phylogenetic annotation of unknown sequences

Cited by: 25
Authors
Hanekamp, Kristian
Bohnebeck, Uta
Beszteri, Bank
Valentin, Klaus
Affiliations
[1] Alfred Wegener Inst Polar & Marine Res, D-27570 Bremerhaven, Germany
[2] Ctr Comp Technol TZI, D-28334 Bremen, Germany
[3] Technol Transfer Ctr, TTZ BIBIS, D-27568 Bremerhaven, Germany
Keywords
DOI
10.1093/bioinformatics/btm016
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Motivation: Phylogenomic approaches to the functional and evolutionary annotation of unknown sequences have been suggested to be superior to those based only on pairwise local alignments. However, user-friendly software tools that make the advantages of phylogenetic annotation available to the ever-widening range of bioinformatically uninitiated biologists involved in genome/EST annotation projects are not available. We were particularly confronted with this issue in the annotation of sequences from different groups of complex algae originating from secondary endosymbioses, where identifying the phylogenetic origin of genes is often more problematic than in taxa well represented in the databases (e.g. animals, plants or fungi).
Results: We present a flexible pipeline with a user-friendly, interactive graphical user interface running on desktop computers that automatically performs a basic local alignment search tool (BLAST) search of query sequences, selects a representative subset of the hits, creates a multiple alignment from the selected sequences, and finally computes a phylogenetic tree. The pipeline, named PhyloGena, uses public domain software for all standard bioinformatics tasks (similarity search, multiple alignment, and phylogenetic reconstruction). As the major technological innovation, selection of a meaningful subset of BLAST hits was implemented using logic programming, mimicking the selection procedure [...]. The results (BLAST tables, multiple alignments and phylogenetic trees) are displayed graphically, allowing the user to interact with the pipeline and deduce the function and phylogenetic origin of the query. PhyloGena thus makes phylogenomic annotation available also to biologists without access to large computing facilities and with little informatics background. Although phylogenetic annotation is particularly useful when working with composite genomes (e.g. from complex algae), PhyloGena can also be helpful in expressed sequence tag and genome annotation in other organisms.
Pages: 793-801
Number of pages: 9
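The abstract describes selecting a representative subset of BLAST hits by rules before alignment and tree building. The sketch below illustrates that general idea in Python only; it is not PhyloGena's logic-programming rule base, and the `Hit` record, the function name `select_representative_hits`, the thresholds, and the demo data are all hypothetical assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical, simplified BLAST hit record (not a real BLAST output format)."""
    accession: str  # made-up identifier
    taxon: str      # coarse taxonomic group of the hit
    evalue: float   # BLAST E-value; lower means more significant


def select_representative_hits(hits, max_evalue=1e-5, per_taxon=2, max_total=10):
    """Keep the best few hits per taxonomic group, then cap the total.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    # Rule 1: discard hits that are not significant enough.
    significant = [h for h in hits if h.evalue <= max_evalue]
    # Rule 2: rank remaining hits from most to least significant.
    ranked = sorted(significant, key=lambda h: h.evalue)
    # Rule 3: keep at most `per_taxon` hits from each group, so that
    # well-sampled groups do not crowd out the others.
    kept = []
    counts = {}
    for hit in ranked:
        if counts.get(hit.taxon, 0) < per_taxon:
            kept.append(hit)
            counts[hit.taxon] = counts.get(hit.taxon, 0) + 1
    # Rule 4: cap the overall number of sequences passed to the alignment step.
    return kept[:max_total]


# Made-up demo data for illustration only.
demo_hits = [
    Hit("A1", "Metazoa", 1e-50),
    Hit("A2", "Metazoa", 1e-48),
    Hit("A3", "Metazoa", 1e-45),
    Hit("B1", "Viridiplantae", 1e-30),
    Hit("C1", "Cyanobacteria", 1e-12),
    Hit("D1", "Fungi", 0.5),
]

if __name__ == "__main__":
    for hit in select_representative_hits(demo_hits):
        print(hit.accession, hit.taxon, hit.evalue)
```

With the demo data, the third Metazoa hit is dropped by the per-group limit and the Fungi hit by the E-value threshold. A declarative rule base, as the abstract describes, would state such criteria as logic rules rather than as a fixed loop.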