PhyloGenie: automated phylome generation and analysis

Cited by: 86
Authors
Frickey, T [1]
Lupas, AN [1]
Affiliations
[1] Max Planck Inst Dev Biol, Dept Prot Evolut, D-72076 Tubingen, Germany
DOI
10.1093/nar/gkh867
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Phylogenetic reconstruction is the method of choice to determine the homologous relationships between sequences. Difficulties in producing high-quality alignments, which are the basis of good trees, and in automating the analysis of trees have unfortunately limited the use of phylogenetic reconstruction methods to individual genes or gene families. Due to the large number of sequences involved, phylogenetic analyses of proteomes preclude manual steps and therefore require a high degree of automation in sequence selection, alignment, phylogenetic inference and analysis of the resulting set of trees. We present a set of programs that automates the steps from seed sequence to phylogeny and a utility to extract all phylogenies that match specific topological constraints from a database of trees. Two example applications that show the type of questions that can be answered by phylome analysis are provided. The generation and analysis of the Thermoplasma acidophilum phylome with regard to lateral gene transfer between Thermoplasmata and Sulfolobus showed best BLAST hits to be far less reliable indicators of lateral transfer than the corresponding protein phylogenies. The generation and analysis of the Danio rerio phylome provided more than twice as many proteins as described previously, supporting the hypothesis of an additional round of genome duplication in the actinopterygian lineage.
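The idea of extracting phylogenies that match a topological constraint can be illustrated with a minimal Python sketch. It is not PhyloGenie's own utility or interface: the function names, the taxon prefixes (`Ta_`, `Sso_`, `Mja_`, `Pfu_`), the tree labels and the toy Newick strings are all hypothetical, and the trees are treated as rooted for simplicity. The constraint checked here is whether a tree contains a clade made up exclusively of sequences from two chosen groups, with at least one member of each.

```python
def parse_newick(text):
    """Parse a simple Newick string (unquoted labels) into nested lists of leaf names."""
    text = text.strip().rstrip(";")
    pos = 0

    def read_label():
        # Read up to the next structural character; drop any ":branch_length" suffix.
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] not in "(),":
            pos += 1
        return text[start:pos].split(":")[0].strip()

    def read_node():
        nonlocal pos
        if pos < len(text) and text[pos] == "(":
            pos += 1  # consume "("
            children = [read_node()]
            while text[pos] == ",":
                pos += 1
                children.append(read_node())
            pos += 1  # consume ")"
            read_label()  # discard internal label, support value and branch length
            return children
        return read_label()

    return read_node()


def leaves(node):
    """Return all leaf names below a node."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node for leaf in leaves(child)]


def has_exclusive_clade(node, prefix_a, prefix_b):
    """True if some subtree holds only leaves from the two groups, with at least one of each."""
    if isinstance(node, str):
        return False
    names = leaves(node)
    in_a = [n for n in names if n.startswith(prefix_a)]
    in_b = [n for n in names if n.startswith(prefix_b)]
    if in_a and in_b and len(in_a) + len(in_b) == len(names):
        return True
    return any(has_exclusive_clade(child, prefix_a, prefix_b) for child in node)


# Hypothetical toy "database" of trees keyed by seed sequence.
trees = {
    "Ta0001": "((Ta_0001:0.1,Sso_0042:0.2)0.95:0.3,(Mja_0007:0.4,Pfu_0100:0.5));",
    "Ta0002": "((Ta_0002,Mja_0010),(Sso_0050,Pfu_0200));",
}
matches = [name for name, nwk in trees.items()
           if has_exclusive_clade(parse_newick(nwk), "Ta_", "Sso_")]
print(matches)  # ['Ta0001']
```

Only the first toy tree groups the `Ta_` and `Sso_` sequences in a clade of their own, so it is the only match. A real analysis would also need to handle unrooted trees and branch support, which this sketch ignores.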
Pages: 5231-5238
Page count: 8