Automated ortholog inference from phylogenetic trees and calculation of orthology reliability

Cited by: 108
Authors
Storm, CEV [1]
Sonnhammer, ELL [1]
Affiliations
[1] Karolinska Inst, Ctr Genom & Bioinformat, S-17177 Stockholm, Sweden
DOI
10.1093/bioinformatics/18.1.92
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: Orthologous proteins in different species are likely to have similar biochemical function and biological role. When annotating a newly sequenced genome by sequence homology, the most precise and reliable functional information can thus be derived from orthologs in other species. A standard method of finding orthologs is to compare the sequence tree with the species tree. However, since the topology of a phylogenetic tree is not always reliable, this comparison can yield incorrect orthology assignments.
Results: Here we present a novel method that resolves this problem by analyzing a set of bootstrap trees instead of the optimal tree. The frequency of orthology assignments in the bootstrap trees can be interpreted as a support value for the possible orthology of the sequences. Our method is efficient enough to analyze data on the scale of whole genomes. It is implemented in Java and calculates orthology support levels for all pairwise combinations of homologous sequences of two species. The method was tested on simulated datasets and on real data of homologous proteins.
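The idea of reading orthology support off the bootstrap trees can be illustrated with a minimal Python sketch. It is not the authors' Java implementation, and several things in it are assumptions made only for illustration: trees are rooted, binary, and written as nested 2-tuples; leaf labels follow a hypothetical `SPECIES_gene` convention; and a simple species-overlap rule stands in for the comparison with the species tree (a node counts as a speciation when its two subtrees share no species, otherwise as a duplication). The helper names `species_of`, `ortholog_pairs`, and `orthology_support` are likewise hypothetical.

```python
from collections import Counter
from itertools import product


def species_of(leaf):
    # Assumed label convention: "SPECIES_gene", e.g. "A_1".
    return leaf.split("_", 1)[0]


def ortholog_pairs(tree):
    """Return (leaves, pairs) for a rooted binary tree of nested 2-tuples.

    A pair of leaves is called orthologous when the node joining them is a
    speciation, i.e. its two subtrees contain disjoint sets of species
    (species-overlap rule, used here in place of a species-tree comparison).
    """
    if isinstance(tree, str):
        return [tree], set()
    left, right = tree
    left_leaves, left_pairs = ortholog_pairs(left)
    right_leaves, right_pairs = ortholog_pairs(right)
    pairs = left_pairs | right_pairs
    left_species = {species_of(x) for x in left_leaves}
    right_species = {species_of(x) for x in right_leaves}
    if not (left_species & right_species):  # speciation node
        for a, b in product(left_leaves, right_leaves):
            pairs.add(frozenset((a, b)))
    return left_leaves + right_leaves, pairs


def orthology_support(bootstrap_trees):
    """Fraction of bootstrap trees in which each pair is called orthologous."""
    counts = Counter()
    for tree in bootstrap_trees:
        _, pairs = ortholog_pairs(tree)
        counts.update(pairs)
    n = len(bootstrap_trees)
    return {pair: count / n for pair, count in counts.items()}


if __name__ == "__main__":
    # Two made-up topologies for two genes in each of species A and B.
    t1 = (("A_1", "B_1"), ("A_2", "B_2"))
    t2 = (("A_1", "B_2"), ("A_2", "B_1"))
    for pair, value in orthology_support([t1, t1, t1, t2]).items():
        print(sorted(pair), value)
```

A pair that is joined by a speciation node in most bootstrap replicates receives a support value close to 1, while a pair whose assignment depends on an unstable branch receives a low value, which is the kind of reliability measure the abstract describes.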
Pages: 92-99
Page count: 8