RIO: Analyzing proteomes by automated phylogenomics using resampled inference of orthologs

Cited by: 129
Authors
Zmasek, CM
Eddy, SR [1]
Affiliations
[1] Washington Univ, Sch Med, Howard Hughes Med Inst, St Louis, MO 63110 USA
[2] Washington Univ, Sch Med, Dept Genet, St Louis, MO 63110 USA
DOI
10.1186/1471-2105-3-14
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: When analyzing protein sequences using sequence similarity searches, orthologous sequences (that diverged by speciation) are more reliable predictors of a new protein's function than paralogous sequences (that diverged by gene duplication). The utility of phylogenetic information in high-throughput genome annotation ("phylogenomics") is widely recognized, but existing approaches are either manual or not explicitly based on phylogenetic trees. Results: Here we present RIO (Resampled Inference of Orthologs), a procedure for automated phylogenomics using explicit phylogenetic inference. RIO analyses are performed over bootstrap resampled phylogenetic trees to estimate the reliability of orthology assignments. We also introduce supplementary concepts that are helpful for functional inference. RIO has been implemented as a Perl pipeline connecting several C and Java programs. It is available at [http://www.genetics.wustl.edu/eddy/forester/]. A web server is at [http://www.rio.wustl.edu/]. RIO was tested on the Arabidopsis thaliana and Caenorhabditis elegans proteomes. Conclusion: The RIO procedure is particularly useful for the automated detection of first representatives of novel protein subfamilies. We also describe how some orthologies can be misleading for functional inference.
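The idea of estimating the reliability of an orthology assignment over bootstrap resampled trees can be illustrated with a minimal sketch. It is not the RIO implementation. It assumes rooted binary gene trees given as nested tuples, with leaves written as hypothetical `(gene_name, species)` pairs, and it labels an internal node as a duplication whenever its two subtrees share a species. That species-overlap rule is a simplification chosen for illustration, not a method stated in the abstract. The function names `orthologs_of` and `ortholog_support` are likewise hypothetical.

```python
from collections import Counter

# Assumed tree encoding (illustrative only): a leaf is (gene_name, species),
# an internal node is (left_subtree, right_subtree).

def is_leaf(node):
    return isinstance(node[0], str)

def leaves(node):
    """Return all (gene_name, species) leaves below a node."""
    if is_leaf(node):
        return [node]
    return leaves(node[0]) + leaves(node[1])

def orthologs_of(tree, query):
    """Genes whose last common ancestor with `query` is a speciation node.

    Assumption: a node is a duplication if its two subtrees share a species,
    and a speciation otherwise.
    """
    if is_leaf(tree):
        return set()
    left, right = tree
    left_leaves, right_leaves = leaves(left), leaves(right)
    if query in {gene for gene, _ in left_leaves}:
        same, same_leaves, other_leaves = left, left_leaves, right_leaves
    elif query in {gene for gene, _ in right_leaves}:
        same, same_leaves, other_leaves = right, right_leaves, left_leaves
    else:
        return set()
    # Orthologs found deeper in the subtree that contains the query.
    result = orthologs_of(same, query)
    same_species = {species for _, species in same_leaves}
    other_species = {species for _, species in other_leaves}
    if not (same_species & other_species):
        # Speciation node: every gene on the other side is an ortholog.
        result |= {gene for gene, _ in other_leaves}
    return result

def ortholog_support(bootstrap_trees, query):
    """Fraction of bootstrap trees in which each gene is orthologous to `query`."""
    counts = Counter()
    for tree in bootstrap_trees:
        counts.update(orthologs_of(tree, query))
    total = len(bootstrap_trees)
    return {gene: count / total for gene, count in counts.items()}

# Three made-up bootstrap replicates for a query gene "q".
trees = [
    ((("q", "worm"), ("a", "fly")), ("b", "worm")),
    ((("q", "worm"), ("b", "worm")), ("a", "fly")),
    (("q", "worm"), (("a", "fly"), ("b", "worm"))),
]
support = ortholog_support(trees, "q")
```

In this toy input, gene `a` is inferred as an ortholog of `q` in two of the three replicates, while `b` never is, so only `a` appears in `support`. A support value close to 1 would indicate an orthology assignment that is stable under resampling.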
Pages: 19