OrthologID: automation of genome-scale ortholog identification within a parsimony framework

Cited by: 69
Authors
Chiu, JC
Lee, EK
Egan, MG
Sarkar, IN
Coruzzi, GM [1]
DeSalle, R
Affiliations
[1] NYU, Dept Biol, New York, NY 10003 USA
[2] Amer Museum Nat Hist, Div Invertebrate Zool, New York, NY 10024 USA
[3] Amer Museum Nat Hist, Div Lib Serv, New York, NY 10024 USA
Funding
US National Science Foundation
DOI
10.1093/bioinformatics/btk040
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010 ; 081704 ;
Abstract
Motivation: The determination of gene orthology is a prerequisite for mining and utilizing the rapidly increasing amount of sequence data for genome-scale phylogenetics and comparative genomic studies. Until now, most researchers have used pairwise distance comparison algorithms, such as BLAST, COG, RBH, RSD and INPARANOID, to determine gene orthology. In contrast, orthology determination within a character-based phylogenetic framework has not been applied on a genomic scale, owing to the lack of efficiency and automation.
Results: We have developed OrthologID, a Web application that automates the labor-intensive procedures of gene orthology determination within a character-based phylogenetic framework, thus making character-based orthology determination on a genomic scale possible. In addition to generating gene family trees and determining orthologous gene sets for complete genomes, OrthologID can identify the diagnostic characters that define each orthologous gene set, as well as the diagnostic characters responsible for classifying query sequences from other genomes into specific orthology groups. The OrthologID database currently includes several complete plant genomes (Arabidopsis thaliana, Oryza sativa and Populus trichocarpa) as well as a unicellular outgroup, Chlamydomonas reinhardtii. To improve the general utility of OrthologID beyond plant species, we plan to expand our sequence database to include the fully sequenced genomes of prokaryotes and other non-plant eukaryotes.
Pages: 699-707
Number of pages: 9