Accurate identification of orthologous segments among multiple genomes

Cited by: 25
Authors
Hachiya, Tsuyoshi [1]
Osana, Yasunori [2]
Popendorf, Kris [1]
Sakakibara, Yasubumi [1]
Affiliations
[1] Keio Univ, Dept Biosci & Informat, Tokyo 108, Japan
[2] Seikei Univ, Dept Comp & Informat Sci, Musashino, Tokyo, Japan
Keywords
REARRANGEMENTS; MOUSE; DATABASE; EVOLUTION; SEQUENCES; REGIONS; VISUALIZATION; HOMOLOGY; PARALOGS; LESSONS;
DOI
10.1093/bioinformatics/btp070
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: The accurate detection of orthologous segments (also referred to as syntenic segments) plays a key role in comparative genomics, as it is useful for inferring genome rearrangement scenarios and computing whole-genome alignments. Although a number of algorithms for detecting orthologous segments have been proposed, none of them contain a framework for optimizing their parameter values.
Methods: In the present study, we propose an algorithm, named OSfinder (Orthologous Segment finder), which uses a novel scoring scheme based on stochastic models. OSfinder takes as input the positions of short homologous regions (also referred to as anchors) and explicitly discriminates orthologous anchors from non-orthologous anchors by using Markov chain models which represent respective geometric distributions of lengths of orthologous and non-orthologous anchors. Such stochastic modeling makes it possible to optimize parameter values by maximizing the likelihood of the input dataset, and to automate the setting of the optimal parameter values.
Results: We validated the accuracies of orthology-mapping algorithms on the basis of their consistency with the orthology annotation of genes. Our evaluation tests using mammalian and bacterial genomes demonstrated that OSfinder shows higher accuracy than previous algorithms.
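The Methods statement can be illustrated with a toy example. The sketch below is not OSfinder's implementation; it only assumes that a length L follows a geometric distribution P(L = k) = p^(k-1) (1 - p), as produced by a Markov chain with self-transition probability p, and it estimates p by maximum likelihood for each class. The function names, the clamping constant, and the toy length lists are all hypothetical.

```python
import math

def fit_continue_prob(lengths, eps=1e-9):
    """Maximum-likelihood estimate of the self-transition probability p
    for P(L = k) = p**(k - 1) * (1 - p), with k >= 1.
    The closed form is p = 1 - n / sum(lengths)."""
    if not lengths or min(lengths) < 1:
        raise ValueError("lengths must be a non-empty list of values >= 1")
    p = 1.0 - len(lengths) / sum(lengths)
    # Clamp away from 0 and 1 so that the logarithms below stay finite.
    return min(max(p, eps), 1.0 - eps)

def log_geometric(k, p):
    """Log-probability of length k under the geometric model."""
    return (k - 1) * math.log(p) + math.log(1.0 - p)

def log_odds(k, p_orth, p_non):
    """Log-likelihood ratio: positive values favour the orthologous model."""
    return log_geometric(k, p_orth) - log_geometric(k, p_non)

# Hypothetical training lengths for the two classes (illustration only).
orth_lengths = [8, 12, 10, 10]
non_lengths = [1, 2, 3, 2]

p_orth = fit_continue_prob(orth_lengths)
p_non = fit_continue_prob(non_lengths)

for k in (1, 10):
    score = log_odds(k, p_orth, p_non)
    label = "orthologous" if score > 0 else "non-orthologous"
    print(f"length {k}: log-odds {score:.3f} -> {label}")
```

Because the estimate has a closed form, no manual parameter tuning is needed in this toy setting, which mirrors the abstract's point about automating parameter selection through likelihood maximization.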
Pages: 853-860
Number of pages: 8