In search of lost introns

Cited by: 25
Authors
Csuroes, Miklos [1]
Holey, J. Andrew
Rogozin, Igor B.
Affiliations
[1] Univ Montreal, Dept Comp Sci & Operat Res, Quebec City, PQ, Canada
[2] St Johns Univ, Dept Comp Sci, Collegeville, MN 56321 USA
[3] Coll St Benedict, Collegeville, MN USA
[4] NIH, Natl Lib Med, Natl Ctr Biotechnol Informat, Bethesda, MD 20892 USA
DOI
10.1093/bioinformatics/btm190
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification code
071010; 081704;
Abstract
Many fundamental questions concerning the emergence and subsequent evolution of eukaryotic exon-intron organization are still unsettled. Genome-scale comparative studies, which can shed light on crucial aspects of eukaryotic evolution, require adequate computational tools. We describe novel computational methods for studying spliceosomal intron evolution. Our goal is to give a reliable characterization of the dynamics of intron evolution. Our algorithmic innovations address the identification of orthologous introns, and the likelihood-based analysis of intron data. We discuss a compression method for the evaluation of the likelihood function, which is noteworthy for phylogenetic likelihood problems in general. We prove that after O(nl) preprocessing time, subsequent evaluations take O(nl / log l) time almost surely in the Yule-Harding random model of n-taxon phylogenies, where l is the input sequence length. We illustrate the practicality of our methods by compiling and analyzing a data set involving 18 eukaryotes, which is more than in any other study to date. The study yields the surprising result that ancestral eukaryotes were fairly intron-rich. For example, the bilaterian ancestor is estimated to have had more than 90% as many introns as vertebrates do now.
Pages: I87-I96
Page count: 10