Operon prediction by comparative genomics: an application to the Synechococcus sp. WH8102 genome

Cited by: 51
Authors
Chen, X
Su, Z
Dam, P
Palenik, B
Xu, Y
Jiang, T
Affiliations
[1] Univ Calif Riverside, Dept Comp Sci & Engn, Riverside, CA 92521 USA
[2] Univ Georgia, Dept Biochem & Mol Biol, Athens, GA 30602 USA
[3] Oak Ridge Natl Lab, Inst Computat Biol, Oak Ridge, TN USA
[4] Univ Calif San Diego, Scripps Inst Oceanog, La Jolla, CA 92093 USA
Funding
National Science Foundation (USA)
DOI
10.1093/nar/gkh510
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
We present a computational method for operon prediction based on a comparative genomics approach. A group of consecutive genes is considered a candidate operon if both the gene sequences and the gene functions are conserved across several phylogenetically related genomes. Supporting evidence for operons is also collected with public domain computer programs and used in the prediction method; this evidence includes predicted conserved gene functions, promoter motifs and terminators. An apparent advantage of our approach over other operon prediction methods is that it does not require extensive experimental data (such as gene expression data and pathway data) as input. This makes it applicable to many newly sequenced genomes for which little experimental information is available. To validate our predictions, we tested the method on Escherichia coli K12, whose operon structures have been extensively studied, through a comparative analysis against Haemophilus influenzae Rd and Salmonella typhimurium LT2. Our method successfully predicted most of the 237 known operons. After this initial validation, we applied the method to a newly sequenced and annotated microbial genome, Synechococcus sp. WH8102, through a comparative genome analysis with two other cyanobacterial genomes, Prochlorococcus marinus sp. MED4 and P. marinus sp. MIT9313. Our results are consistent with previously reported results and statistics on operons in the literature.
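The core idea in the abstract, that consecutive genes form a candidate operon when their arrangement is conserved in related genomes, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the function names, the ortholog tables, the `max_gap` and `min_support` parameters, and the same-strand requirement are all assumptions made for illustration, and conserved ortholog adjacency stands in here for the sequence and function conservation the abstract describes. Promoter and terminator evidence is left out.

```python
# Hypothetical sketch: group adjacent, same-strand genes whose orthologs are
# also neighbors in enough reference genomes. All names are illustrative.

def conserved_pair(gene_a, gene_b, orthologs, ref_positions, max_gap=1):
    """Return True if the orthologs of both genes lie on the same strand and
    within max_gap positions of each other in one reference genome."""
    ortho_a = orthologs.get(gene_a)
    ortho_b = orthologs.get(gene_b)
    if ortho_a not in ref_positions or ortho_b not in ref_positions:
        return False  # at least one gene has no placed ortholog
    index_a, strand_a = ref_positions[ortho_a]
    index_b, strand_b = ref_positions[ortho_b]
    return strand_a == strand_b and abs(index_a - index_b) <= max_gap


def candidate_operons(target, references, min_support=1):
    """target: list of (gene_id, strand) in chromosomal order.
    references: list of (orthologs, ref_positions) pairs, one per genome.
    Returns runs of two or more genes linked by conserved adjacency."""
    operons, current = [], []
    for (gene, strand), nxt in zip(target, target[1:] + [None]):
        current.append(gene)
        linked = False
        if nxt is not None and nxt[1] == strand:
            # Count the reference genomes that preserve this adjacency.
            support = sum(
                conserved_pair(gene, nxt[0], orth, pos)
                for orth, pos in references
            )
            linked = support >= min_support
        if not linked:
            if len(current) >= 2:
                operons.append(current)
            current = []
    return operons


# Toy data (made up): five target genes and one reference genome.
target = [("a", "+"), ("b", "+"), ("c", "+"), ("d", "-"), ("e", "-")]
orthologs = {"a": "A", "b": "B", "c": "C", "d": "D", "e": "E"}
ref_positions = {"A": (0, "+"), "B": (1, "+"), "C": (5, "+"),
                 "D": (6, "-"), "E": (7, "-")}
result = candidate_operons(target, [(orthologs, ref_positions)])
print(result)
```

In this toy input, genes a and b are grouped because their orthologs are adjacent in the reference, c is split off because its ortholog lies far from that of b, and d and e form a second group on the opposite strand. Raising `min_support` would require the adjacency to hold in more reference genomes before two genes are linked.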
Pages: 2147-2157
Number of pages: 11