The evolution of DNA regulatory regions for proteo-gamma bacteria by interspecies comparisons

Cited by: 80
Authors
Rajewsky, N [1]
Socci, ND [1]
Zapotocky, M [1]
Siggia, ED [1]
Affiliations
[1] Rockefeller Univ, Ctr Studies Phys & Biol, New York, NY 10021 USA
Keywords
DOI
10.1101/gr.207502. Article published online before print in January 2002
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
The comparison of homologous noncoding DNA for organisms a suitable evolutionary distance apart is a powerful tool for the identification of cis regulatory elements for transcription and translation and for the study of how they assemble into functional modules. We have fit the three parameters of an affine global probabilistic alignment algorithm to establish the background mutation rate of noncoding sequence between E. coli and a series of gamma proteobacteria ranging from Salmonella to Vibrio. The lower bound we find to the neutral mutation rate is sufficiently high, even for Salmonella, that most of the conservation of noncoding sequence is indicative of selective pressures rather than of insufficient time to evolve. We then use a local version of the alignment algorithm combined with our inferred background mutation rate to assign a significance to the degree of local sequence conservation between orthologous genes, and thereby deduce a probability profile for the upstream regulatory region of all E. coli protein-coding genes. We recover 75%-85% (depending on significance level) of all regulatory sites from a standard compilation for E. coli, and 66%-85% of sigma sites. We also trace the evolution of known regulatory sites and the groups associated with a given transcription factor. Furthermore, we find that approximately one-third of paralogous gene pairs in E. coli have a significant degree of correlation in their regulatory sequence. Finally, we demonstrate an inverse correlation between the rate of evolution of transcription factors and the number of genes they regulate. Our predictions are available at http://www.physics.rockefeller.edu/~siggia.
Pages: 298-308
Page count: 11