Cis-motifs upstream of the transcription and translation initiation sites are effectively revealed by their positional disequilibrium in eukaryote genomes using frequency distribution curves

Cited by: 48
Authors
Berendzen, Kenneth W.
Stueber, Kurt
Harter, Klaus
Wanke, Dierk
Affiliations
[1] Max Planck Inst Plant Breeding Res, D-50829 Cologne, Germany
[2] Univ Tübingen, ZMBP Pflanzenphysiol, D-72076 Tübingen, Germany
DOI
10.1186/1471-2105-7-522
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Background: The discovery of cis-regulatory motifs remains a challenging task even though the number of sequenced genomes is constantly growing. Computational analyses using pattern search algorithms have been valuable in phylogenetic footprinting approaches, as have expression profile experiments for predicting co-occurring motifs. Surprisingly little is known about how cis-regulatory elements (CREs) are distributed in promoters.
Results: In this paper we used Motif Mapper, an open-source collection of Visual Basic scripts, for the analysis of motifs in any aligned set of DNA sequences. We focused on promoter motif distribution curves to identify positional over-representation of DNA motifs. Using differentially aligned datasets from the model species Arabidopsis thaliana, Caenorhabditis elegans, Drosophila melanogaster and Saccharomyces cerevisiae, we demonstrated the importance of position and orientation for motif discovery. Analysis with known CREs and all possible hexanucleotides showed that some functional elements gather close to the transcription and translation initiation sites and that elements other than the TATA-box motif are conserved between eukaryote promoters. While a high background frequency usually decreases the effectiveness of such an enumerative investigation, we improved our analysis by constructing motif distribution maps from large datasets.
Conclusion: This is the first study to reveal positional over-representation of CREs and promoter motifs in a cross-species approach. CREs and motifs shared between eukaryotic promoters support the observation that a eukaryotic promoter structure has been conserved throughout evolutionary time. Furthermore, information on the positional enrichment of a motif or a known functional CRE gives a more detailed insight into where an element appears to function. This in turn might accelerate the in-depth examination of known and as yet unknown cis-regulatory sequences in the laboratory.
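The idea of a motif distribution curve described in the abstract can be illustrated with a minimal Python sketch. This is not Motif Mapper (which the abstract describes as a collection of Visual Basic scripts) and does not reproduce the authors' procedure; the function names, the `bin_size` parameter, and the assumption that every sequence is aligned at its 3' end (for example at the transcription start site) are hypothetical choices made for illustration only.

```python
from collections import Counter

# Translation table for complementing DNA bases (illustrative helper).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def motif_positions(seq, motif):
    """Return start indices of all occurrences of motif in seq, overlaps included."""
    seq, motif = seq.upper(), motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def distribution_curve(sequences, motif, bin_size=10, both_strands=True):
    """Count motif occurrences per position bin across aligned sequences.

    Assumption: each sequence ends at the alignment anchor, so a motif
    starting at index i lies at relative position i - len(seq) (negative,
    i.e. upstream). A bin labelled b covers positions b .. b + bin_size - 1.
    """
    patterns = {motif.upper()}
    if both_strands:
        # A set avoids double counting palindromic motifs.
        patterns.add(reverse_complement(motif))
    counts = Counter()
    for seq in sequences:
        for pattern in patterns:
            for start in motif_positions(seq, pattern):
                relative = start - len(seq)
                counts[(relative // bin_size) * bin_size] += 1
    return dict(sorted(counts.items()))
```

Calling `distribution_curve(promoters, "TATAAA")` on a list of equally anchored promoter strings returns a mapping from bin label to count. A peak in a narrow range of bins, compared with the flat background elsewhere, is the kind of positional over-representation the abstract refers to. Setting `both_strands=False` restricts the count to one orientation.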
Pages: 19