Meta-analysis discovery of tissue-specific DNA sequence motifs from mammalian gene expression data

Cited by: 24
Authors
Huber, Bertrand R.
Bulyk, Martha L. [1]
Affiliations
[1] Brigham & Womens Hosp, Dept Med, Div Genet, Boston, MA 02115 USA
[2] Harvard Univ, Sch Med, Boston, MA 02115 USA
[3] Brigham & Womens Hosp, Dept Pathol, Boston, MA 02115 USA
[4] Harvard Univ, Sch Med, Harvard Mit Div Hlth Sci & Technol, Boston, MA 02115 USA
DOI
10.1186/1471-2105-7-229
Chinese Library Classification
Q5 [Biochemistry];
Subject classification codes
071010; 081704;
Abstract
Background: A key step in the regulation of gene expression is the sequence-specific binding of transcription factors (TFs) to their DNA recognition sites. However, elucidating TF binding site (TFBS) motifs in higher eukaryotes has been challenging, even when employing cross-species sequence conservation. We hypothesized that many orthologous genes expressed in a similarly tissue-specific manner in both human and mouse gene expression data are likely to be co-regulated by orthologous TFs that bind to DNA sequence motifs present within noncoding sequence conserved between these genomes.
Results: We performed automated motif searching and merging across four different motif finding algorithms, followed by filtering of the resulting motifs for those that contain blocks of information content. Applying this motif finding strategy to conserved noncoding regions surrounding coexpressed tissue-specific human genes allowed us to discover both previously known and many novel candidate regulatory DNA motifs in all 18 tissue-specific expression clusters that we examined. For previously known TFBS motifs, we observed that if a TF was expressed in the specified tissue of interest, then in most cases we identified a motif that matched its TRANSFAC motif; conversely, of all those discovered motifs that matched TRANSFAC motifs, most of the corresponding TF transcripts were expressed in the tissue(s) corresponding to the expression cluster for which the motif was found.
Conclusion: Our results indicate that the integration of the results from multiple motif finding tools identifies and ranks highly more known and novel motifs than does the use of just one of these tools. In addition, we believe that our simultaneous enrichment strategies helped to identify likely human cis regulatory elements. A number of the discovered motifs may correspond to novel binding site motifs for as yet uncharacterized tissue-specific TFs. We expect this strategy to be useful for identifying motifs in other metazoan genomes.
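The abstract mentions filtering motifs for those that contain blocks of information content but does not give the criteria. As a rough illustration only, the sketch below computes per-column information content (relative entropy against a uniform background) and checks for a run of consecutive high-information columns. The function names, the uniform background, and the thresholds `min_bits` and `min_width` are assumptions made for this example, not values from the paper.

```python
import math

# Assumed uniform background; real analyses often use genome-specific frequencies.
UNIFORM = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

def column_information(freqs, background=UNIFORM):
    """Information content in bits of one motif column (relative entropy)."""
    return sum(p * math.log2(p / background[base])
               for base, p in freqs.items() if p > 0)

def has_information_block(motif, min_bits=1.0, min_width=3):
    """True if the motif has at least min_width consecutive columns
    whose information content is >= min_bits (hypothetical thresholds)."""
    run = 0
    for column in motif:
        if column_information(column) >= min_bits:
            run += 1
            if run >= min_width:
                return True
        else:
            run = 0  # a low-information column breaks the block
    return False

# Illustrative columns: one fully conserved, one uninformative.
conserved = {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0}
flat = dict(UNIFORM)

print(has_information_block([flat, conserved, conserved, conserved, flat]))
print(has_information_block([conserved, flat, conserved, flat, conserved]))
```

Under these assumptions the first motif passes because it has three adjacent conserved columns, while the second fails because its conserved columns are interrupted by uninformative ones.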
Pages: 25