Identification of regulatory regions which confer muscle-specific gene expression

Cited by: 308
Authors
Wasserman, WW [1]
Fickett, JW [1]
Affiliations
[1] SmithKline Beecham Pharmaceut, Bioinformat Res Grp, King Of Prussia, PA 19406 USA
Keywords
muscle-specific expression; logistic regression analysis; position weight matrix; regulatory region prediction; phylogenetic footprinting
DOI
10.1006/jmbi.1998.1700
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification code
071010; 081704
Abstract
For many newly sequenced genes, sequence analysis of the putative protein yields no clue on function. It would be beneficial to be able to identify in the genome the regulatory regions that confer temporal and spatial expression patterns for the uncharacterized genes. Additionally, it would be advantageous to identify regulatory regions within genes of known expression pattern without performing the costly and time consuming laboratory studies now required. To achieve these goals, the wealth of case studies performed over the past 15 years will have to be collected into predictive models of expression. Extensive studies of genes expressed in skeletal muscle have identified specific transcription factors which bind to regulatory elements to control gene expression. However, potential binding sites for these factors occur with sufficient frequency that it is rare for a gene to be found without one. Analysis of experimentally determined muscle regulatory sequences indicates that muscle expression requires multiple elements in close proximity. A model is generated with predictive capability for identifying these muscle-specific regulatory modules. Phylogenetic footprinting, the identification of sequences conserved between distantly related species, complements the statistical predictions. Through the use of logistic regression analysis, the model promises to be easily modified to take advantage of the elucidation of additional factors, cooperation rules, and spacing constraints. (C) 1998 Academic Press Limited.
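The abstract names two techniques, position weight matrices and logistic regression, without showing how they fit together. The following is a minimal sketch of one plausible combination, not the authors' model: the motif counts, the weights, the intercept, and all function names are hypothetical values chosen for illustration, and a real model would fit the weights to experimentally determined regulatory regions.

```python
import math

BASES = "ACGT"

# Hypothetical toy counts for a 4-bp motif (consensus AGCA); NOT from the paper.
TOY_COUNTS = {
    "A": [7, 1, 1, 7],
    "C": [1, 1, 7, 1],
    "G": [1, 7, 1, 1],
    "T": [1, 1, 1, 1],
}


def log_odds_pwm(counts, background=0.25, pseudocount=1.0):
    """Convert per-position base counts into a log2-odds weight matrix."""
    width = len(counts["A"])
    pwm = []
    for i in range(width):
        total = sum(counts[b][i] + pseudocount for b in BASES)
        column = {
            b: math.log2(((counts[b][i] + pseudocount) / total) / background)
            for b in BASES
        }
        pwm.append(column)
    return pwm


def best_pwm_score(sequence, pwm):
    """Return the highest window score of the matrix along the sequence."""
    sequence = sequence.upper()
    width = len(pwm)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the matrix")
    return max(
        sum(pwm[j][sequence[i + j]] for j in range(width))
        for i in range(len(sequence) - width + 1)
    )


def logistic_probability(features, weights, intercept):
    """Map a weighted sum of per-factor scores to a value between 0 and 1."""
    z = intercept + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    pwm = log_odds_pwm(TOY_COUNTS)
    # One feature per factor; here the same toy matrix stands in for two factors.
    features = [best_pwm_score("TTAGCATT", pwm), best_pwm_score("GGAGCAGG", pwm)]
    # Hypothetical weights and intercept; a real model would estimate these.
    print(logistic_probability(features, weights=[0.5, 0.5], intercept=-2.0))
```

In this sketch each factor contributes its best window score as one feature, so adding a further factor means adding one matrix and one weight, which is in the spirit of the easy modification the abstract describes.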
Pages: 167-181
Number of pages: 15