Bioinformatic principles underlying the information content of transcription factor binding sites

Cited by: 17
Authors
Kim, JT [1 ]
Martinetz, T [1 ]
Polani, D [1 ]
Affiliations
[1] Inst Neuro & Bioinformat, D-23569 Lubeck, Germany
DOI
10.1006/jtbi.2003.3153
Chinese Library Classification number
Q [Biological sciences]
Discipline classification code
07; 0710; 09
Abstract
Empirically, it has been observed in several cases that the information content of transcription factor binding site sequences (R-sequence) approximately equals the information content of binding site positions (R-frequency). A general framework for formal models of transcription factors and binding sites is developed to address this issue. Measures for information content in transcription factor binding sites are revisited and theoretical analyses are compared on this basis. These analyses do not lead to consistent results. A comparative review reveals that these inconsistent approaches do not include a transcription factor state space. Therefore, a state space for mathematically representing transcription factors with respect to their binding site recognition properties is introduced into the modelling framework. Analysis of the resulting comprehensive model shows that the structure of the genome state space does indeed favour equality of R-sequence and R-frequency, but the relation between the two information quantities also depends on the structure of the transcription factor state space. This might lead to significant deviations between R-sequence and R-frequency. However, further investigation and biological arguments show that the effects of the structure of the transcription factor state space on the relation of R-sequence and R-frequency are strongly limited for systems which are autonomous in the sense that all DNA-binding proteins operating on the genome are encoded in the genome itself. This provides a theoretical explanation for the empirically observed equality. © 2003 Elsevier Science Ltd. All rights reserved.
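The two quantities compared in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definitions associated with Schneider et al. (1986): R-sequence as the sum over aligned site positions of the maximum entropy minus the observed entropy, and R-frequency as -log2 of the fraction of genome positions that are binding sites. The abstract itself does not spell these formulas out, so treat them as assumptions. The function names, the uniform base background, the omission of any small-sample correction, and the toy data are likewise illustrative only and are not taken from the paper.

```python
import math
from collections import Counter


def r_sequence(sites, alphabet="ACGT"):
    """Information content (bits) of aligned binding site sequences.

    Assumes a uniform background over `alphabet` and applies no
    small-sample correction (both are simplifying assumptions).
    """
    if not sites:
        raise ValueError("at least one site is required")
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")

    h_max = math.log2(len(alphabet))  # 2 bits for DNA
    n = len(sites)
    total = 0.0
    for pos in range(length):
        counts = Counter(s[pos] for s in sites)
        # Shannon entropy of the observed base frequencies at this position
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        total += h_max - entropy
    return total


def r_frequency(num_sites, num_positions):
    """Information (bits) needed to pick num_sites out of num_positions."""
    return -math.log2(num_sites / num_positions)


# Toy data (hypothetical): four aligned sites in a 4096-position genome.
toy_sites = ["ACGT", "ACGT", "ACGA", "ACGC"]
print(r_sequence(toy_sites))   # three conserved positions plus one variable position
print(r_frequency(4, 4096))
```

In this toy case the two values differ; the empirical observation discussed in the abstract is that for real binding site collections they tend to be approximately equal.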
Pages: 529-544
Number of pages: 16
Related papers
16 entries in total
[1]   SELECTION OF DNA-BINDING SITES BY REGULATORY PROTEINS - STATISTICAL-MECHANICAL THEORY AND APPLICATION TO OPERATORS AND PROMOTERS [J].
BERG, OG ;
VONHIPPEL, PH .
JOURNAL OF MOLECULAR BIOLOGY, 1987, 193 (04) :723-743
[2]  
Cover T. M., 2005, ELEM INF THEORY, DOI 10.1002/047174882X
[3]  
Frech K, 1997, COMPUT APPL BIOSCI, V13, P89
[4]   INFORMATION THEORY AND STATISTICAL MECHANICS [J].
JAYNES, ET .
PHYSICAL REVIEW, 1957, 106 (04) :620-630
[5]   EVOLUTION OF A REGULATORY GENE FAMILY - HOM/HOX GENES [J].
KAPPEN, C ;
RUDDLE, FH .
CURRENT OPINION IN GENETICS & DEVELOPMENT, 1993, 3 (06) :931-938
[6]   Transcription regulatory regions database (TRRD): its status in 2000 [J].
Kolchanov, NA ;
Podkolodnaya, OA ;
Ananko, EA ;
Ignatieva, EV ;
Stepanenko, IL ;
Kel-Margoulis, OV ;
Kel, AE ;
Merkulova, TI ;
Goryachkovskaya, TN ;
Busygina, TV ;
Kolpakov, FA ;
Podkolodny, NL ;
Naumochkin, AN ;
Korostishevskaya, IM ;
Romashchenko, AG ;
Overton, GC .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :298-301
[7]   TRANSCRIPTION FACTORS - STRUCTURAL FAMILIES AND PRINCIPLES OF DNA RECOGNITION [J].
PABO, CO ;
SAUER, RT .
ANNUAL REVIEW OF BIOCHEMISTRY, 1992, 61 :1053-1095
[8]   Estimating the entropy of DNA sequences [J].
Schmitt, AO ;
Herzel, H .
JOURNAL OF THEORETICAL BIOLOGY, 1997, 188 (03) :369-377
[9]   INFORMATION-CONTENT OF BINDING-SITES ON NUCLEOTIDE-SEQUENCES [J].
SCHNEIDER, TD ;
STORMO, GD ;
GOLD, L ;
EHRENFEUCHT, A .
JOURNAL OF MOLECULAR BIOLOGY, 1986, 188 (03) :415-431
[10]   Evolution of biological information [J].
Schneider, TD .
NUCLEIC ACIDS RESEARCH, 2000, 28 (14) :2794-2799