Probabilistic code for DNA recognition by proteins of the EGR family

Cited by: 87
Authors
Benos, PV
Lapedes, AS
Stormo, GD
Affiliations
[1] Washington Univ, Sch Med, Dept Genet, St Louis, MO 63110 USA
[2] Los Alamos Natl Lab, Div Theoret, Los Alamos, NM 87545 USA
Funding
National Institutes of Health (USA);
Keywords
DNA-protein interactions; recognition code; DNA-binding specificity; zinc-finger proteins;
DOI
10.1016/S0022-2836(02)00917-8
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
A recognition code for protein-DNA interactions would allow for the prediction of binding sites based on protein sequence, and the identification of binding proteins for specific DNA targets. Crystallographic studies of protein-DNA complexes showed that a simple, deterministic recognition code does not exist. Here, we present a probabilistic recognition code (P-code) that assigns energies to all possible base-pair-amino acid interactions for the early growth response factor (EGR) family of zinc-finger transcription factors. The specific energy values are determined by a maximum likelihood method using examples from in vitro randomisation experiments (namely, SELEX and phage display) reported in the literature. The accuracy of the model is tested in several ways, including the ability to predict in vivo binding sites of EGR proteins and other non-EGR zinc-finger proteins, and the correlation between predicted and measured binding affinities of various EGR proteins to several different DNA sites. We also show that this model improves significantly upon the prediction capabilities of previous qualitative and quantitative models. The probabilistic code we develop uses information about the interacting positions between the protein and DNA, but we show that such information is not necessary, although it reduces the number of parameters to be determined. We also employ the assumption that the total binding energy is the sum of the energies of the individual contacts, but we describe how that assumption can be relaxed at the cost of additional parameters. (C) 2002 Elsevier Science Ltd. All rights reserved.
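The additivity assumption stated in the abstract (total binding energy as the sum of individual base-pair-amino acid contact energies) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the contact map `CONTACTS`, the energy values in `TOY_ENERGY`, the function names, and the Boltzmann-style conversion of energies into site probabilities.

```python
import itertools
import math

BASES = "ACGT"

# Hypothetical contact map: (protein position index, DNA position index).
# The real interacting positions are not given in this record.
CONTACTS = [(0, 2), (1, 1), (2, 0)]

# Hypothetical contact energies in arbitrary units (lower = more favourable).
# These numbers are invented for illustration, not fitted values.
TOY_ENERGY = {("R", "G"): -2.0, ("E", "C"): -1.0, ("H", "A"): -1.0}
DEFAULT_ENERGY = 0.0  # assumed energy for any pair missing from the table


def binding_energy(protein, site, energy=TOY_ENERGY, contacts=CONTACTS):
    """Additive model: sum one energy term per contacting residue/base pair."""
    return sum(
        energy.get((protein[p], site[d]), DEFAULT_ENERGY) for p, d in contacts
    )


def site_probabilities(protein, length=3):
    """Turn energies into probabilities over all sites of the given length.

    Assumes P(site) is proportional to exp(-E(site)); this normalisation is
    an illustrative choice, not a statement of the paper's exact likelihood.
    """
    sites = ["".join(s) for s in itertools.product(BASES, repeat=length)]
    weights = {s: math.exp(-binding_energy(protein, s)) for s in sites}
    z = sum(weights.values())
    return {s: w / z for s, w in weights.items()}


if __name__ == "__main__":
    probs = site_probabilities("RER")
    best = max(probs, key=probs.get)
    print(best, binding_energy("RER", best))
```

In a maximum likelihood setting, the entries of the energy table would be the free parameters, adjusted so that experimentally selected sites receive high probability. Relaxing additivity would mean adding terms that depend on more than one contact at a time, which increases the parameter count.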
Pages: 701-727
Page count: 27
Related papers
51 in total
[21]   A zinc finger directory for high-affinity DNA recognition [J].
Jamieson, AC ;
Wang, HM ;
Kim, SH .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1996, 93 (23) :12834-12839
[22]  
Kono H, 1999, PROTEINS, V35, P114, DOI 10.1002/(SICI)1097-0134(19990401)35:1<114::AID-PROT11>3.0.CO;2-T
[24]   DETECTING SUBTLE SEQUENCE SIGNALS - A GIBBS SAMPLING STRATEGY FOR MULTIPLE ALIGNMENT [J].
LAWRENCE, CE ;
ALTSCHUL, SF ;
BOGUSKI, MS ;
LIU, JS ;
NEUWALD, AF ;
WOOTTON, JC .
SCIENCE, 1993, 262 (5131) :208-214
[25]   Amino acid-base interactions: a three-dimensional analysis of protein-DNA interactions at an atomic level [J].
Luscombe, NM ;
Laskowski, RA ;
Thornton, JM .
NUCLEIC ACIDS RESEARCH, 2001, 29 (13) :2860-2874
[26]  
Lutfiyya LL, 1998, GENETICS, V150, P1377
[27]   Non-independence of Mnt repressor-operator interaction determined by a new quantitative multiple fluorescence relative affinity (QuMFRA) assay [J].
Man, TK ;
Stormo, GD .
NUCLEIC ACIDS RESEARCH, 2001, 29 (12) :2471-2478
[28]   Quantitative parameters for amino acid-base interaction: implications for prediction of protein-DNA binding sites [J].
Mandel-Gutfreund, Y ;
Margalit, H .
NUCLEIC ACIDS RESEARCH, 1998, 26 (10) :2306-2312
[29]  
Mandel-Gutfreund Y, 2001, Pac Symp Biocomput, P139
[30]   COMPREHENSIVE ANALYSIS OF HYDROGEN-BONDS IN REGULATORY PROTEIN DNA-COMPLEXES - IN SEARCH OF COMMON PRINCIPLES [J].
MANDELGUTFREUND, Y ;
SCHUELER, O ;
MARGALIT, H .
JOURNAL OF MOLECULAR BIOLOGY, 1995, 253 (02) :370-382