Analysis and prediction of DNA-binding proteins and their binding residues based on composition, sequence and structural information

Cited by: 289

Authors:

Ahmad, S [1]

Gromiha, MM

Sarai, A

Affiliations:

[1] Kyushu Inst Technol, Dept Biochem Sci & Engn, Iizuka, Fukuoka 8208502, Japan

[2] Jamia Millia Islamia, Dept Biosci, New Delhi 110025, India

[3] AIST, Computat Biol Res Ctr, CBRC, Koto Ku, Tokyo 1350064, Japan

Source:

BIOINFORMATICS | 2004 / Vol. 20 / Issue 04

Keywords:

DOI:

10.1093/bioinformatics/btg432

Chinese Library Classification (CLC) number:

Q5 [Biochemistry];

Subject classification code:

071010 ; 081704 ;

Abstract:

Motivation: Though vitally important to cell function, the mechanism of protein-DNA binding is not yet completely understood. We therefore analysed the relationship between DNA binding and protein sequence composition, solvent accessibility and secondary structure. Using non-redundant databases of transcription factors and protein-DNA complexes, neural network models were developed to utilize the information present in this relationship to predict DNA-binding proteins and their binding residues.

Results: Sequence composition was found to provide sufficient information to predict the probability of a protein binding to DNA with nearly 69% sensitivity at 64% accuracy for the considered proteins; sequence neighbourhood and solvent accessibility information were sufficient to make binding site predictions with 40% sensitivity at 79% accuracy. Detailed analysis of binding residues shows that some three- and five-residue segments frequently bind to DNA and that solvent accessibility plays a major role in binding. Although binding behaviour was not associated with any particular secondary structure, there were interesting exceptions at the residue level. Over-representation of some residues in the binding sites was largely lost at the total sequence level, but a different kind of compositional preference was observed in DNA-binding proteins.
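As an illustration of the two kinds of sequence-derived input the abstract mentions (whole-sequence composition and the residue neighbourhood), the following is a minimal sketch only. It is not the authors' implementation: the function names `composition` and `windows`, the padding character `X`, and the default window size are assumptions made for this example, and the neural network models themselves are not reproduced here.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Return the fraction of each standard amino acid in the sequence.

    Non-standard characters are ignored. The result is a 20-element list
    ordered as in AMINO_ACIDS.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def windows(sequence, size=5, pad="X"):
    """Return one window of `size` residues centred on each position.

    The sequence ends are padded with `pad` (a hypothetical placeholder
    symbol) so that every residue gets a full-length window.
    """
    if size < 1 or size % 2 == 0:
        raise ValueError("size must be a positive odd number")
    half = size // 2
    padded = pad * half + sequence.upper() + pad * half
    return [padded[i:i + size] for i in range(len(sequence))]
```

A composition vector of this kind could serve as the input for a protein-level classifier, while per-residue windows could serve as the input for a binding-site classifier; how the original study encoded these inputs is not stated in the abstract.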


Pages: 477-486

Page count: 10

