A Novel Computational Approach To Predict Transcription Factor DNA Binding Preference

Cited by: 40
Authors
Cai, Yudong [1 ]
He, JianFeng [2 ]
Li, XinLei [3 ,4 ]
Lu, Lin [2 ]
Yang, XinYi [1 ]
Feng, KayYan [5 ]
Lu, WenCong [6 ]
Kong, XiangYin [3 ,4 ]
Affiliations
[1] Chinese Acad Sci, Shanghai Inst Biol Sci, CAS MPG Partner Inst Computat Biol, Shanghai 200031, Peoples R China
[2] Shanghai Jiao Tong Univ, Dept Biomed Engn, Shanghai 200040, Peoples R China
[3] Shanghai Jiao Tong Univ, Inst Hlth Sci, Shanghai 200025, Peoples R China
[4] Chinese Acad Sci, Shanghai Inst Biol Sci, Shanghai 200025, Peoples R China
[5] Univ Manchester, Div Imaging Sci & Biomed Engn, Manchester M13 9PT, Lancs, England
[6] Shanghai Univ, Coll Sci, Dept Chem, Lab Chem Data Min, Shanghai 200444, Peoples R China
Keywords
Transcription factor; Transcription factor DNA binding preference; mRMR; Nearest neighbor algorithm; 0/1 System; Jackknife cross-validation test; PROTEIN; NETWORKS
DOI
10.1021/pr800717y
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification codes
071010; 081704;
Abstract
Transcription is one of the most important processes in the cell, in which DNA sequences are transcribed into RNA sequences under the control of transcription factors. Accurate prediction of the DNA binding preference of transcription factors is valuable for understanding the transcription regulatory mechanism (1) and elucidating regulatory networks (2-4). Here we predict the DNA binding preference of transcription factors based on the protein amino acid composition and physicochemical properties, a 0/1 encoding system for nucleotides, the minimum Redundancy Maximum Relevance (mRMR) feature selection method (5), and the Nearest Neighbor Algorithm. The overall prediction accuracy in the Jackknife cross-validation test is 91.1%, indicating that this approach is a useful tool for exploring the relation between a transcription factor and its binding sites. Moreover, we find that the secondary structure and polarizability of the transcription factor contribute most to the prediction. In particular, a 7-nt motif with an AT-rich region in the DNA binding sites, discovered via our method, is consistent with the statistical analysis of the TRANSFAC database (6).
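The pipeline named in the abstract (amino acid composition, 0/1 nucleotide encoding, nearest-neighbor classification, jackknife testing) can be illustrated with a minimal sketch. It is not the authors' implementation: the one-hot layout of the 0/1 encoding, the binary bind / no-bind labels, the squared Euclidean distance, and the toy sequences are all assumptions made for illustration, and the physicochemical features and the mRMR selection step are omitted.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumed 0/1 encoding: one bit per nucleotide type (one-hot).
NUCLEOTIDE_BITS = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}


def amino_acid_composition(protein):
    """Fraction of each of the 20 standard amino acids in the sequence."""
    counts = Counter(protein)
    return [counts[aa] / len(protein) for aa in AMINO_ACIDS]


def encode_site(site):
    """Concatenate the 0/1 code of every nucleotide in a binding site."""
    bits = []
    for nucleotide in site:
        bits.extend(NUCLEOTIDE_BITS[nucleotide])
    return bits


def make_features(protein, site):
    """Feature vector for one (transcription factor, DNA site) pair."""
    return amino_acid_composition(protein) + encode_site(site)


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest_neighbor_label(query, samples):
    """Label of the training sample closest to the query vector."""
    return min(samples, key=lambda s: squared_distance(query, s[0]))[1]


def jackknife_accuracy(samples):
    """Leave each sample out in turn and predict it from the rest."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if nearest_neighbor_label(features, rest) == label:
            correct += 1
    return correct / len(samples)


# Hypothetical toy data: (protein fragment, 7-nt site) -> 1 binds, 0 does not.
samples = [
    (make_features("KKRRKKRA", "AATTAAT"), 1),
    (make_features("KRKKRRKA", "ATTAAAT"), 1),
    (make_features("DEEDDEEL", "GCCGGCG"), 0),
    (make_features("EDDEEDDL", "GGCGCCG"), 0),
]

print(jackknife_accuracy(samples))
```

On a real data set the feature vector would be much longer, and a feature selection step such as mRMR would rank and prune features before the nearest-neighbor search.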
Pages: 999-1003
Page count: 5
Related papers
23 in total
[1]  
ALBERTS A, 2002, MOL BIOL CELL, P299
[2]   Predicting membrane protein type by functional domain composition and pseudo-amino acid composition [J].
Cai, YD ;
Chou, KC .
JOURNAL OF THEORETICAL BIOLOGY, 2006, 238 (02) :395-400
[3]   PREDICTION OF PROTEIN STRUCTURAL CLASSES [J].
CHOU, KC ;
ZHANG, CT .
CRITICAL REVIEWS IN BIOCHEMISTRY AND MOLECULAR BIOLOGY, 1995, 30 (04) :275-349
[4]   A NOVEL-APPROACH TO PREDICTING PROTEIN STRUCTURAL CLASSES IN A (20-1)-D AMINO-ACID-COMPOSITION SPACE [J].
CHOU, KC .
PROTEINS-STRUCTURE FUNCTION AND GENETICS, 1995, 21 (04) :319-344
[5]   Prediction of protein cellular attributes using pseudo-amino acid composition [J].
Chou, KC .
PROTEINS-STRUCTURE FUNCTION AND GENETICS, 2001, 43 (03) :246-255
[6]   A synthetic oscillatory network of transcriptional regulators [J].
Elowitz, MB ;
Leibler, S .
NATURE, 2000, 403 (6767) :335-338
[7]   A statistical analysis of the TRANSFAC database [J].
Fogel, GB ;
Weekes, DG ;
Varga, G ;
Dow, ER ;
Craven, AM ;
Harlow, HB ;
Su, EW ;
Onyia, JE ;
Su, C .
BIOSYSTEMS, 2005, 81 (02) :137-154
[8]   Context-specific independence mixture modeling for positional weight matrices [J].
Georgi, Benjamin ;
Schliep, Alexander .
BIOINFORMATICS, 2006, 22 (14) :E166-E173
[9]   Eukaryotic transcription factor binding sites - modeling and integrative search methods [J].
Hannenhalli, Sridhar .
BIOINFORMATICS, 2008, 24 (11) :1325-1331
[10]  
JIA P, 2006, BMC BIOINFORMATICS, P7