Evolutionary computation for discovery of composite transcription factor binding sites

Cited by: 21
Authors
Fogel, Gary B. [2]
Porto, V. William [2]
Varga, Gabor [1]
Dow, Ernst R. [1]
Craven, Andrew M. [1]
Powers, David M. [1]
Harlow, Harry B. [1]
Su, Eric W. [1]
Onyia, Jude E. [1]
Su, Chen [1]
Affiliations
[1] Eli Lilly & Co, Lilly Res Labs, Indianapolis, IN 46285 USA
[2] Nat Select Inc, San Diego, CA 92121 USA
DOI
10.1093/nar/gkn738
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Previous research demonstrated the use of evolutionary computation for the discovery of transcription factor binding sites (TFBSs) in promoter regions upstream of coexpressed genes. However, it remained unclear whether composite TFBS elements, commonly found in higher organisms where two or more TFBSs form functional complexes, could also be identified with this approach. Here, we present an important refinement of our previous algorithm and test the identification of composite elements using NFAT/AP-1 as an example. We demonstrate that, by using appropriate existing parameters such as window size, novel scoring methods such as central bonusing, and methods of self-adaptation that automatically adjust the variation operators during the evolutionary search, TFBSs of different sizes and complexity can be identified as top solutions. Some of these solutions have known experimental relationships with NFAT/AP-1. We also show that, even after the model parameters are properly tuned, the choice of window size has a significant effect on algorithm performance. We believe that this improved algorithm will greatly augment TFBS discovery.
Pages: 14