Fine-Tuning Enhancer Models to Predict Transcriptional Targets across Multiple Genomes

Cited by: 23
Authors
Aerts, Stein [1, 2]
van Helden, Jacques [3]
Sand, Olivier [3]
Hassan, Bassem A. [1, 2]
Affiliations
[1] Vlaams Inst Biotechnol VIB, Dept Mol & Dev Genet, Neurogenet Lab, Leuven, Belgium
[2] KU Leuven Sch Med, Dept Human Genet, Leuven, Belgium
[3] Univ Libre Bruxelles, Dept Mol Biol, Serv Conformat Macromol Biol & Bioinformat, Brussels, Belgium
Source
PLOS ONE | 2007 / Vol. 2 / Issue 11
DOI
10.1371/journal.pone.0001115
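Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification code
07; 0710; 09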
Abstract
Networks of regulatory relations between transcription factors (TF) and their target genes (TG), implemented through TF binding sites (TFBS), are key features of biology. An idealized approach to solving such networks consists of starting from a consensus TFBS or a position weight matrix (PWM) to generate a high-accuracy list of candidate TGs for biological validation. Developing and evaluating such approaches remains a formidable challenge in regulatory bioinformatics. We perform a benchmark study on 34 Drosophila TFs to assess existing TFBS and cis-regulatory module (CRM) detection methods, with a strong focus on the use of multiple genomes. In particular, for CRM modelling we investigate the addition of orthologous sites to a known PWM to construct phyloPWMs, and we assess the added value of phylogenetic footprinting to predict contextual motifs around known TFBSs. For CRM prediction, we compare motif conservation with network-level conservation approaches across multiple genomes. Choosing the optimal training and scoring strategies strongly enhances the performance of TG prediction for more than half of the tested TFs. Finally, we analyse a 35th TF, namely Eyeless, and find a significant overlap between predicted TGs and candidate TGs identified by microarray expression studies. In summary, we identify several ways to optimize TF-specific TG predictions, some of which can be applied to all TFs, and others that can be applied only to particular TFs. The ability to model known TF-TG relations, together with the use of multiple genomes, results in a significant step forward in solving the architecture of gene regulatory networks.
Pages: 11