MoRAine - A web server for fast computational transcription factor binding motif re-annotation

Cited by: 6
Authors
Baumbach, Jan [1 ,2 ,3 ]
Wittkop, Tobias [1 ,2 ,4 ]
Weile, Jochen [1 ,2 ]
Kohl, Thomas [3 ,5 ]
Rahmann, Sven [1 ,6 ]
Affiliations
[1] Bielefeld Univ, Computat Methods Emerging Technol, Bielefeld, Germany
[2] Bielefeld Univ, Genome Informat, Bielefeld, Germany
[3] Ctr Biotechnol, Int Grad Sch Bioinformat & Genome Res, Bielefeld, Germany
[4] Bielefeld Univ, DFG Graduiertenkolleg Bioinformat, Bielefeld, Germany
[5] Bielefeld Univ, Lehrstuhl Genet, Bielefeld, Germany
[6] TU Dortmund, Bioinformat High Throughput Technol, Dortmund, Germany
Source
JOURNAL OF INTEGRATIVE BIOINFORMATICS | 2008 / Vol. 5 / Issue 02
DOI
10.2390/biecoll-jib-2008-91
Chinese Library Classification
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Background: Precise experimental identification of transcription factor binding motifs (TFBMs), accurate to a single base pair, is time-consuming and difficult. For several databases, TFBM annotations are extracted from the literature and stored 5' -> 3' relative to the target gene. Mixing the two possible orientations of a motif lowers the information content of the position frequency matrices (PFMs) and sequence logos computed from them. Since these PFMs are used to predict further TFBMs, we ask whether the TFBMs underlying a PFM can be re-annotated automatically to improve both the information content of the PFM and subsequent classification performance.
Results: We present MoRAine, an algorithm that re-annotates transcription factor binding motifs. Each experimentally supported motif underlying a PFM is compared against every other such motif. The goal is to re-annotate TFBMs by possibly switching their strands and shifting them by a few positions in order to maximize the information content of the resulting adjusted PFM. We present two heuristic strategies to perform this optimization and show that MoRAine significantly improves the corresponding sequence logos. Furthermore, we justify the method by evaluating specificity, sensitivity, true positive rates, and false positive rates of PFM-based TFBM predictions for E. coli using the original database motifs and the MoRAine-adjusted motifs. Classification performance increases considerably when MoRAine is used as a preprocessing step.
Conclusions: MoRAine is integrated into a publicly available web server and can be used online or downloaded as a stand-alone version from http://moraine.cebitec.uni-bielefeld.de.
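The optimization goal described in the abstract can be illustrated with a minimal sketch. It is not either of the two MoRAine heuristics, whose details the abstract does not give. It assumes equal-length motifs, a uniform background (so each column contributes at most 2 bits), no small-sample correction, and strand switching only, with position shifts omitted. All function names are hypothetical.

```python
import math

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(motif):
    """Return the motif as read on the opposite strand."""
    return "".join(COMPLEMENT[base] for base in reversed(motif))


def position_frequency_matrix(motifs):
    """Count each base per column; assumes all motifs have equal length."""
    length = len(motifs[0])
    pfm = [{base: 0 for base in "ACGT"} for _ in range(length)]
    for motif in motifs:
        for i, base in enumerate(motif):
            pfm[i][base] += 1
    return pfm


def information_content(motifs):
    """Sum over columns of (2 - Shannon entropy) in bits.

    Assumption: uniform background, no small-sample correction.
    """
    n = len(motifs)
    total = 0.0
    for column in position_frequency_matrix(motifs):
        entropy = -sum(
            (count / n) * math.log2(count / n)
            for count in column.values()
            if count > 0
        )
        total += 2.0 - entropy
    return total


def greedy_reorient(motifs):
    """Illustrative greedy strategy (an assumption, not the paper's method):
    flip one motif at a time and keep the flip only if the information
    content of the resulting PFM strictly increases."""
    current = list(motifs)
    improved = True
    while improved:
        improved = False
        for i, motif in enumerate(current):
            candidate = current[:i] + [reverse_complement(motif)] + current[i + 1:]
            if information_content(candidate) > information_content(current) + 1e-12:
                current = candidate
                improved = True
    return current


if __name__ == "__main__":
    # The second motif is the reverse complement of the others,
    # mimicking a mixed-orientation annotation.
    mixed = ["TTGACA", "TGTCAA", "TTGACA", "TTGACA"]
    fixed = greedy_reorient(mixed)
    print(information_content(mixed), information_content(fixed))
```

In this toy input, one motif is stored on the opposite strand; flipping it makes every column unanimous, so the information content rises to the 2-bits-per-column maximum. A fuller version would also try shifting each motif by a few positions, as the abstract describes.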
Pages: 14
Related papers
26 in total
[1]   Evolutionary dynamics of prokaryotic transcriptional regulatory networks [J].
Babu, MM ;
Teichmann, SA ;
Aravind, L .
JOURNAL OF MOLECULAR BIOLOGY, 2006, 358 (02) :614-633
[2]   Structure and evolution of transcriptional regulatory networks [J].
Babu, MM ;
Luscombe, NM ;
Aravind, L ;
Gerstein, M ;
Teichmann, SA .
CURRENT OPINION IN STRUCTURAL BIOLOGY, 2004, 14 (03) :283-291
[3]   Evolution of transcription factors and the gene regulatory network in Escherichia coli [J].
Babu, MM ;
Teichmann, SA .
NUCLEIC ACIDS RESEARCH, 2003, 31 (04) :1234-1244
[4]   CoryneRegNet: An ontology-based data warehouse of corynebacterial transcription factors and regulatory networks [J].
Baumbach, J ;
Brinkrolf, K ;
Czaja, LF ;
Rahmann, S ;
Tauch, A .
BMC GENOMICS, 2006, 7 (1)
[5]   BAUMBACH J, 2006, J INTEGRATIVE BIOINF, V3, P24
[6]   CoryneRegNet 4.0 - A reference database for corynebacterial gene regulatory networks [J].
Baumbach, Jan .
BMC BIOINFORMATICS, 2007, 8
[7]   CoryneRegNet 3.0 - An interactive systems biology platform for the analysis of gene regulatory networks in corynebacteria and Escherichia coli [J].
Baumbach, Jan ;
Wittkop, Tobias ;
Rademacher, Katrin ;
Rahmann, Sven ;
Brinkrolf, Karina ;
Tauch, Andreas .
JOURNAL OF BIOTECHNOLOGY, 2007, 129 (02) :279-289
[8]   Fast index based algorithms and software for matching position specific scoring matrices [J].
Beckstette, Michael ;
Homann, Robert ;
Giegerich, Robert ;
Kurtz, Stefan .
BMC BIOINFORMATICS, 2006, 7 (1)
[9]   P-Match: transcription factor binding site search by combining patterns and weight matrices [J].
Chekmenev, DS ;
Haid, C ;
Kel, AE .
NUCLEIC ACIDS RESEARCH, 2005, 33 :W432-W437
[10]   WebLogo: A sequence logo generator [J].
Crooks, GE ;
Hon, G ;
Chandonia, JM ;
Brenner, SE .
GENOME RESEARCH, 2004, 14 (06) :1188-1190