Analysis of computational approaches for motif discovery

Cited by: 47
Authors
Li, Nan [1]
Tompa, Martin [1]
Affiliations
[1] Univ Washington, Dept Comp Sci & Engn, Seattle, WA 98195 USA
DOI
10.1186/1748-7188-1-8
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification code
071010; 081704;
Abstract
Recently, we performed an assessment of 13 popular computational tools for discovery of transcription factor binding sites (M. Tompa, N. Li, et al., "Assessing Computational Tools for the Discovery of Transcription Factor Binding Sites", Nature Biotechnology, Jan. 2005). This paper contains follow-up analysis of the assessment results, and raises and discusses some important issues concerning the state of the art in motif discovery methods:
1. We categorize the objective functions used by existing tools, and design experiments to evaluate whether any of these objective functions is the right one to optimize.
2. We examine various features of the data sets that were used in the assessment, such as sequence length and motif degeneracy, and identify which features make data sets hard for current motif discovery tools.
3. We identify an important feature that has not yet been used by existing tools and propose a new objective function that incorporates this feature.
Pages: 8