RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets

Cited by: 163
Authors
Thomas-Chollier, Morgane [3]
Herrmann, Carl [1, 2]
Defrance, Matthieu [4]
Sand, Olivier [5]
Thieffry, Denis [1, 2, 6, 7, 8]
van Helden, Jacques [1, 2, 9]
Affiliations
[1] INSERM, U928, TAGC, F-13288 Marseille, France
[2] Univ Mediterranee, F-13288 Marseille, France
[3] Max Planck Inst Mol Genet, Dept Computat Mol Biol, D-14195 Berlin, Germany
[4] Univ Nacl Autonoma Mexico, Ctr Ciencias Genom, Cuernavaca 62210, Morelos, Mexico
[5] CNRS, UMR8199, Inst Biol Lille, F-59000 Lille, France
[6] Ecole Normale Super, UMR ENS, Inst Biol, F-75005 Paris, France
[7] CNRS 8197, F-75005 Paris, France
[8] INSERM 1024, F-75005 Paris, France
[9] Univ Libre Bruxelles, Lab Bioinformat Genomes & Reseaux BiGRe, B-1050 Brussels, Belgium
Keywords
FACTOR-BINDING SITES; COMPUTATIONAL ANALYSIS; SEQUENCES; IDENTIFICATION; DATABASE; DISCOVERY; REGION; GENES; P300;
DOI
10.1093/nar/gkr1104
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
ChIP-seq is increasingly used to characterize transcription factor binding and chromatin marks at a genomic scale. Various tools are now available to extract binding motifs from peak data sets. However, most approaches are only available as command-line programs, or via a website but with size restrictions. We present peak-motifs, a computational pipeline that discovers motifs in peak sequences, compares them with databases, exports putative binding sites for visualization in the UCSC genome browser and generates an extensive report suited for both naive and expert users. It relies on time- and memory-efficient algorithms enabling the treatment of several thousand peaks within minutes. Regarding time efficiency, peak-motifs outperforms all comparable tools by several orders of magnitude. We demonstrate its accuracy by analyzing data sets ranging from 4000 to 128 000 peaks for 12 embryonic stem cell-specific transcription factors. In all cases, the program finds the expected motifs and returns additional motifs potentially bound by cofactors. We further apply peak-motifs to discover tissue-specific motifs in peak collections for the p300 transcriptional co-activator. To our knowledge, peak-motifs is the only tool that performs a complete motif analysis and offers a user-friendly web interface without any restriction on sequence size or number of peaks.
Pages: 9