An Integrated Pipeline for the Genome-Wide Analysis of Transcription Factor Binding Sites from ChIP-Seq

Cited by: 29
Authors
Mercier, Eloi [1]
Droit, Arnaud [1, 2]
Li, Leping [3]
Robertson, Gordon [4]
Zhang, Xuekui [5]
Gottardo, Raphael [6]
Affiliations
[1] Inst Rech Clin Montreal, Computat Biol Unit, Montreal, PQ H2W 1R7, Canada
[2] Univ Laval, Dept Mol Med, Fac Med Endocrinol & Genom, Ctr Rech,CHUQ CRCHUQ, Quebec City, PQ, Canada
[3] NIEHS, Biostat Branch, NIH, Res Triangle Pk, NC 27709 USA
[4] BC Canc Agcy, Genome Sci Ctr, Vancouver, BC, Canada
[5] Univ British Columbia, Dept Stat, Vancouver, BC V6T 1W5, Canada
[6] Fred Hutchinson Canc Res Ctr, Vaccine & Infect Dis Div, Seattle, WA 98104 USA
Source
PLOS ONE | 2011 / Vol. 6 / Issue 02
Funding
National Institutes of Health (USA);
Keywords
LEUKEMIA INHIBITORY FACTOR; BREAST-CANCER; EM ALGORITHM; DNA; MOTIFS; IDENTIFICATION; ASSOCIATION; ACTIVATION; DISCOVERY; ALIGNMENT;
DOI
10.1371/journal.pone.0016432
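Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code
07; 0710; 09;
Abstract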
ChIP-Seq has become the standard method for genome-wide profiling of the DNA association of transcription factors. To simplify the analysis and interpretation of ChIP-Seq data, which typically requires multiple applications, we describe an integrated, open-source, R-based analysis pipeline. The pipeline addresses data input, peak detection, sequence and motif analysis, visualization, and data export, and can readily be extended via other R and Bioconductor packages. On a standard multicore computer, it can handle datasets consisting of tens of thousands of enriched regions. We demonstrate its effectiveness on published human ChIP-Seq datasets for FOXA1, ER, CTCF and STAT1, where it detected co-occurring motifs that were consistent with the literature but were not detected by other methods. Our pipeline provides the first complete set of Bioconductor tools for sequence and motif analysis of ChIP-Seq and ChIP-chip data.
Pages: 11