PeakSeq enables systematic scoring of ChIP-seq experiments relative to controls

Cited by: 379
Authors
Rozowsky, Joel [1]
Euskirchen, Ghia [2]
Auerbach, Raymond K. [3]
Zhang, Zhengdong D. [1]
Gibson, Theodore [1]
Bjornson, Robert [4]
Carriero, Nicholas [4]
Snyder, Michael [1, 2]
Gerstein, Mark B. [1, 3, 4]
Affiliations
[1] Yale Univ, Dept Mol Biophys & Biochem, New Haven, CT 06520 USA
[2] Yale Univ, Dept Mol Cellular & Dev Biol, New Haven, CT 06520 USA
[3] Yale Univ, Program Computat Biol & Bioinformat, New Haven, CT 06520 USA
[4] Yale Univ, Dept Comp Sci, New Haven, CT 06520 USA
Funding
U.S. National Institutes of Health (NIH)
Keywords
FACTOR-BINDING-SITES; FALSE DISCOVERY RATE; HUMAN GENOME; CHROMATIN IMMUNOPRECIPITATION; STRUCTURAL VARIATION; ELEMENTS;
DOI
10.1038/nbt.1518
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Chromatin immunoprecipitation (ChIP) followed by tag sequencing (ChIP-seq) using high-throughput next-generation instrumentation is fast replacing chromatin immunoprecipitation followed by genome tiling array analysis (ChIP-chip) as the preferred approach for mapping sites of transcription-factor binding and chromatin modification. Using two deeply sequenced data sets for human RNA polymerase II and STAT1, each with matching input-DNA controls, we describe a general scoring approach to address unique challenges in ChIP-seq data analysis. Our approach is based on the observation that sites of potential binding are strongly correlated with signal peaks in the control, likely revealing features of open chromatin. We develop a two-pass strategy called PeakSeq to compensate for this signal caused by open chromatin, as revealed by inclusion of the controls. The first pass identifies putative binding sites and compensates for genomic variation in the 'mappability' of sequences. The second pass filters out sites not significantly enriched compared to the normalized control, computing precise enrichments and significances. Our scoring procedure enables us to optimize experimental design by estimating the depth of sequencing required for a desired level of coverage and demonstrating that more than two replicates provide only a marginal gain in information.
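The second-pass filtering described in the abstract can be illustrated with a minimal sketch. This is not the PeakSeq implementation: the function names, the `scale` normalization factor, the example tag counts, and the choice of a one-sided binomial test with p = 0.5 are all assumptions made for illustration. The Benjamini-Hochberg step is suggested only by the record's "FALSE DISCOVERY RATE" keyword.

```python
from math import comb


def binomial_upper_tail(k, n, p=0.5):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


def second_pass(regions, scale, fdr=0.05):
    """Keep candidate regions enriched over the scaled control.

    regions: list of (name, chip_tags, control_tags) tuples (hypothetical format).
    scale:   assumed normalization factor applied to control tag counts.
    """
    p_values = []
    for _, chip, control in regions:
        scaled = round(control * scale)
        # Assumption: under the null, a tag is equally likely to come
        # from the ChIP sample or from the scaled control.
        p_values.append(binomial_upper_tail(chip, chip + scaled))
    q_values = benjamini_hochberg(p_values)
    return [(name, q) for (name, _, _), q in zip(regions, q_values) if q <= fdr]


# Made-up tag counts for three candidate regions from a first pass.
candidates = [("region_a", 40, 5), ("region_b", 12, 10), ("region_c", 30, 28)]
kept = second_pass(candidates, scale=1.0)
print(kept)
```

In this toy input only `region_a` passes the filter, because the other two regions have ChIP counts close to their control counts. A real analysis would derive the scaling factor from the data and account for mappability in the first pass, neither of which is shown here.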
Pages: 66-75
Number of pages: 10