Wellington: a novel method for the accurate identification of digital genomic footprints from DNase-seq data

Times cited: 146
Authors
Piper, Jason [1,2]
Elze, Markus C. [1,3]
Cauchy, Pierre [2]
Cockerill, Peter N. [4]
Bonifer, Constanze [2]
Ott, Sascha [1]
Affiliations
[1] Univ Warwick, Warwick Syst Biol Ctr, Coventry CV4 7AL, W Midlands, England
[2] Univ Birmingham, Coll Med & Dent Sci, Inst Biomed Res, Sch Canc Sci, Birmingham B15 2TT, W Midlands, England
[3] Univ Warwick, Dept Stat, Coventry CV4 7AL, W Midlands, England
[4] Univ Birmingham, Coll Med & Dent Sci, Inst Biomed Res, Sch Immun & Infect, Birmingham B15 2TT, W Midlands, England
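Funding
UK Engineering and Physical Sciences Research Council (EPSRC); UK Biotechnology and Biological Sciences Research Council (BBSRC)
Keywords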
TRANSCRIPTION FACTORS; IN-VIVO; SYSTEMATIC DISCOVERY; REGULATORY ELEMENTS; BINDING; SITES; RESOLUTION; CHROMATIN; ENHANCER; CTCF;
DOI
10.1093/nar/gkt850
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
The expression of eukaryotic genes is regulated by cis-regulatory elements such as promoters and enhancers, which bind sequence-specific DNA-binding proteins. One of the great challenges in the gene regulation field is to characterise these elements. This involves the identification of transcription factor (TF) binding sites within regulatory elements that are occupied in a defined regulatory context. Digestion with DNase and the subsequent analysis of regions protected from cleavage (DNase footprinting) has for many years been used to identify specific binding sites occupied by TFs at individual cis-elements with high resolution. This methodology has recently been adapted for high-throughput sequencing (DNase-seq). In this study, we describe an imbalance in the DNA strand-specific alignment information of DNase-seq data surrounding protein-DNA interactions that allows accurate prediction of occupied TF binding sites. Our study introduces a novel algorithm, Wellington, which considers the imbalance in this strand-specific information to efficiently identify DNA footprints. This algorithm significantly enhances specificity by reducing the proportion of false positives and requires significantly fewer predictions than previously reported methods to recapitulate an equal amount of ChIP-seq data. We also provide an open-source software package, pyDNase, which implements the Wellington algorithm to interface with DNase-seq data and expedite analyses.
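The strand imbalance described in the abstract can be illustrated with a small scoring function. This is a minimal sketch under stated assumptions, not the published Wellington scoring and not the pyDNase API: the abstract does not specify the statistical test, so the sketch assumes a one-sided binomial test per strand, with forward-strand cuts expected to pile up in the upstream shoulder and reverse-strand cuts in the downstream shoulder of a protected site. The names `binom_sf` and `strand_imbalance_score`, the per-base cut-count lists, and the `shoulder` width parameter are all hypothetical.

```python
from math import comb


def binom_sf(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    if k <= 0:
        return 1.0
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def strand_imbalance_score(fwd_cuts, rev_cuts, start, end, shoulder):
    """Score a candidate footprint [start, end) from per-base cut counts.

    Assumption (not taken from the abstract): under the null, cuts fall
    uniformly over shoulder + footprint, so the chance that a cut lands in
    the shoulder is shoulder / (shoulder + footprint width). A small
    returned value means both strands show more shoulder cuts than that
    null predicts.
    """
    up_lo, down_hi = start - shoulder, end + shoulder
    if up_lo < 0 or down_hi > len(fwd_cuts) or len(fwd_cuts) != len(rev_cuts):
        raise ValueError("window does not fit inside the cut-count arrays")

    width = end - start
    p_null = shoulder / (shoulder + width)

    # Forward strand: upstream shoulder versus the footprint itself.
    fwd_up = sum(fwd_cuts[up_lo:start])
    fwd_fp = sum(fwd_cuts[start:end])
    # Reverse strand: downstream shoulder versus the footprint itself.
    rev_down = sum(rev_cuts[end:down_hi])
    rev_fp = sum(rev_cuts[start:end])

    p_fwd = binom_sf(fwd_up, fwd_up + fwd_fp, p_null)
    p_rev = binom_sf(rev_down, rev_down + rev_fp, p_null)
    return p_fwd * p_rev  # product of the two one-sided p-values


# Toy data: a 10 bp protected site at positions 15..24 with 10 bp shoulders.
fwd = [0] * 5 + [5] * 10 + [0] * 25
rev = [0] * 25 + [5] * 10 + [0] * 5
protected = strand_imbalance_score(fwd, rev, start=15, end=25, shoulder=10)

# Uniform cutting on both strands: no footprint expected.
flat = strand_imbalance_score([1] * 40, [1] * 40, start=15, end=25, shoulder=10)
```

In this toy setup the protected site yields a far smaller score than the uniformly cut region, which is the kind of contrast a strand-aware footprint caller would exploit. Real DNase-seq analysis would also need to handle cleavage bias, variable footprint widths, and multiple-testing correction, none of which this sketch attempts.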
Pages: 12