False positive peaks in ChIP-seq and other sequencing-based functional assays caused by unannotated high copy number regions

被引:55
作者
Pickrell, Joseph K. [1 ]
Gaffney, Daniel J. [1 ,2 ]
Gilad, Yoav [1 ]
Pritchard, Jonathan K. [1 ,2 ]
机构
[1] Univ Chicago, Dept Human Genet, Chicago, IL 60637 USA
[2] Univ Chicago, Howard Hughes Med Inst, Chicago, IL 60637 USA
基金
美国国家卫生研究院;
关键词
PROTEIN-DNA INTERACTIONS; HUMAN GENOME; IN-VIVO; CHROMATIN;
D O I
10.1093/bioinformatics/btr354
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Motivation: Sequencing-based assays such as ChIP-seq, DNase-seq and MNase-seq have become important tools for genome annotation. In these assays, short sequence reads enriched for loci of interest are mapped to a reference genome to determine their origin. Here, we consider whether false positive peak calls can be caused by particular type of error in the reference genome: multicopy sequences which have been incorrectly assembled and collapsed into a single copy. Results: Using sequencing data from the 1000 Genomes Project, we systematically scanned the human genome for regions of high sequencing depth. These regions are highly enriched for erroneously inferred transcription factor binding sites, positions of nucleosomes and regions of open chromatin. We suggest a simple masking procedure to remove these regions and reduce false positive calls.
引用
收藏
页码:2144 / 2146
页数:3
相关论文
共 15 条
[1]   A map of human genome variation from population-scale sequencing [J].
Altshuler, David ;
Durbin, Richard M. ;
Abecasis, Goncalo R. ;
Bentley, David R. ;
Chakravarti, Aravinda ;
Clark, Andrew G. ;
Collins, Francis S. ;
De la Vega, Francisco M. ;
Donnelly, Peter ;
Egholm, Michael ;
Flicek, Paul ;
Gabriel, Stacey B. ;
Gibbs, Richard A. ;
Knoppers, Bartha M. ;
Lander, Eric S. ;
Lehrach, Hans ;
Mardis, Elaine R. ;
McVean, Gil A. ;
Nickerson, DebbieA. ;
Peltonen, Leena ;
Schafer, Alan J. ;
Sherry, Stephen T. ;
Wang, Jun ;
Wilson, Richard K. ;
Gibbs, Richard A. ;
Deiros, David ;
Metzker, Mike ;
Muzny, Donna ;
Reid, Jeff ;
Wheeler, David ;
Wang, Jun ;
Li, Jingxiang ;
Jian, Min ;
Li, Guoqing ;
Li, Ruiqiang ;
Liang, Huiqing ;
Tian, Geng ;
Wang, Bo ;
Wang, Jian ;
Wang, Wei ;
Yang, Huanming ;
Zhang, Xiuqing ;
Zheng, Huisong ;
Lander, Eric S. ;
Altshuler, David L. ;
Ambrogio, Lauren ;
Bloom, Toby ;
Cibulskis, Kristian ;
Fennell, Tim J. ;
Gabriel, Stacey B. .
NATURE, 2010, 467 (7319) :1061-1073
[2]   Recent segmental duplications in the human genome [J].
Bailey, JA ;
Gu, ZP ;
Clark, RA ;
Reinert, K ;
Samonte, RV ;
Schwartz, S ;
Adams, MD ;
Myers, EW ;
Li, PW ;
Eichler, EE .
SCIENCE, 2002, 297 (5583) :1003-1007
[3]   Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project [J].
Birney, Ewan ;
Stamatoyannopoulos, John A. ;
Dutta, Anindya ;
Guigo, Roderic ;
Gingeras, Thomas R. ;
Margulies, Elliott H. ;
Weng, Zhiping ;
Snyder, Michael ;
Dermitzakis, Emmanouil T. ;
Stamatoyannopoulos, John A. ;
Thurman, Robert E. ;
Kuehn, Michael S. ;
Taylor, Christopher M. ;
Neph, Shane ;
Koch, Christoph M. ;
Asthana, Saurabh ;
Malhotra, Ankit ;
Adzhubei, Ivan ;
Greenbaum, Jason A. ;
Andrews, Robert M. ;
Flicek, Paul ;
Boyle, Patrick J. ;
Cao, Hua ;
Carter, Nigel P. ;
Clelland, Gayle K. ;
Davis, Sean ;
Day, Nathan ;
Dhami, Pawandeep ;
Dillon, Shane C. ;
Dorschner, Michael O. ;
Fiegler, Heike ;
Giresi, Paul G. ;
Goldy, Jeff ;
Hawrylycz, Michael ;
Haydock, Andrew ;
Humbert, Richard ;
James, Keith D. ;
Johnson, Brett E. ;
Johnson, Ericka M. ;
Frum, Tristan T. ;
Rosenzweig, Elizabeth R. ;
Karnani, Neerja ;
Lee, Kirsten ;
Lefebvre, Gregory C. ;
Navas, Patrick A. ;
Neri, Fidencio ;
Parker, Stephen C. J. ;
Sabo, Peter J. ;
Sandstrom, Richard ;
Shafer, Anthony .
NATURE, 2007, 447 (7146) :799-816
[4]   High-resolution mapping and characterization of open chromatin across the genome [J].
Boyle, Alan P. ;
Davis, Sean ;
Shulha, Hennady P. ;
Meltzer, Paul ;
Margulies, Elliott H. ;
Weng, Zhiping ;
Furey, Terrence S. ;
Crawford, Gregory E. .
CELL, 2008, 132 (02) :311-322
[5]   Origins and functional impact of copy number variation in the human genome [J].
Conrad, Donald F. ;
Pinto, Dalila ;
Redon, Richard ;
Feuk, Lars ;
Gokcumen, Omer ;
Zhang, Yujun ;
Aerts, Jan ;
Andrews, T. Daniel ;
Barnes, Chris ;
Campbell, Peter ;
Fitzgerald, Tomas ;
Hu, Min ;
Ihm, Chun Hwa ;
Kristiansson, Kati ;
MacArthur, Daniel G. ;
MacDonald, Jeffrey R. ;
Onyiah, Ifejinelo ;
Pang, Andy Wing Chun ;
Robson, Sam ;
Stirrups, Kathy ;
Valsesia, Armand ;
Walter, Klaudia ;
Wei, John ;
Tyler-Smith, Chris ;
Carter, Nigel P. ;
Lee, Charles ;
Scherer, Stephen W. ;
Hurles, Matthew E. .
NATURE, 2010, 464 (7289) :704-712
[6]  
Hesselberth JR, 2009, NAT METHODS, V6, P283, DOI [10.1038/NMETH.1313, 10.1038/nmeth.1313]
[7]   Genome-wide mapping of in vivo protein-DNA interactions [J].
Johnson, David S. ;
Mortazavi, Ali ;
Myers, Richard M. ;
Wold, Barbara .
SCIENCE, 2007, 316 (5830) :1497-1502
[8]   Comprehensive analysis of the chromatin landscape in Drosophila melanogaster [J].
Kharchenko, Peter V. ;
Alekseyenko, Artyom A. ;
Schwartz, Yuri B. ;
Minoda, Aki ;
Riddle, Nicole C. ;
Ernst, Jason ;
Sabo, Peter J. ;
Larschan, Erica ;
Gorchakov, Andrey A. ;
Gu, Tingting ;
Linder-Basso, Daniela ;
Plachetka, Annette ;
Shanower, Gregory ;
Tolstorukov, Michael Y. ;
Luquette, Lovelace J. ;
Xi, Ruibin ;
Jung, Youngsook L. ;
Park, Richard W. ;
Bishop, Eric P. ;
Canfield, Theresa K. ;
Sandstrom, Richard ;
Thurman, Robert E. ;
MacAlpine, David M. ;
Stamatoyannopoulos, John A. ;
Kellis, Manolis ;
Elgin, Sarah C. R. ;
Kuroda, Mitzi I. ;
Pirrotta, Vincenzo ;
Karpen, Gary H. ;
Park, Peter J. .
NATURE, 2011, 471 (7339) :480-+
[9]   The uniqueome: a mappability resource for short-tag sequencing [J].
Koehler, Ryan ;
Issac, Hadar ;
Cloonan, Nicole ;
Grimmond, Sean M. .
BIOINFORMATICS, 2011, 27 (02) :272-274
[10]   Accurate inference of transcription factor binding from DNA sequence and chromatin accessibility data [J].
Pique-Regi, Roger ;
Degner, Jacob F. ;
Pai, Athma A. ;
Gaffney, Daniel J. ;
Gilad, Yoav ;
Pritchard, Jonathan K. .
GENOME RESEARCH, 2011, 21 (03) :447-455