An effective approach for identification of in vivo protein-DNA binding sites from paired-end ChIP-Seq data

Cited by: 15
Authors
Wang, Congmao [1]
Xu, Jie [1]
Zhang, Dasheng [1]
Wilson, Zoe A. [2]
Zhang, Dabing [1,3]
Affiliations
[1] Shanghai Jiao Tong Univ, Sch Life Sci & Biotechnol, Shanghai 200240, Peoples R China
[2] Univ Nottingham, Sch Biosci, Loughborough LE12 5RD, Leics, England
[3] Shanghai Jiao Tong Univ, Minist Educ, Bio X Res Ctr, Key Lab Genet & Dev & Neuropsychiat Dis, Shanghai 200240, Peoples R China
Source
BMC BIOINFORMATICS | 2010 / Vol. 11
Funding
National Natural Science Foundation of China;
Keywords
GENOME-WIDE IDENTIFICATION; CHROMATIN IMMUNOPRECIPITATION; TRANSCRIPTION FACTOR;
DOI
10.1186/1471-2105-11-81
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Background: ChIP-Seq, which combines chromatin immunoprecipitation (ChIP) with massively parallel high-throughput sequencing, is increasingly used to identify protein-DNA interactions in vivo across the genome. However, maximizing the effectiveness of analyzing such sequence data requires new algorithms that can accurately predict protein-DNA binding sites. Results: Here, we present SIPeS (Site Identification from Paired-end Sequencing), a novel algorithm for precise identification of binding sites from short reads generated by paired-end Solexa ChIP-Seq technology. In this paper we used ChIP-Seq data from the Arabidopsis basic helix-loop-helix transcription factor ABORTED MICROSPORES (AMS), which is expressed within the anther during pollen development. The results show that SIPeS has better resolution for binding site identification than two existing ChIP-Seq peak detection algorithms, Cisgenome and MACS. Conclusions: Compared with Cisgenome and MACS, SIPeS shows better resolution for binding site discovery. Moreover, SIPeS is designed to calculate the mappable genome length accurately using the fragment length derived from the paired-end reads. Dynamic baselines are also employed to discriminate closely adjacent binding sites, which is of particular value when working with high-density genomes.
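The general idea in the abstract (piling up whole paired-end fragments, then raising a baseline to separate closely adjacent sites) can be illustrated with a toy sketch. This is not the SIPeS implementation. The function names `fragment_coverage`, `enriched_regions`, and `call_sites`, the half-open `(start, end)` fragment format, and the fixed `step` are all assumptions made for illustration.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open (start, end); hypothetical format


def fragment_coverage(fragments: List[Interval], length: int) -> List[int]:
    """Per-base coverage from full fragments (both mates known, so the
    fragment span is observed directly rather than estimated)."""
    diff = [0] * (length + 1)
    for start, end in fragments:
        s, e = max(start, 0), min(end, length)
        if s < e:
            diff[s] += 1
            diff[e] -= 1
    cov, running = [], 0
    for d in diff[:length]:
        running += d
        cov.append(running)
    return cov


def enriched_regions(cov: List[int], baseline: int) -> List[Interval]:
    """Maximal half-open intervals where coverage >= baseline."""
    regions, start = [], None
    for i, c in enumerate(cov):
        if c >= baseline and start is None:
            start = i
        elif c < baseline and start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, len(cov)))
    return regions


def call_sites(cov: List[int], baseline: int, step: int = 1) -> List[int]:
    """Toy 'dynamic baseline': keep raising the baseline inside each region.
    A region that splits yields several candidate sites; a region with
    nothing above the raised baseline reports its first summit position."""
    sites = []
    stack = [(s, e, baseline) for s, e in enriched_regions(cov, baseline)]
    while stack:
        s, e, b = stack.pop()
        window = cov[s:e]
        sub = [(s + a, s + z) for a, z in enriched_regions(window, b + step)]
        if sub:
            stack.extend((a, z, b + step) for a, z in sub)
        else:
            sites.append(s + window.index(max(window)))
    return sorted(sites)


# Made-up fragments: two clusters joined by one bridging fragment.
fragments = [(2, 10), (4, 12), (5, 11), (9, 16), (14, 22), (16, 24), (17, 23)]
cov = fragment_coverage(fragments, 30)
sites = call_sites(cov, baseline=1)
```

With these made-up fragments, a fixed baseline of 1 sees a single enriched region spanning positions 2 to 23, while raising the baseline splits it and reports two candidate sites. A real caller would also need a statistical threshold and noise handling, which this sketch omits.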
Pages: 8