A Gibbs sampling strategy applied to the mapping of ambiguous short-sequence tags

Cited by: 32
Authors
Wang, Jianrong [1]
Huda, Ahsan [1]
Lunyak, Victoria V. [2]
Jordan, I. King [1]
Affiliations
[1] Georgia Inst Technol, Sch Biol, Atlanta, GA 30332 USA
[2] Buck Inst Age Res, Novato, CA 94945 USA
Funding
National Institutes of Health (NIH), USA;
Keywords
HUMAN GENOME; RESOLUTION; ALIGNMENT; BROWSER; UCSC;
DOI
10.1093/bioinformatics/btq460
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is widely used in biological research. ChIP-seq experiments yield many ambiguous tags that can be mapped with equal probability to multiple genomic sites. Such ambiguous tags are typically eliminated from consideration, resulting in a potential loss of important biological information.
Results: We have developed a Gibbs sampling-based algorithm for the genomic mapping of ambiguous sequence tags. Our algorithm relies on the local genomic tag context to guide the mapping of ambiguous tags. The Gibbs sampling procedure we use simultaneously maps ambiguous tags and updates the probabilities used to infer correct tag map positions. We show that our algorithm is able to correctly map more ambiguous tags than existing mapping methods. Our approach is also able to uncover mapped genomic sites from highly repetitive sequences that cannot be detected based on unique tags alone, including transposable elements, segmental duplications and peri-centromeric regions. This mapping approach should prove useful for increasing biological knowledge of the too often neglected repetitive genomic regions.
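The idea summarized in the abstract (use the local tag context to weight candidate sites, and re-sample each ambiguous tag's position while the others are held fixed) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not give the likelihood model, so the weighting used here (number of tags within a fixed window plus a pseudocount) is an assumption, and the function name `gibbs_map` and the parameters `window`, `pseudocount`, `n_iter` and `burn_in` are hypothetical.

```python
import random
from collections import Counter


def gibbs_map(unique_positions, candidates, window=100, pseudocount=0.5,
              n_iter=200, burn_in=50, seed=0):
    """Toy Gibbs sampler for ambiguous tags (illustrative assumption only).

    unique_positions: genomic coordinates of uniquely mapped tags.
    candidates: one list of candidate coordinates per ambiguous tag.
    Returns the most frequently sampled coordinate for each ambiguous tag.
    """
    rng = random.Random(seed)
    # Start every ambiguous tag at a randomly chosen candidate site.
    assign = [rng.choice(sites) for sites in candidates]
    tallies = [Counter() for _ in candidates]

    def local_count(site, skip):
        # Tags near `site`: unique tags plus the current positions of the
        # other ambiguous tags (the tag being re-sampled is left out).
        n = sum(1 for p in unique_positions if abs(p - site) <= window)
        n += sum(1 for j, p in enumerate(assign)
                 if j != skip and abs(p - site) <= window)
        return n

    for it in range(n_iter):
        for i, sites in enumerate(candidates):
            # Weight each candidate by its local tag context, then sample.
            weights = [local_count(s, i) + pseudocount for s in sites]
            assign[i] = rng.choices(sites, weights=weights, k=1)[0]
            if it >= burn_in:
                tallies[i][assign[i]] += 1

    return [t.most_common(1)[0][0] for t in tallies]
```

Because each ambiguous tag's current position feeds into the local counts of the others, mapping and probability updates happen together, which is the property the abstract highlights. A real implementation would need a properly normalized model and an efficient neighbourhood lookup rather than the linear scans used here.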
Pages: 2501-2508
Page count: 8
Related papers
14 items in total
[1]   High-resolution profiling of histone methylations in the human genome [J].
Barski, Artem ;
Cuddapah, Suresh ;
Cui, Kairong ;
Roh, Tae-Young ;
Schones, Dustin E. ;
Wang, Zhibin ;
Wei, Gang ;
Chepelev, Iouri ;
Zhao, Keji .
CELL, 2007, 129 (04) :823-837
[2]   Computational epigenetics [J].
Bock, Christoph ;
Lengauer, Thomas .
BIOINFORMATICS, 2008, 24 (01) :1-10
[3]   A rescue strategy for multimapping short sequence tags refines surveys of transcriptional activity by CAGE [J].
Faulkner, Geoffrey J. ;
Forrest, Alistair R. R. ;
Chalk, Alistair M. ;
Schroder, Kate ;
Hayashizaki, Yoshihide ;
Carninci, Piero ;
Hume, David A. ;
Grimmond, Sean M. .
GENOMICS, 2008, 91 (03) :281-288
[4]   Opinion - Transposable elements and the evolution of regulatory networks [J].
Feschotte, Cedric .
NATURE REVIEWS GENETICS, 2008, 9 (05) :397-405
[5]   Probabilistic resolution of multi-mapping reads in massively parallel sequencing data using MuMRescueLite [J].
Hashimoto, Takehiro ;
de Hoon, Michiel J. L. ;
Grimmond, Sean M. ;
Daub, Carsten O. ;
Hayashizaki, Yoshihide ;
Faulkner, Geoffrey J. .
BIOINFORMATICS, 2009, 25 (19) :2613-2614
[6]   The UCSC Table Browser data retrieval tool [J].
Karolchik, D ;
Hinrichs, AS ;
Furey, TS ;
Roskin, KM ;
Sugnet, CW ;
Haussler, D ;
Kent, WJ .
NUCLEIC ACIDS RESEARCH, 2004, 32 :D493-D496
[7]   The human genome browser at UCSC [J].
Kent, WJ ;
Sugnet, CW ;
Furey, TS ;
Roskin, KM ;
Pringle, TH ;
Zahler, AM ;
Haussler, D .
GENOME RESEARCH, 2002, 12 (06) :996-1006
[8]   Ultrafast and memory-efficient alignment of short DNA sequences to the human genome [J].
Langmead, Ben ;
Trapnell, Cole ;
Pop, Mihai ;
Salzberg, Steven L. .
GENOME BIOLOGY, 2009, 10 (03)
[9]   DETECTING SUBTLE SEQUENCE SIGNALS - A GIBBS SAMPLING STRATEGY FOR MULTIPLE ALIGNMENT [J].
LAWRENCE, CE ;
ALTSCHUL, SF ;
BOGUSKI, MS ;
LIU, JS ;
NEUWALD, AF ;
WOOTTON, JC .
SCIENCE, 1993, 262 (5131) :208-214
[10]   Mapping short DNA sequencing reads and calling variants using mapping quality scores [J].
Li, Heng ;
Ruan, Jue ;
Durbin, Richard .
GENOME RESEARCH, 2008, 18 (11) :1851-1858