Fast and sensitive mapping of bisulfite-treated sequencing data

Cited by: 38
Authors
Otto, Christian [1, 2]
Stadler, Peter F. [1, 2, 3, 4, 5, 6]
Hoffmann, Steve [1, 2]
Affiliations
[1] Univ Leipzig, Dept Comp Sci, Interdisciplinary Ctr Bioinformat & Bioinformat G, D-04107 Leipzig, Germany
[2] Univ Leipzig, Transcriptome Bioinformat Grp, LIFE Leipzig Res Ctr Civilizat Dis, D-04107 Leipzig, Germany
[3] Fraunhofer Inst Cell Therapy & Immunol, RNom Grp, D-04103 Leipzig, Germany
[4] Santa Fe Inst, Santa Fe, NM 87501 USA
[5] Univ Vienna, Dept Theoret Chem, A-1090 Vienna, Austria
[6] Max Planck Inst Math Sci, D-04103 Leipzig, Germany
Keywords
DNA METHYLATION; GENOME; PATTERNS; CHORDATE; REVEALS; WIDE;
DOI
10.1093/bioinformatics/bts254
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Cytosine DNA methylation is one of the major epigenetic modifications and influences gene expression, developmental processes, X-chromosome inactivation, and genomic imprinting. Aberrant methylation is also known to be associated with several diseases, including cancer. The gold standard for determining DNA methylation on a genome-wide scale is bisulfite sequencing: DNA fragments are treated with sodium bisulfite, which converts unmethylated cytosines into uracils while leaving methylated cytosines unchanged. The resulting sequencing reads thus exhibit asymmetric bisulfite-related mismatches and suffer from an effective reduction of the alphabet size in unmethylated regions, which makes mapping bisulfite sequencing reads computationally much more demanding. As a consequence, currently available read mapping software often fails to achieve high sensitivity and in many cases requires unrealistic computational resources to cope with large real-life datasets.
Results: In this study, we present a seed-based approach that uses enhanced suffix arrays in conjunction with Myers' bit-vector algorithm to efficiently extend seeds to optimal semi-global alignments while allowing for bisulfite-related substitutions. It outperforms most current approaches in terms of sensitivity and is competitive in running time when mapping hundreds of millions of sequencing reads to vertebrate genomes.
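The asymmetric mismatch model described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it replaces the enhanced suffix array seeding and Myers' bit-vector extension with a plain dynamic-programming semi-global alignment, and it only considers reads from the forward strand. The function names (`bisulfite_convert`, `bs_match`, `semi_global_distance`) and the toy sequences are assumptions made for illustration.

```python
def bisulfite_convert(seq, methylated=frozenset()):
    """Simulate bisulfite treatment as seen in a forward-strand read:
    unmethylated C is read as T, methylated C stays C.
    `methylated` holds 0-based positions of methylated cytosines (hypothetical input)."""
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def bs_match(ref_base, read_base):
    """Asymmetric comparison: a read T may stem from a reference C,
    but a read C opposite a reference T is still a mismatch."""
    return ref_base == read_base or (ref_base == "C" and read_base == "T")


def semi_global_distance(read, ref, match=bs_match):
    """Smallest edit distance between `read` and any substring of `ref`.
    Returns (distance, end), where `end` is the exclusive end position in `ref`
    of the first best-scoring alignment. Plain O(len(read) * len(ref)) DP."""
    prev = list(range(len(read) + 1))  # column for the empty reference prefix
    best = (prev[-1], 0)
    for j, ref_base in enumerate(ref, 1):
        cur = [0]  # the alignment may start anywhere in the reference at no cost
        for i, read_base in enumerate(read, 1):
            cost = 0 if match(ref_base, read_base) else 1
            cur.append(min(prev[i - 1] + cost,  # match or substitution
                           prev[i] + 1,         # reference base skipped by the read
                           cur[i - 1] + 1))     # extra base in the read
        if cur[-1] < best[0]:
            best = (cur[-1], j)
        prev = cur
    return best


reference = "GGACGTCCAT"
converted = bisulfite_convert(reference, methylated={3})  # only the C at position 3 is methylated
read = converted[2:8]

print(converted)
print(semi_global_distance(read, reference))                                   # bisulfite-aware
print(semi_global_distance(read, reference, match=lambda a, b: a == b))        # ordinary matching
```

With the bisulfite-aware comparison the toy read aligns back to its origin without errors, whereas ordinary matching charges a mismatch for every converted cytosine, which is the loss of sensitivity the abstract refers to.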
Pages: 1698-1704
Number of pages: 7