Qualimap: evaluating next-generation sequencing alignment data

Cited by: 780
Authors
Garcia-Alcalde, Fernando [1, 2]
Okonechnikov, Konstantin [2]
Carbonell, Jose [1]
Cruz, Luis M. [1]
Goetz, Stefan [1]
Tarazona, Sonia [1]
Dopazo, Joaquin [1]
Meyer, Thomas F. [2]
Conesa, Ana [1]
Affiliations
[1] Ctr Invest Principe Felipe, Bioinformat & Genom Dept, Valencia 46012, Spain
[2] Max Planck Inst Infect Biol, Dept Mol Biol, D-10117 Berlin, Germany
Keywords
RNA-SEQ; READS
DOI
10.1093/bioinformatics/bts503
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
070307 [Chemical Biology]
Abstract
Motivation: The sequence alignment/map (SAM) and the binary alignment/map (BAM) formats have become the standard method of representation of nucleotide sequence alignments for next-generation sequencing data. SAM/BAM files usually contain information from tens to hundreds of millions of reads. Often, the sequencing technology, protocol and/or the selected mapping algorithm introduce some unwanted biases in these data. The systematic detection of such biases is a non-trivial task that is crucial to drive appropriate downstream analyses.
Results: We have developed Qualimap, a Java application that supports user-friendly quality control of mapping data, by considering sequence features and their genomic properties. Qualimap takes sequence alignment data and provides graphical and statistical analyses for the evaluation of data. Such quality-control data are vital for highlighting problems in the sequencing and/or mapping processes, which must be addressed prior to further analyses.
Pages: 2678-2679
Page count: 2