Fast Computation and Applications of Genome Mappability

Cited by: 331
Authors
Derrien, Thomas [1]
Estelle, Jordi [2]
Marco Sola, Santiago [2]
Knowles, David G. [3]
Raineri, Emanuele [2]
Guigo, Roderic [3]
Ribeca, Paolo [2]
Affiliations
[1] Univ Rennes 1, Inst Genet & Dev IGDR, Rennes, France
[2] CNAG, Barcelona, Spain
[3] Univ Pompeu Fabra, CRG, Barcelona, Spain
Source
PLOS ONE | 2012 / Volume 7 / Issue 01
Keywords
RNA-SEQ; SEGMENTAL DUPLICATIONS; EVOLUTION; ELEMENTS; STRATEGY;
DOI
10.1371/journal.pone.0030377
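Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code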
07 ; 0710 ; 09 ;
Abstract
We present a fast mapping-based algorithm to compute the mappability of each region of a reference genome up to a specified number of mismatches. Knowing the mappability of a genome is crucial for the interpretation of massively parallel sequencing experiments. We investigate the properties of the mappability of eukaryotic DNA/RNA both as a whole and at the level of the gene family, providing for various organisms tracks which allow the mappability information to be visually explored. In addition, we show that mappability varies greatly between species and gene classes. Finally, we suggest several practical applications where mappability can be used to refine the analysis of high-throughput sequencing data (SNP calling, gene expression quantification and paired-end experiments). This work highlights mappability as an important concept which deserves to be taken into full account, in particular when massively parallel sequencing technologies are employed. The GEM mappability program belongs to the GEM (GEnome Multitool) suite of programs, which can be freely downloaded for any use from its website (http://gemlibrary.sourceforge.net).
Pages: 16