PEMer: a computational framework with simulation-based error models for inferring genomic structural variants from massive paired-end sequencing data

Cited by: 173
Authors
Korbel, Jan O. [1, 2, 3]
Abyzov, Alexej [3]
Mu, Xinmeng Jasmine [4]
Carriero, Nicholas [5]
Cayting, Philip [3]
Zhang, Zhengdong [3]
Snyder, Michael [3, 4]
Gerstein, Mark B. [3, 4, 5, 6]
Affiliations
[1] European Mol Biol Lab, Gene Express Unit, D-69117 Heidelberg, Germany
[2] EMBL EBI, EMBL Outstat Hinxton, Cambridge CB10 1SA, England
[3] Yale Univ, Dept Mol Biophys & Biochem, New Haven, CT 06520 USA
[4] Yale Univ, Dept Mol Cellular & Dev Biol, New Haven, CT 06520 USA
[5] Yale Univ, Dept Comp Sci, New Haven, CT 06511 USA
[6] Yale Univ, Program Computat Biol & Bioinformat, New Haven, CT 06520 USA
Source
GENOME BIOLOGY | 2009 / Vol. 10 / Issue 02
Keywords
COPY-NUMBER VARIATION; FINE-SCALE; RESOLUTION; IDENTIFICATION; TOOL; POLYMORPHISM; ELEMENTS; SNPS; MAP
DOI
10.1186/gb-2009-10-2-r23
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification code
071005; 0836; 090102; 100705
Abstract
Personal-genomics endeavors, such as the 1000 Genomes project, are generating maps of genomic structural variants by analyzing ends of massively sequenced genome fragments. To process these we developed Paired-End Mapper (PEMer; http://sv.gersteinlab.org/pemer). This comprises an analysis pipeline, compatible with several next-generation sequencing platforms; simulation-based error models, yielding confidence-values for each structural variant; and a back-end database. The simulations demonstrated high structural variant reconstruction efficiency for PEMer's coverage-adjusted multi-cutoff scoring-strategy and showed its relative insensitivity to base-calling errors.
Pages: 14