PEMer: a computational framework with simulation-based error models for inferring genomic structural variants from massive paired-end sequencing data

Citations: 173
Authors
Korbel, Jan O. [1 ,2 ,3 ]
Abyzov, Alexej [3 ]
Mu, Xinmeng Jasmine [4 ]
Carriero, Nicholas [5 ]
Cayting, Philip [3 ]
Zhang, Zhengdong [3 ]
Snyder, Michael [3 ,4 ]
Gerstein, Mark B. [3 ,4 ,5 ,6 ]
Affiliations
[1] European Mol Biol Lab, Gene Express Unit, D-69117 Heidelberg, Germany
[2] EMBL EBI, EMBL Outstn Hinxton, Cambridge CB10 1SA, England
[3] Yale Univ, Dept Mol Biophys & Biochem, New Haven, CT 06520 USA
[4] Yale Univ, Dept Mol Cellular & Dev Biol, New Haven, CT 06520 USA
[5] Yale Univ, Dept Comp Sci, New Haven, CT 06511 USA
[6] Yale Univ, Program Computat Biol & Bioinformat, New Haven, CT 06520 USA
Source
GENOME BIOLOGY | 2009 / Vol. 10 / Issue 02
Keywords
COPY-NUMBER VARIATION; FINE-SCALE; RESOLUTION; IDENTIFICATION; TOOL; POLYMORPHISM; ELEMENTS; SNPS; MAP;
DOI
10.1186/gb-2009-10-2-r23
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Personal-genomics endeavors, such as the 1000 Genomes project, are generating maps of genomic structural variants by analyzing ends of massively sequenced genome fragments. To process these we developed Paired-End Mapper (PEMer; http://sv.gersteinlab.org/pemer). This comprises an analysis pipeline, compatible with several next-generation sequencing platforms; simulation-based error models, yielding confidence-values for each structural variant; and a back-end database. The simulations demonstrated high structural variant reconstruction efficiency for PEMer's coverage-adjusted multi-cutoff scoring-strategy and showed its relative insensitivity to base-calling errors.
Pages: 14