Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads

Cited by: 1464
Authors
Ye, Kai [1 ,2 ,3 ,4 ]
Schulz, Marcel H. [1 ,5 ,6 ]
Long, Quan [7 ]
Apweiler, Rolf [1 ]
Ning, Zemin [7 ]
Affiliations
[1] EMBL Outstat European Bioinformat Inst, Cambridge, England
[2] Leiden Univ, Med Ctr, Dept Mol Epidemiol, Leiden, Netherlands
[3] Leiden Univ, Dept Med Stat, Med Ctr, NL-2300 RA Leiden, Netherlands
[4] Leiden Univ, Dept Bioinformat, Med Ctr, NL-2300 RA Leiden, Netherlands
[5] Max Planck Inst Mol Genet, Berlin, Germany
[6] Int Max Planck Res Sch Computat Biol & Sci Comp, Berlin, Germany
[7] Wellcome Trust Sanger Inst, Cambridge, England
DOI
10.1093/bioinformatics/btp394
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: There is a strong demand in the genomic community to develop effective algorithms to reliably identify genomic variants. Indel detection using next-gen data is difficult and identification of long structural variations is extremely challenging. Results: We present Pindel, a pattern growth approach, to detect breakpoints of large deletions and medium-sized insertions from paired-end short reads. We use both simulated reads and real data to demonstrate the efficiency of the computer program and accuracy of the results.
Pages: 2865-2871
Page count: 7
Related papers
14 entries in total
  • [1] Natural genetic variation caused by transposable elements in humans
    Bennett, EA
    Coleman, LE
    Tsui, C
    Pittard, WS
    Devine, SE
    [J]. GENETICS, 2004, 168 (02) : 933 - 951
  • [2] Accurate whole human genome sequencing using reversible terminator chemistry
    Bentley, David R.
    Balasubramanian, Shankar
    Swerdlow, Harold P.
    Smith, Geoffrey P.
    Milton, John
    Brown, Clive G.
    Hall, Kevin P.
    Evers, Dirk J.
    Barnes, Colin L.
    Bignell, Helen R.
    Boutell, Jonathan M.
    Bryant, Jason
    Carter, Richard J.
    Cheetham, R. Keira
    Cox, Anthony J.
    Ellis, Darren J.
    Flatbush, Michael R.
    Gormley, Niall A.
    Humphray, Sean J.
    Irving, Leslie J.
    Karbelashvili, Mirian S.
    Kirk, Scott M.
    Li, Heng
    Liu, Xiaohai
    Maisinger, Klaus S.
    Murray, Lisa J.
    Obradovic, Bojan
    Ost, Tobias
    Parkinson, Michael L.
    Pratt, Mark R.
    Rasolonjatovo, Isabelle M. J.
    Reed, Mark T.
    Rigatti, Roberto
    Rodighiero, Chiara
    Ross, Mark T.
    Sabot, Andrea
    Sankar, Subramanian V.
    Scally, Aylwyn
    Schroth, Gary P.
    Smith, Mark E.
    Smith, Vincent P.
    Spiridou, Anastassia
    Torrance, Peta E.
    Tzonev, Svilen S.
    Vermaas, Eric H.
    Walter, Klaudia
    Wu, Xiaolin
    Zhang, Lu
    Alam, Mohammed D.
    Anastasi, Carole
    [J]. NATURE, 2008, 456 (7218) : 53 - 59
  • [3] Short read fragment assembly of bacterial genomes
    Chaisson, Mark J.
    Pevzner, Pavel A.
    [J]. GENOME RESEARCH, 2008, 18 (02) : 324 - 330
  • [4] Detection of large-scale variation in the human genome
    Iafrate, AJ
    Feuk, L
    Rivera, MN
    Listewnik, ML
    Donahoe, PK
    Qi, Y
    Scherer, SW
    Lee, C
    [J]. NATURE GENETICS, 2004, 36 (09) : 949 - 951
  • [5] Mapping and sequencing of structural variation from eight human genomes (Reprinted from Nature, vol 453, pg 56-64, 2008)
    Kidd, Jeffrey M.
    Cooper, Gregory M.
    Donahue, William F.
    Hayden, Hillary S.
    Sampas, Nick
    Graves, Tina
    Hansen, Nancy
    Teague, Brian
    Alkan, Can
    Antonacci, Francesca
    Haugen, Eric
    Zerr, Troy
    Yamada, N. Alice
    Tsang, Peter
    Newman, Tera L.
    Tuzun, Eray
    Cheng, Ze
    Ebling, Heather M.
    Tusneem, Nadeem
    David, Robert
    Gillett, Will
    Phelps, Karen A.
    Weaver, Molly
    Saranga, David
    Brand, Adrianne
    Tao, Wei
    Gustafson, Erik
    McKernan, Kevin
    Chen, Lin
    Malig, Maika
    Smith, Joshua D.
    Korn, Joshua M.
    McCarroll, Steven A.
    Altshuler, David A.
    Peiffer, Daniel A.
    Dorschner, Michael
    Stamatoyannopoulos, John
    Schwartz, David
    Nickerson, Deborah A.
    Mullikin, James C.
    Wilson, Richard K.
    Bruhn, Laurakay
    Olson, Maynard V.
    Kaul, Rajinder
    Smith, Douglas R.
    Eichler, Evan E.
    [J]. NATURE GENETICS, 2009, : S22 - S30
  • [6] The diploid genome sequence of an individual human
    Levy, Samuel
    Sutton, Granger
    Ng, Pauline C.
    Feuk, Lars
    Halpern, Aaron L.
    Walenz, Brian P.
    Axelrod, Nelson
    Huang, Jiaqi
    Kirkness, Ewen F.
    Denisov, Gennady
    Lin, Yuan
    MacDonald, Jeffrey R.
    Pang, Andy Wing Chun
    Shago, Mary
    Stockwell, Timothy B.
    Tsiamouri, Alexia
    Bafna, Vineet
    Bansal, Vikas
    Kravitz, Saul A.
    Busam, Dana A.
    Beeson, Karen Y.
    McIntosh, Tina C.
    Remington, Karin A.
    Abril, Josep F.
    Gill, John
    Borman, Jon
    Rogers, Yu-Hui
    Frazier, Marvin E.
    Scherer, Stephen W.
    Strausberg, Robert L.
    Venter, J. Craig
    [J]. PLOS BIOLOGY, 2007, 5 (10) : 2113 - 2144
  • [7] An initial map of insertion and deletion (INDEL) variation in the human genome
    Mills, Ryan E.
    Luttig, Christopher T.
    Larkins, Christine E.
    Beauchamp, Adam
    Tsui, Circe
    Pittard, W. Stephen
    Devine, Scott E.
    [J]. GENOME RESEARCH, 2006, 16 (09) : 1182 - 1190
  • [8] SSAHA: A fast search method for large DNA databases
    Ning, ZM
    Cox, AJ
    Mullikin, JC
    [J]. GENOME RESEARCH, 2001, 11 (10) : 1725 - 1729
  • [9] Mining sequential patterns by pattern-growth: The PrefixSpan approach
    Pei, J
    Han, JW
    Mortazavi-Asl, B
    Wang, JY
    Pinto, H
    Chen, QM
    Dayal, U
    Hsu, MC
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2004, 16 (11) : 1424 - 1440
  • [10] Schulz Marcel H, 2008, Int J Bioinform Res Appl, V4, P81, DOI 10.1504/IJBRA.2008.017165