A geometric approach for classification and comparison of structural variants

Cited by: 115
Authors
Sindi, Suzanne [1, 2]
Helman, Elena [3]
Bashir, Ali [4]
Raphael, Benjamin J. [1, 3]
Affiliations
[1] Brown Univ, Ctr Computat Mol Biol, Providence, RI 02912 USA
[2] Brown Univ, Div Appl Math, Providence, RI 02912 USA
[3] Univ Calif San Diego, Dept Comp Sci, San Diego, CA 92103 USA
[4] Univ Calif San Diego, Bioinformat Grad Program, San Diego, CA 92103 USA
Keywords
COPY-NUMBER POLYMORPHISM; COMPARATIVE GENOMIC HYBRIDIZATION; FINE-SCALE; ARRAY CGH; REARRANGEMENTS; ARCHITECTURE
DOI
10.1093/bioinformatics/btp208
CLC classification number
Q5 [Biochemistry]
Subject classification code
071010; 081704
Abstract
Motivation: Structural variants, including duplications, insertions, deletions and inversions of large blocks of DNA sequence, are an important contributor to human genome variation. Measuring structural variants in a genome sequence is typically more challenging than measuring single nucleotide changes. Current approaches for structural variant identification, including paired-end DNA sequencing/mapping and array comparative genomic hybridization (aCGH), do not identify the boundaries of variants precisely. Consequently, most reported human structural variants are poorly defined and not readily compared across different studies and measurement techniques.

Results: We introduce Geometric Analysis of Structural Variants (GASV), a geometric approach for identification, classification and comparison of structural variants. This approach represents the uncertainty in measurement of a structural variant as a polygon in the plane, and identifies measurements supporting the same variant by computing intersections of polygons. We derive a computational geometry algorithm to efficiently identify all such intersections. We apply GASV to sequencing data from nine individual human genomes and several cancer genomes. We obtain better localization of the boundaries of structural variants, distinguish genetic from putative somatic structural variants in cancer genomes, and integrate aCGH and paired-end sequencing measurements of structural variants. This work presents the first general framework for comparing structural variants across multiple samples and measurement techniques, and will be useful for studies of both genetic structural variants and somatic rearrangements in cancer.
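
The polygon idea in the abstract can be illustrated with a small sketch. The abstract does not give the construction, so everything concrete here is an assumption made for illustration: a deletion-supporting paired read is taken to map at hypothetical coordinates x (left end) and y (right end), the fragment length is taken to lie between hypothetical bounds lmin and lmax, and the breakpoints (a, b) are taken to satisfy a >= x, b <= y and lmin <= (a - x) + (y - b) <= lmax, which is a trapezoid in the (a, b) plane. The intersection uses plain Sutherland-Hodgman clipping of one convex polygon against another; this is a simple pairwise stand-in, not the efficient algorithm the authors derive.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def breakpoint_polygon(x: float, y: float, lmin: float, lmax: float) -> List[Point]:
    """Trapezoid of breakpoints (a, b) consistent with one paired read.

    Assumed model (not taken from the abstract): a >= x, b <= y and
    lmin <= (a - x) + (y - b) <= lmax. Vertices are counter-clockwise.
    """
    return [(x + lmin, y), (x, y - lmin), (x, y - lmax), (x + lmax, y)]


def _side(p: Point, a: Point, b: Point) -> float:
    """Cross product (b - a) x (p - a); >= 0 means p is left of or on a->b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def clip_convex(subject: List[Point], clipper: List[Point]) -> List[Point]:
    """Sutherland-Hodgman: intersect `subject` with convex, CCW `clipper`."""
    output = list(subject)
    for i in range(len(clipper)):
        a, b = clipper[i], clipper[(i + 1) % len(clipper)]
        current, output = output, []
        if not current:
            break  # nothing left to clip: the polygons do not overlap
        s = current[-1]
        for e in current:
            ds, de = _side(s, a, b), _side(e, a, b)
            if (ds >= 0) != (de >= 0):
                # Edge s->e crosses the clip line; add the crossing point.
                t = ds / (ds - de)
                output.append((s[0] + t * (e[0] - s[0]), s[1] + t * (e[1] - s[1])))
            if de >= 0:
                output.append(e)
            s = e
    return output


def polygon_area(poly: List[Point]) -> float:
    """Shoelace formula; returns 0.0 for an empty polygon."""
    total = 0.0
    for i in range(len(poly)):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0
```

Under these assumptions, two reads at (1000, 2000) and (1050, 2020) with lmin = 100 and lmax = 300 give trapezoids whose intersection has area of about 28050, smaller than either trapezoid alone (40000 each), which is the sense in which overlapping measurements localize the breakpoints better. A read at (5000, 6000) shares no area with the first and would be treated as supporting a different variant.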
Pages: I222 - I230
Number of pages: 9