A robust framework for detecting structural variations in a genome

Cited by: 42
Authors
Lee, Seunghak [1]
Cheran, Elango [1]
Brudno, Michael [1, 2, 3]
Affiliations
[1] Univ Toronto, Dept Comp Sci, Toronto, ON M5S 3G4, Canada
[2] Univ Toronto, Banting & Best Dept Med Res, Toronto, ON M5S 3G4, Canada
[3] Univ Toronto, Ctr Anal Genome Evolut & Funct, Toronto, ON M5S 3G4, Canada
DOI
10.1093/bioinformatics/btn176
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Recently, structural genomic variants have come to the forefront as a significant source of variation in the human population, but the identification of these variants in a large genome remains a challenge. The complete sequencing of a human individual is prohibitive at current costs, while current polymorphism detection technologies, such as SNP arrays, are not able to identify many of the large-scale events. One of the most promising methods to detect such variants is the computational mapping of clone-end sequences to a reference genome.
Results: Here, we present a probabilistic framework for the identification of structural variants using clone-end sequencing. Unlike previous methods, our approach does not rely on an a priori determined mapping of all reads to the reference. Instead, we build a framework for finding the most probable assignment of sequenced clones to potential structural variants based on the other clones. We compare our predictions with the structural variants identified in three previous studies. While there is a statistically significant correlation between the predictions, we also find a significant number of previously uncharacterized structural variants. Furthermore, we identify a number of putative cross-chromosomal events, primarily located proximally to the centromeres of the chromosomes.
Pages: I59-I67
Number of pages: 9