Evaluation of paired-end sequencing strategies for detection of genome rearrangements in cancer

Cited by: 57
Authors
Bashir, Ali [1]
Volik, Stanislav [2]
Collins, Colin [2]
Bafna, Vineet [3]
Raphael, Benjamin J. [4]
Affiliations
[1] Univ Calif San Diego, Bioinformat Grad Program, San Diego, CA 92103 USA
[2] Univ Calif San Francisco, Ctr Comprehens Canc, San Francisco, CA 94143 USA
[3] Univ Calif San Diego, Dept Comp Sci & Engn, San Diego, CA USA
[4] Brown Univ, Dept Comp Sci, Ctr Computat Mol Biol, Providence, RI 02912 USA
DOI
10.1371/journal.pcbi.1000051
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Paired-end sequencing is emerging as a key technique for assessing genome rearrangements and structural variation on a genome-wide scale. This technique is particularly useful for detecting copy-neutral rearrangements, such as inversions and translocations, which are common in cancer and can produce novel fusion genes. We address the question of how much sequencing is required to detect rearrangement breakpoints and to localize them precisely using both theoretical models and simulation. We derive a formula for the probability that a fusion gene exists in a cancer genome given a collection of paired-end sequences from this genome. We use this formula to compute fusion gene probabilities in several breast cancer samples, and we find that we are able to accurately predict fusion genes in these samples with a relatively small number of fragments of large size. We further demonstrate how the ability to detect fusion genes depends on the distribution of gene lengths, and we evaluate how different parameters of a sequencing strategy impact breakpoint detection, breakpoint localization, and fusion gene detection, even in the presence of errors that suggest false rearrangements. These results will be useful in calibrating future cancer sequencing efforts, particularly large-scale studies of many cancer genomes that are enabled by next-generation sequencing technologies.
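The question of how much sequencing is needed to detect a breakpoint can be illustrated with a standard uniform-coverage calculation. The sketch below is not the formula derived in the paper. It assumes that fragments are placed uniformly at random on a genome of length G, that each fragment has the same length L, and that a breakpoint counts as detected as soon as one fragment spans it; sequencing errors, chimeric clones, and mapping ambiguity are ignored. The function names and the example parameter values are hypothetical.

```python
import math


def breakpoint_detection_probability(num_fragments, fragment_length, genome_length):
    """Probability that at least one fragment spans a fixed breakpoint.

    Assumes uniform, independent fragment placement, so each fragment
    spans the breakpoint with probability fragment_length / genome_length.
    """
    per_fragment = fragment_length / genome_length
    return 1.0 - (1.0 - per_fragment) ** num_fragments


def fragments_needed(target_probability, fragment_length, genome_length):
    """Smallest number of fragments that reaches the target detection probability."""
    if not 0.0 < target_probability < 1.0:
        raise ValueError("target_probability must lie strictly between 0 and 1")
    per_fragment = fragment_length / genome_length
    # Solve 1 - (1 - p)^N >= target for N.
    return math.ceil(math.log(1.0 - target_probability) / math.log(1.0 - per_fragment))


if __name__ == "__main__":
    # Hypothetical values: 150 kb fragments on a 3 Gb genome.
    n = fragments_needed(0.99, 150e3, 3e9)
    p = breakpoint_detection_probability(n, 150e3, 3e9)
    print(f"{n} fragments give detection probability {p:.4f}")
```

Under these assumptions the required number of fragments scales roughly as G/L, which is consistent with the abstract's observation that a relatively small number of large fragments can suffice for detection. Precise localization of a breakpoint is a separate question that this sketch does not address.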
Pages: 14