De novo discovery of mutated driver pathways in cancer

Cited by: 322
Authors
Vandin, Fabio
Upfal, Eli
Raphael, Benjamin J. [1]
Affiliation
[1] Brown Univ, Dept Comp Sci, Providence, RI 02912 USA
Keywords
MARKOV-CHAINS; MUTATIONS; GENE; ALGORITHMS; PTEN; EGFR;
DOI
10.1101/gr.120477.111
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Next-generation DNA sequencing technologies are enabling genome-wide measurements of somatic mutations in large numbers of cancer patients. A major challenge in the interpretation of these data is to distinguish functional "driver mutations" important for cancer development from random "passenger mutations." A common approach for identifying driver mutations is to find genes that are mutated at significant frequency in a large cohort of cancer genomes. This approach is confounded by the observation that driver mutations target multiple cellular signaling and regulatory pathways. Thus, each cancer patient may exhibit a different combination of mutations that are sufficient to perturb these pathways. This mutational heterogeneity presents a problem for predicting driver mutations solely from their frequency of occurrence. We introduce two combinatorial properties, coverage and exclusivity, that distinguish driver pathways, or groups of genes containing driver mutations, from groups of genes with passenger mutations. We derive two algorithms, called Dendrix, to find driver pathways de novo from somatic mutation data. We apply Dendrix to analyze somatic mutation data from 623 genes in 188 lung adenocarcinoma patients, 601 genes in 84 glioblastoma patients, and 238 known mutations in 1000 patients with various cancers. In all data sets, we find groups of genes that are mutated in large subsets of patients and whose mutations are approximately exclusive. Our Dendrix algorithms scale to whole-genome analysis of thousands of patients and thus will prove useful for larger data sets to come from The Cancer Genome Atlas (TCGA) and other large-scale cancer genome sequencing projects.
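The two properties named in the abstract, coverage and exclusivity, can be illustrated on toy data. The sketch below is a minimal illustration under stated assumptions: the patient/gene data are invented, the function names are hypothetical, and the score (coverage minus overlap) is one plausible way to combine the two properties, since the abstract itself gives no formula. The exhaustive search is a stand-in for small inputs and is not one of the Dendrix algorithms.

```python
from itertools import combinations

# Hypothetical toy data (not from the paper): patient -> set of mutated genes.
mutations = {
    "p1": {"EGFR"},
    "p2": {"KRAS"},
    "p3": {"KRAS", "TP53"},
    "p4": {"STK11"},
    "p5": {"EGFR", "TP53"},
    "p6": {"TP53"},
}

def coverage(gene_set, mutations):
    """Number of patients with at least one mutated gene in gene_set."""
    return sum(1 for genes in mutations.values() if genes & gene_set)

def overlap(gene_set, mutations):
    """Excess mutations: for each patient, mutated genes in gene_set beyond the first.
    Zero means the mutations in gene_set are perfectly mutually exclusive."""
    return sum(max(len(genes & gene_set) - 1, 0) for genes in mutations.values())

def weight(gene_set, mutations):
    """Assumed score: reward coverage, penalize non-exclusive mutations."""
    return coverage(gene_set, mutations) - overlap(gene_set, mutations)

def best_gene_set(mutations, k):
    """Exhaustive search over all gene sets of size k (only feasible for tiny inputs)."""
    all_genes = sorted(set().union(*mutations.values()))
    candidates = (frozenset(c) for c in combinations(all_genes, k))
    return max(candidates, key=lambda m: weight(m, mutations))

if __name__ == "__main__":
    best = best_gene_set(mutations, 3)
    print(sorted(best), coverage(best, mutations), overlap(best, mutations))
```

In this toy example, TP53 is the most frequently mutated single gene, yet the highest-scoring set of three genes excludes it because its mutations co-occur with EGFR and KRAS mutations. This is the kind of signal that frequency-based ranking alone would miss.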
Pages: 375-385
Page count: 11