Reliable Identification of Genomic Variants from RNA-Seq Data

Cited by: 240
Authors
Piskol, Robert [1]
Ramaswami, Gokul [1]
Li, Jin Billy [1]
Affiliations
[1] Stanford Univ, Dept Genet, Stanford, CA 94305 USA
Funding
National Institutes of Health (USA)
Keywords
ACCURATE IDENTIFICATION; MUTATIONAL EVOLUTION; SPLICE JUNCTIONS; READ ALIGNMENT; TRANSCRIPTOME; DNA; POLYMORPHISM; MECHANISMS; EXPRESSION; LANDSCAPE;
DOI
10.1016/j.ajhg.2013.08.008
Chinese Library Classification
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Identifying genomic variation is a crucial step for unraveling the relationship between genotype and phenotype and can yield important insights into human diseases. Prevailing methods rely on cost-intensive whole-genome sequencing (WGS) or whole-exome sequencing (WES) approaches while the identification of genomic variants from often existing RNA sequencing (RNA-seq) data remains a challenge because of the intrinsic complexity in the transcriptome. Here, we present a highly accurate approach termed SNPiR to identify SNPs in RNA-seq data. We applied SNPiR to RNA-seq data of samples for which WGS and WES data are also available and achieved high specificity and sensitivity. Of the SNPs called from the RNA-seq data, >98% were also identified by WGS or WES. Over 70% of all expressed coding variants were identified from RNA-seq, and comparable numbers of exonic variants were identified in RNA-seq and WES. Despite our method's limitation in detecting variants in expressed regions only, our results demonstrate that SNPiR outperforms current state-of-the-art approaches for variant detection from RNA-seq data and offers a cost-effective and reliable alternative for SNP discovery.
Pages: 641-651
Number of pages: 11