Gene expression inference with deep learning

被引:305
作者
Chen, Yifei [1 ,4 ]
Li, Yi [1 ]
Narayan, Rajiv [2 ]
Subramanian, Aravind [2 ]
Xie, Xiaohui [1 ,3 ]
机构
[1] Univ Calif Irvine, Dept Comp Sci, Irvine, CA 92697 USA
[2] Broad Inst MIT & Harvard, Cambridge, MA 02142 USA
[3] Univ Calif Irvine, Ctr Complex Biol Syst, Irvine, CA 92697 USA
[4] Baidu Res Big Data Lab, Beijing 100085, Peoples R China
基金
美国国家科学基金会; 美国国家卫生研究院;
关键词
NEURAL-NETWORKS; ARCHITECTURES; MAP;
D O I
10.1093/bioinformatics/btw074
中图分类号
Q5 [生物化学];
学科分类号
070307 [化学生物学];
摘要
Motivation: Large-scale gene expression profiling has been widely used to characterize cellular states in response to various disease conditions, genetic perturbations, etc. Although the cost of whole-genome expression profiles has been dropping steadily, generating a compendium of expression profiling over thousands of samples is still very expensive. Recognizing that gene expressions are often highly correlated, researchers from the NIH LINCS program have developed a cost-effective strategy of profiling only similar to 1000 carefully selected landmark genes and relying on computational methods to infer the expression of remaining target genes. However, the computational approach adopted by the LINCS program is currently based on linear regression (LR), limiting its accuracy since it does not capture complex nonlinear relationship between expressions of genes. Results: We present a deep learning method (abbreviated as D-GEX) to infer the expression of target genes from the expression of landmark genes. We used the microarray-based Gene Expression Omnibus dataset, consisting of 111K expression profiles, to train our model and compare its performance to those from other methods. In terms of mean absolute error averaged across all genes, deep learning significantly outperforms LR with 15.33% relative improvement. A gene-wise comparative analysis shows that deep learning achieves lower error than LR in 99.97% of the target genes. We also tested the performance of our learned model on an independent RNA-Seq-based GTEx dataset, which consists of 2921 expression profiles. Deep learning still outperforms LR with 6.57% relative improvement, and achieves lower error in 81.31% of the target genes.
引用
收藏
页码:1832 / 1839
页数:8
相关论文
共 35 条
[1]
[Anonymous], 2010, P PYTH SCI C
[2]
[Anonymous], 2013, ARXIV13084214
[3]
[Anonymous], 2014, THESIS
[4]
The Genotype-Tissue Expression (GTEx) pilot analysis: Multitissue gene regulation in humans [J].
Ardlie, Kristin G. ;
DeLuca, David S. ;
Segre, Ayellet V. ;
Sullivan, Timothy J. ;
Young, Taylor R. ;
Gelfand, Ellen T. ;
Trowbridge, Casandra A. ;
Maller, Julian B. ;
Tukiainen, Taru ;
Lek, Monkol ;
Ward, Lucas D. ;
Kheradpour, Pouya ;
Iriarte, Benjamin ;
Meng, Yan ;
Palmer, Cameron D. ;
Esko, Tonu ;
Winckler, Wendy ;
Hirschhorn, Joel N. ;
Kellis, Manolis ;
MacArthur, Daniel G. ;
Getz, Gad ;
Shabalin, Andrey A. ;
Li, Gen ;
Zhou, Yi-Hui ;
Nobel, Andrew B. ;
Rusyn, Ivan ;
Wright, Fred A. ;
Lappalainen, Tuuli ;
Ferreira, Pedro G. ;
Ongen, Halit ;
Rivas, Manuel A. ;
Battle, Alexis ;
Mostafavi, Sara ;
Monlong, Jean ;
Sammeth, Michael ;
Mele, Marta ;
Reverter, Ferran ;
Goldmann, Jakob M. ;
Koller, Daphne ;
Guigo, Roderic ;
McCarthy, Mark I. ;
Dermitzakis, Emmanouil T. ;
Gamazon, Eric R. ;
Im, Hae Kyung ;
Konkashbaev, Anuar ;
Nicolae, Dan L. ;
Cox, Nancy J. ;
Flutre, Timothee ;
Wen, Xiaoquan ;
Stephens, Matthew .
SCIENCE, 2015, 348 (6235) :648-660
[5]
Searching for exotic particles in high-energy physics with deep learning [J].
Baldi, P. ;
Sadowski, P. ;
Whiteson, D. .
NATURE COMMUNICATIONS, 2014, 5
[6]
Baldi P., 2013, Adv. Neural Inf. Process. Syst, V26
[7]
How to infer gene networks from expression profiles [J].
Bansal, Mukesh ;
Belcastro, Vincenzo ;
Ambesi-Impiombato, Alberto ;
di Bernardo, Diego .
MOLECULAR SYSTEMS BIOLOGY, 2007, 3 (1)
[8]
Representation Learning: A Review and New Perspectives [J].
Bengio, Yoshua ;
Courville, Aaron ;
Vincent, Pascal .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2013, 35 (08) :1798-1828
[9]
Learning Deep Architectures for AI [J].
Bengio, Yoshua .
FOUNDATIONS AND TRENDS IN MACHINE LEARNING, 2009, 2 (01) :1-127
[10]
MULTIDIMENSIONAL BINARY SEARCH TREES USED FOR ASSOCIATIVE SEARCHING [J].
BENTLEY, JL .
COMMUNICATIONS OF THE ACM, 1975, 18 (09) :509-517