Deep learning of the tissue-regulated splicing code

Cited by: 321
Authors
Leung, Michael K. K. [1, 2]
Xiong, Hui Yuan [1, 2]
Lee, Leo J. [1, 2]
Frey, Brendan J. [1, 2, 3]
Affiliations
[1] Univ Toronto, Dept Elect & Comp Engn, Toronto, ON M5S 3G4, Canada
[2] Univ Toronto, Banting & Best Dept Med Res, Toronto, ON M5S 3E1, Canada
[3] Canadian Inst Adv Res, Toronto, ON M5G 1Z8, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
ARCHITECTURES; PREDICTION;
DOI
10.1093/bioinformatics/btu277
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Motivation: Alternative splicing (AS) is a regulated process that directs the generation of different transcripts from single genes. A computational model that can accurately predict splicing patterns based on genomic features and cellular context is highly desirable, both in understanding this widespread phenomenon, and in exploring the effects of genetic variations on AS.
Methods: Using a deep neural network, we developed a model inferred from mouse RNA-Seq data that can predict splicing patterns in individual tissues and differences in splicing patterns across tissues. Our architecture uses hidden variables that jointly represent features in genomic sequences and tissue types when making predictions. A graphics processing unit was used to greatly reduce the training time of our models with millions of parameters.
Results: We show that the deep architecture surpasses the performance of the previous Bayesian method for predicting AS patterns. With the proper optimization procedure and selection of hyperparameters, we demonstrate that deep architectures can be beneficial, even with a moderately sparse dataset. An analysis of what the model has learned in terms of the genomic features is presented.
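Illustrative sketch (not from the paper): the Methods statement that hidden variables jointly represent genomic sequence features and tissue types can be pictured as a small feed-forward network whose hidden layer receives both inputs at once. Everything specific here is an assumption made for illustration only: the tissue labels, the layer sizes, the three output classes, the random untrained weights, and the helper names `one_hot`, `dense`, and `predict`. The abstract does not state the authors' actual layer layout, feature set, or output encoding.

```python
import math
import random

TISSUES = ["brain", "heart", "kidney"]  # hypothetical tissue labels


def one_hot(index, size):
    """Return a list of zeros with a single 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def dense(inputs, weights, biases):
    """Affine layer: one output per row of `weights`."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def sigmoid(values):
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def softmax(values):
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Small random weights; a stand-in for trained parameters."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def predict(features, tissue_index, params):
    """Forward pass: hidden units see sequence features and tissue together."""
    joint_input = features + one_hot(tissue_index, len(TISSUES))
    hidden = sigmoid(dense(joint_input, params["w1"], params["b1"]))
    return softmax(dense(hidden, params["w2"], params["b2"]))


rng = random.Random(0)
n_features, n_hidden, n_classes = 5, 4, 3  # illustrative sizes only
params = {
    "w1": random_matrix(n_hidden, n_features + len(TISSUES), rng),
    "b1": [0.0] * n_hidden,
    "w2": random_matrix(n_classes, n_hidden, rng),
    "b2": [0.0] * n_classes,
}
features = [rng.random() for _ in range(n_features)]  # placeholder feature vector
probs = predict(features, TISSUES.index("heart"), params)
```

Because the tissue indicator enters before the hidden layer, the same feature vector can yield different output distributions for different tissues once the weights are trained; with the random weights above, `probs` is only a valid probability vector, not a meaningful prediction.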
Pages: 121-129
Page count: 9