Deep learning of the tissue-regulated splicing code

Cited by: 321
Authors
Leung, Michael K. K. [1, 2]
Xiong, Hui Yuan [1, 2]
Lee, Leo J. [1, 2]
Frey, Brendan J. [1, 2, 3]
Affiliations
[1] Univ Toronto, Dept Elect & Comp Engn, Toronto, ON M5S 3G4, Canada
[2] Univ Toronto, Banting & Best Dept Med Res, Toronto, ON M5S 3E1, Canada
[3] Canadian Inst Adv Res, Toronto, ON M5G 1Z8, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
ARCHITECTURES; PREDICTION
DOI
10.1093/bioinformatics/btu277
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Alternative splicing (AS) is a regulated process that directs the generation of different transcripts from single genes. A computational model that can accurately predict splicing patterns based on genomic features and cellular context is highly desirable, both in understanding this widespread phenomenon, and in exploring the effects of genetic variations on AS.

Methods: Using a deep neural network, we developed a model inferred from mouse RNA-Seq data that can predict splicing patterns in individual tissues and differences in splicing patterns across tissues. Our architecture uses hidden variables that jointly represent features in genomic sequences and tissue types when making predictions. A graphics processing unit was used to greatly reduce the training time of our models with millions of parameters.

Results: We show that the deep architecture surpasses the performance of the previous Bayesian method for predicting AS patterns. With the proper optimization procedure and selection of hyperparameters, we demonstrate that deep architectures can be beneficial, even with a moderately sparse dataset. An analysis of what the model has learned in terms of the genomic features is presented.
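
Illustrative sketch (not part of the original record): the abstract describes hidden variables that jointly represent genomic features and tissue type. One simple way to picture this is a feed-forward pass in which a one-hot tissue code is concatenated with the genomic feature vector before a shared hidden layer. This is not the authors' architecture or code. The layer sizes, the sigmoid and softmax choices, the three output classes, and every name below (`predict`, `init_layer`, `N_TISSUES`, and so on) are assumptions made for illustration, and the weights are random rather than trained.

```python
import math
import random


def one_hot(index, size):
    """Encode a tissue index as a one-hot vector."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def dense(x, weights, biases):
    """Affine layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def sigmoid(values):
    return [1.0 / (1.0 + math.exp(-z)) for z in values]


def softmax(values):
    """Numerically stable softmax."""
    peak = max(values)
    exps = [math.exp(z - peak) for z in values]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (hypothetical initialization)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def predict(genomic_features, tissue_index, n_tissues, params):
    """Forward pass: the hidden units see features and tissue code together."""
    x = list(genomic_features) + one_hot(tissue_index, n_tissues)
    (w1, b1), (w2, b2) = params
    hidden = sigmoid(dense(x, w1, b1))
    return softmax(dense(hidden, w2, b2))


# Hypothetical sizes, chosen only to keep the example small.
N_FEATURES, N_TISSUES, N_HIDDEN, N_CLASSES = 6, 5, 4, 3

rng = random.Random(0)
params = (
    init_layer(N_HIDDEN, N_FEATURES + N_TISSUES, rng),
    init_layer(N_CLASSES, N_HIDDEN, rng),
)
features = [rng.random() for _ in range(N_FEATURES)]
probs = predict(features, tissue_index=2, n_tissues=N_TISSUES, params=params)
print(probs)  # a probability distribution over the assumed output classes
```

Because the tissue code enters before the hidden layer, the same genomic features can yield different predictions in different tissues, which is the property the abstract attributes to the joint hidden representation.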
Pages: 121-129
Number of pages: 9