MMSplice: modular modeling improves the predictions of genetic variant effects on splicing

Citations: 141
Authors
Cheng, Jun [1 ,2 ]
Nguyen, Thi Yen Duong [1 ]
Cygan, Kamil J. [3 ,4 ]
Celik, Muhammed Hasan [1 ]
Fairbrother, William G. [3 ,4 ]
Avsec, Ziga [1 ,2 ]
Gagneur, Julien [1 ]
Affiliations
[1] Tech Univ Munich, Dept Informat, Boltzmannstr, D-85748 Garching, Germany
[2] Ludwig Maximilians Univ Munchen, Grad Sch Quantitat Biosci QBM, Munich, Germany
[3] Brown Univ, Ctr Computat Mol Biol, Providence, RI 02912 USA
[4] Brown Univ, Dept Mol Biol Cell Biol & Biochem, Providence, RI 02912 USA
Keywords
Splicing; Variant effect; Variant pathogenicity; Deep learning; Modular modeling; REGULATORY ELEMENTS; SEQUENCE MOTIFS; RNA; VERTEBRATE; ENHANCERS; DESIGN; CODE;
DOI
10.1186/s13059-019-1653-z
Chinese Library Classification
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Predicting the effects of genetic variants on splicing is highly relevant for human genetics. We describe the framework MMSplice (modular modeling of splicing) with which we built the winning model of the CAGI5 exon skipping prediction challenge. The MMSplice modules are neural networks scoring exon, intron, and splice sites, trained on distinct large-scale genomics datasets. These modules are combined to predict effects of variants on exon skipping, splice site choice, splicing efficiency, and pathogenicity, with matched or higher performance than state-of-the-art. Our models, available in the repository Kipoi, apply to variants including indels directly from VCF files.
Pages: 15