EnzyNet: enzyme classification using 3D convolutional neural networks on spatial representation

Cited by: 57
Authors
Amidi, Afshine [1, 2]
Amidi, Shervine [2]
Vlachakis, Dimitrios [3]
Megalooikonomou, Vasileios [3]
Paragios, Nikos [2]
Zacharaki, Evangelia I. [2, 3]
Affiliations
[1] MIT, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[2] Ecole Cent Paris, Cent Supelec, Dept Appl Math, Ctr Visual Comp, Chatenay Malabry, France
[3] Univ Patras, Dept Comp Engn & Informat, MDAKM Grp, Patras, Greece
Source
PEERJ | 2018 / Volume 6
Keywords
Deep learning; 3D convolutional neural networks; EnzyNet; Enzyme classification; PROTEIN; ARCHITECTURES; PREDICTION; BIOLOGY;
DOI
10.7717/peerj.4750
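Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code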
07 ; 0710 ; 09 ;
Abstract
During the past decade, with the significant progress of computational power as well as ever-rising data availability, deep learning techniques have become increasingly popular due to their excellent performance on computer vision problems. The size of the Protein Data Bank (PDB) has increased more than 15-fold since 1999, which has enabled the expansion of models that aim at predicting enzymatic function via amino acid composition. Amino acid sequence, however, is less conserved in nature than protein structure and is therefore considered a less reliable predictor of protein function. This paper presents EnzyNet, a novel 3D convolutional neural network classifier that predicts the Enzyme Commission number of enzymes based only on their voxel-based spatial structure. The spatial distribution of biochemical properties was also examined as complementary information. The two-layer architecture was investigated on a large dataset of 63,558 enzymes from the PDB and achieved an accuracy of 78.4% by exploiting only the binary representation of the protein shape. Code and datasets are available at https://github.com/shervinea/enzynet.
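To make the "binary representation of the protein shape" concrete, here is a minimal sketch of how a set of 3D atom coordinates might be turned into a binary occupancy grid. It is not the authors' implementation: the function name `voxelize_binary`, the default grid size of 32, and the centroid-centering and radius-scaling steps are all assumptions made for illustration.

```python
import numpy as np


def voxelize_binary(coords, grid_size=32):
    """Map an (N, 3) array of atom coordinates to a binary occupancy grid.

    Hypothetical helper: the name, grid size, and scaling scheme are
    illustrative assumptions, not taken from the EnzyNet code base.
    """
    coords = np.asarray(coords, dtype=float)
    # Center the structure on its centroid.
    centered = coords - coords.mean(axis=0)
    # Scale so the farthest atom lands on the grid boundary at most.
    radius = np.linalg.norm(centered, axis=1).max()
    if radius == 0:
        radius = 1.0  # degenerate case: all atoms at the same point
    half = (grid_size - 1) / 2.0
    idx = np.rint(centered / radius * half + half).astype(int)
    idx = np.clip(idx, 0, grid_size - 1)
    # Mark every voxel that contains at least one atom.
    grid = np.zeros((grid_size, grid_size, grid_size), dtype=np.uint8)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1
    return grid


# Toy usage with four made-up points standing in for atom positions.
toy_coords = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
toy_grid = voxelize_binary(toy_coords, grid_size=8)
```

A grid of this kind is the sort of input a 3D convolutional classifier would consume; the real pipeline, including any preprocessing and the network itself, is in the repository linked in the abstract.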
Pages: 18