Deep Multimodal Learning: A Survey on Recent Advances and Trends

Cited by: 634
Authors
Ramachandram, Dhanesh [1, 2]
Taylor, Graham W. [2, 3, 4]
Affiliations
[1] Univ Sains Malaysia, George Town, Malaysia
[2] Univ Guelph, Guelph, ON, Canada
[3] Vector Inst Artificial Intelligence, Toronto, ON, Canada
[4] Canadian Inst Adv Res, Toronto, ON, Canada
Keywords
NEURAL-NETWORKS; FUSION; ALGORITHMS
DOI
10.1109/MSP.2017.2738401
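Chinese Library Classification number
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification code
0808; 0809
Abstract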
The success of deep learning has been a catalyst for solving increasingly complex machine-learning problems, which often involve multiple data modalities. We review recent advances in deep multimodal learning and highlight the state of the art, as well as gaps and challenges in this active research field. We first classify deep multimodal learning architectures and then discuss methods to fuse learned multimodal representations in deep-learning architectures. We highlight two areas of research, regularization strategies and methods that learn or optimize multimodal fusion structures, as exciting areas for future work.
Pages: 96-108
Number of pages: 13