Deep Multimodal Learning: A Survey on Recent Advances and Trends

Cited by: 580
Authors
Ramachandram, Dhanesh [1, 2]
Taylor, Graham W. [2, 3, 4]
Affiliations
[1] Univ Sains Malaysia, George Town, Malaysia
[2] Univ Guelph, Guelph, ON, Canada
[3] Vector Inst Artificial Intelligence, Toronto, ON, Canada
[4] Canadian Inst Adv Res, Toronto, ON, Canada
Keywords
NEURAL-NETWORKS; FUSION; ALGORITHMS
DOI
10.1109/MSP.2017.2738401
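Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract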
The success of deep learning has been a catalyst for solving increasingly complex machine-learning problems, which often involve multiple data modalities. We review recent advances in deep multimodal learning and highlight the state of the art, as well as gaps and challenges in this active research field. We first classify deep multimodal learning architectures and then discuss methods to fuse learned multimodal representations in deep-learning architectures. We highlight two areas of research, regularization strategies and methods that learn or optimize multimodal fusion structures, as exciting areas for future work.
Pages: 96-108
Number of pages: 13