50 entries in total
[1]
Akula Arjun., 2020, P 58 ANN M ASS COMP, P6555
[2]
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 6077-6086
[3]
[Anonymous], 2014, T ASSOC COMPUT LING
[4]
VQA: Visual Question Answering [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015: 2425-2433
[5]
Brown Tom, 2020, ADV NEURAL INFORM PR
[6]
Cao J., 2020, ECCV, P565
[7]
Devlin J, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P4171
[8]
Dollar P., 2015, CoRR
[9]
Duygulu P, 2002, LECT NOTES COMPUT SC, V2353, P97
[10]
Faghri Fartash, 2017, BMVC