Which server works well for a visual question answering system? A visual question answering system based on deep learning

CLC number: TP391.41; TP18         Document code: A         Article ID: 2096-4706(2019)11-0011-04

Visual Question Answering System Based on Deep Learning

GE Mengying, SUN Baoshan

(School of Computer Science and Technology, Tianjin Polytechnic University, Tianjin 300387, China)

Abstract: With the development of the internet, the amount of information available to people has grown exponentially, and so has the amount of knowledge that can be drawn from the data. Artificial intelligence, which had been on hold for some time, is showing renewed vitality. As the field continues to develop, visual question answering (VQA) has emerged in recent years as a hot topic within artificial intelligence. A VQA system takes an image and a question as input and combines the two sources of information to produce a natural-language answer as output. The key problem in VQA is how to fuse the visual and linguistic features extracted from the input image and question. This paper focuses on visual question answering, summarizes recent research progress in terms of concepts and models, and discusses existing shortcomings. Finally, future research directions for VQA are discussed.
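To make the fusion step mentioned in the abstract concrete, the following is a minimal sketch in plain Python. It assumes one simple and commonly used baseline: project the image and question features into a shared space, combine them with an element-wise product, and score a fixed set of candidate answers with a softmax. This is not presented as the method of this paper. All function names, dimensions, and the random (untrained) weights are hypothetical and chosen only for illustration.

```python
import math
import random


def project(vec, weights):
    """Multiply a feature vector by a weight matrix given as a list of rows."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def fuse(image_feat, question_feat):
    """Element-wise (Hadamard) product of two same-length feature vectors."""
    if len(image_feat) != len(question_feat):
        raise ValueError("features must share a dimension before fusion")
    return [v * q for v, q in zip(image_feat, question_feat)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def answer_distribution(image_feat, question_feat, w_img, w_txt, w_out):
    """Project both modalities, fuse them, and score candidate answers."""
    joint = fuse(project(image_feat, w_img), project(question_feat, w_txt))
    return softmax(project(joint, w_out))


def random_matrix(rows, cols, rng):
    """Hypothetical untrained weights; a real system would learn these."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
image_feat = [rng.random() for _ in range(8)]     # stand-in for image features
question_feat = [rng.random() for _ in range(6)]  # stand-in for question features
w_img = random_matrix(4, 8, rng)   # image features -> 4-dim shared space
w_txt = random_matrix(4, 6, rng)   # question features -> 4-dim shared space
w_out = random_matrix(3, 4, rng)   # shared space -> 3 candidate answers

probs = answer_distribution(image_feat, question_feat, w_img, w_txt, w_out)
print(probs)  # one probability per candidate answer
```

In a real system the two input vectors would come from trained image and text encoders, and the element-wise product could be swapped for a richer fusion operator such as bilinear pooling or attention; the overall encode, fuse, classify structure stays the same.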

Keywords: deep learning; artificial intelligence; visual question answer; natural language processing


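About the authors:

GE Mengying (b. December 1996), female, Han ethnicity, from Suzhou, Anhui; master's student; research interests: natural language processing, deep learning, etc.

SUN Baoshan (b. October 1978), male, Han ethnicity, from Tianjin; associate professor, master's supervisor, Doctor of Engineering; research interests: machine learning, natural language processing, etc.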
