Paper Reading - Convolutional Image Captioning ( CVPR 2018 )


dichunpu6524


Tags: Artificial Intelligence

Original link: http://www.cnblogs.com/zlian2016/p/9520893.html


Link to the Paper: https://arxiv.org/abs/1711.09151

Motivation:

LSTM units are complex and inherently sequential across time.
Convolutional networks have shown advantages on machine translation and conditional image generation.

Innovation:

The authors develop a convolutional ( CNN-based ) image captioning method that performs comparably to an LSTM-based method on standard metrics.

The authors analyze the characteristics of CNN and LSTM networks and report several insights: CNNs produce output distributions with higher entropy ( useful for diverse predictions ), achieve better classification accuracy, and do not suffer from vanishing gradients.
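To make the entropy point concrete, here is a minimal sketch of how the entropy of a word distribution can be measured. The logits and the four-word vocabulary are made-up illustrative values, not numbers from the paper, and the helper names `softmax` and `entropy` are hypothetical.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Illustrative (assumed) logits over a toy 4-word vocabulary.
peaked = softmax([8.0, 1.0, 0.5, 0.2])   # one word dominates
flat = softmax([2.0, 1.8, 1.7, 1.5])     # several words are plausible

print("peaked entropy:", entropy(peaked))
print("flat entropy:  ", entropy(flat))
```

A flatter distribution has higher entropy, which means more candidate words keep non-negligible probability at each step. That is the sense in which higher entropy can help produce diverse captions.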

Improvement:

Performance improves further with a CNN model that uses an attention mechanism to leverage spatial image features.
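As a rough illustration of attention over spatial image features, the sketch below computes soft attention weights for one decoder state over a few image regions. The dot-product scoring, the function name `attend`, and the toy 2-dimensional vectors are assumptions for illustration only. The paper's exact attention formulation may differ.

```python
import math

def attend(query, regions):
    """Soft attention: weight each image region by its similarity to the query.

    query   -- decoder state for the current word (list of floats)
    regions -- one feature vector per spatial location of the image
    Returns (weights, context), where context is the weighted sum of regions.
    """
    # Assumed scoring function: plain dot product between query and region.
    scores = [sum(q * r for q, r in zip(query, region)) for region in regions]
    # Softmax over regions so the weights are positive and sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted average of the region features.
    dim = len(regions[0])
    context = [sum(w * region[d] for w, region in zip(weights, regions))
               for d in range(dim)]
    return weights, context

# Toy example: three image regions with 2-dimensional features.
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query = [2.0, 0.0]
weights, context = attend(query, regions)
print("weights:", weights)
print("context:", context)
```

The region most similar to the query receives the largest weight, so the context vector passed to the word predictor is dominated by that part of the image.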

General Points:

Image captioning is applicable to virtual assistants, editing tools, image indexing, and support for people with disabilities.
Image Captioning is a basic ingredient for more complex operations such as storytelling and visual summarization.
An illustration of a classical RNN architecture for image captioning is provided below.

Posted on 2018-08-22 22:39 by LZ_Jaja

Reposted from: https://www.cnblogs.com/zlian2016/p/9520893.html
