Understanding "Improving Image Captioning with Conditional Generative Adversarial Nets"

dreamweaverccc

Tags: image captioning

Article link: https://blog.csdn.net/dreamweaverccc/article/details/89892566

Chen C, Mu S, Xiao W, et al. Improving Image Captioning with Conditional Generative Adversarial Nets[J]. 2018.

1. Introduction

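Image captioning is a deep learning research area that combines computer vision and natural language processing. Compared with tasks such as image classification, object detection, and semantic segmentation, it is more complex and more challenging.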

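The earliest approaches built on the encoder-decoder framework and proposed a CNN-RNN architecture for image captioning: a CNN produces the image representation $I$, an RNN acts as the sentence generator $G_{\theta}$, and the model is trained by maximum likelihood estimation with objective $J_G(\theta)$. This setup suffers from error accumulation: during training the RNN predicts each word from the ground-truth previous words, but at inference it must condition on its own earlier predictions, so an early mistake propagates through the rest of the sentence.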

$J_G(\theta) = \frac{1}{N}\sum_{j=1}^{N}\log G_{\theta}(x^j|I^j) = \frac{1}{N}\sum_{j=1}^{N}\sum_{t=1}^{T_j}\log G_{\theta}(x_t^j|x_{1:t-1}^j, I^j)$
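
To make the maximum likelihood objective concrete, here is a minimal Python sketch that evaluates the average caption log-likelihood from per-word probabilities. The function name `mle_objective` and the toy probabilities are assumptions made for illustration; they do not come from the paper. A real generator would produce these probabilities from the CNN image features and the RNN state.

```python
import math

def mle_objective(token_probs):
    """Average log-likelihood over N captions.

    token_probs[j][t] is assumed to be the probability the generator assigns
    to the ground-truth word x_t^j given the previous ground-truth words and
    the image I^j (hypothetical input format for this sketch).
    """
    n = len(token_probs)
    total = 0.0
    for probs in token_probs:
        # The log-probability of a caption is the sum of its per-word log-probabilities.
        total += sum(math.log(p) for p in probs)
    return total / n

# Two toy captions of lengths 3 and 2, with made-up probabilities.
toy = [[0.5, 0.25, 0.5], [0.5, 0.5]]
j_g = mle_objective(toy)
print(j_g)
```

For these toy values the result equals $-\log 8$, since the two captions have probabilities $1/16$ and $1/4$. Training by maximum likelihood pushes this quantity up, but it only ever scores words given ground-truth prefixes, which is where the error accumulation at inference time comes from.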
