Paper notes: Aligning where to see and what to tell: image caption with region-based attention ...

zdcs

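Categories: Paper Notes, Computer Vision, Natural Language Processing, Deep Learning
Tags: artificial intelligence, VQA, image processing, natural language processing, computer vision

Article link: https://blog.csdn.net/zdcs/article/details/54882423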

Aligning where to see and what to tell: image caption with region-based attention and scene factorization

arXiv:1506.06272v1 [cs.CV] 20 Jun 2015

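Abstract:

This paper proposes an image captioning system that exploits the parallel structure between images and sentences.

My translation further down is rough, so the original text is quoted first: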

In our model, the process of generating the next word, given the previously generated ones, is aligned with the visual perception experience where the attention shifting among the visual regions imposes a thread of visual ordering. This alignment characterizes the flow of “abstract meaning”, encoding what is semantically shared by both the visual scene and the text description. Our system also makes another novel modeling contribution by introducing scene-specific contexts that capture higher-level semantic information encoded in an image. The contexts adapt language models for word generation to specific scene types.

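In other words: generating the next word, given the previously generated words, is aligned with visual perception, in which shifting attention among visual regions imposes an ordering on what is seen. This alignment captures the flow of "abstract meaning", that is, the semantics shared by the visual scene and the text description.
The system's second modeling contribution is a set of scene-specific contexts, which capture the higher-level semantic information encoded in an image.
These contexts adapt the language model to the specific scene type when generating words.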
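
To make the idea of "attention shifting among visual regions" more concrete, here is a minimal sketch of one generic soft-attention step over region features. It is only an illustration: the dot-product scoring, the function names `softmax` and `attend`, and the toy vectors are all assumptions of mine, not the formulation used in the paper.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(hidden, regions):
    """One hypothetical attention step.

    hidden  -- decoder state vector for the word being generated (assumed)
    regions -- list of region feature vectors, same length as hidden (assumed)
    Returns the attention weights and the weighted-average context vector.
    """
    # Assumed scoring rule: plain dot product between state and region.
    scores = [sum(h * r for h, r in zip(hidden, region)) for region in regions]
    weights = softmax(scores)
    dim = len(regions[0])
    context = [
        sum(w * region[d] for w, region in zip(weights, regions))
        for d in range(dim)
    ]
    return weights, context


# Toy data: three 2-dimensional "regions" and one decoder state.
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
hidden = [2.0, 0.0]
weights, context = attend(hidden, regions)
print(weights, context)
```

As the decoder state changes from word to word, the weights change with it, so the most heavily weighted region moves around the image. That movement is the "thread of visual ordering" in this simplified picture.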

The authors then go on to promote their results.....

Placeholder; these notes are still being updated.
