Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods

Tasks

Visual Description Generation

Image Description Generation

Standard Image Description Generation

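Dense Image Description Generation: generates descriptions for local objects (regions) in the image.

Image Paragraph Generation: generates a paragraph-length description.

Spoken Language Image Description Generation: produces the description as speech instead of written text.

Stylistic Image Description Generation: adds a linguistic style to the description, for example humor.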

Unseen Objects Image Description Generation:

Diverse Image Description Generation:

Controllable Image Description Generation: selects and controls which objects in an image the generated description covers.

Video Description Generation

Global Video Description Generation: 

Dense Video Description Generation: similar to Dense Image Description Generation.

Movie Description Generation: takes movie clips as input.

Visual Storytelling

Image Storytelling:

Video Storytelling:

Visual Question Answering

Image Question Answering

Video Question Answering

Visual Dialog

Image Dialog

Video Dialog

Visual Reasoning

Image Reasoning

Video Reasoning

Visual Referring Expression

Image Referring Expression

Video Referring Expression

Visual Entailment

Image Entailment

Language-to-Vision Generation

Language-to-Image Generation

Sentence-level Language-to-Image Generation

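Image Manipulation: uses text to guide the editing of an image while preserving the regions that are not relevant to the text. Another approach modifies the image content interactively, and a third makes the modifications through dialog.

Fine-grained Image Generation:

Sequential Image Generation: given a passage of text (multiple sentences), generates a sequence of images, like a visualization of a story. This is the inverse of Image Storytelling.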

Language-to-Video Generation

Requires a stronger conditional generator, because the temporal dimension must also be taken into account.
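To make the extra temporal dimension concrete, here is a minimal sketch comparing the output size of an image generator with that of a video generator. The resolution (64x64), channel count (3), and frame count (16) are illustrative assumptions, not values from the survey, and the helper names are hypothetical.

```python
from typing import Tuple


def image_shape(height: int, width: int, channels: int = 3) -> Tuple[int, ...]:
    """Shape of one generated image: (H, W, C)."""
    return (height, width, channels)


def video_shape(frames: int, height: int, width: int, channels: int = 3) -> Tuple[int, ...]:
    """Shape of one generated video: (T, H, W, C), with a leading time axis."""
    return (frames, height, width, channels)


def num_values(shape: Tuple[int, ...]) -> int:
    """Number of scalar values the generator has to produce for this shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total


# Assumed sizes, for illustration only.
img = image_shape(64, 64)
vid = video_shape(16, 64, 64)

print(img, num_values(img))
print(vid, num_values(vid))
```

Under these assumed sizes the video output has 16 times as many values as the image output, and the frames also have to be consistent with each other over time, which is one way to read the need for a stronger conditional generator.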

Vision-and-Language Navigation

Image and Language Navigation

Multimodal Machine Translation

Machine Translation with Image: translates a source-language sentence that describes an image into the target language.
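The task list above can also be written down as a nested mapping, which makes it easy to enumerate or count the tasks. This is only a sketch: the grouping of tasks under each family is inferred from the order of the headings in these notes, and the names `TAXONOMY` and `leaf_tasks` are hypothetical.

```python
from typing import Dict, Iterator, Tuple

Tree = Dict[str, dict]

# Task family -> sub-area -> specific task. An empty dict marks a leaf.
# The nesting is an assumption inferred from the heading order in the notes.
TAXONOMY: Tree = {
    "Visual Description Generation": {
        "Image Description Generation": {
            "Standard Image Description Generation": {},
            "Dense Image Description Generation": {},
            "Image Paragraph Generation": {},
            "Spoken Language Image Description Generation": {},
            "Stylistic Image Description Generation": {},
            "Unseen Objects Image Description Generation": {},
            "Diverse Image Description Generation": {},
            "Controllable Image Description Generation": {},
        },
        "Video Description Generation": {
            "Global Video Description Generation": {},
            "Dense Video Description Generation": {},
            "Movie Description Generation": {},
        },
    },
    "Visual Storytelling": {"Image Storytelling": {}, "Video Storytelling": {}},
    "Visual Question Answering": {
        "Image Question Answering": {},
        "Video Question Answering": {},
    },
    "Visual Dialog": {"Image Dialog": {}, "Video Dialog": {}},
    "Visual Reasoning": {"Image Reasoning": {}, "Video Reasoning": {}},
    "Visual Referring Expression": {
        "Image Referring Expression": {},
        "Video Referring Expression": {},
    },
    "Visual Entailment": {"Image Entailment": {}},
    "Language-to-Vision Generation": {
        "Language-to-Image Generation": {
            "Sentence-level Language-to-Image Generation": {},
            "Image Manipulation": {},
            "Fine-grained Image Generation": {},
            "Sequential Image Generation": {},
        },
        "Language-to-Video Generation": {},
    },
    "Vision-and-Language Navigation": {"Image and Language Navigation": {}},
    "Multimodal Machine Translation": {"Machine Translation with Image": {}},
}


def leaf_tasks(tree: Tree, prefix: Tuple[str, ...] = ()) -> Iterator[Tuple[str, ...]]:
    """Yield the full path of every node that has no children."""
    for name, children in tree.items():
        path = prefix + (name,)
        if children:
            yield from leaf_tasks(children, path)
        else:
            yield path


if __name__ == "__main__":
    for path in leaf_tasks(TAXONOMY):
        print(" > ".join(path))
```

Running the script prints one line per task, with the path from the task family down to the specific task.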
