Video Paragraph Captioning Using Hierarchical Recurrent Neural Networks

I. Basic Information

Title: Video Paragraph Captioning Using Hierarchical Recurrent Neural Networks
Year: 2016

Venue: CVPR

Field: video captioning

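II. Research Background

Problem definition: given a video, generate a paragraph (multiple sentences) that describes it.

Challenges: the sentences in a paragraph depend on one another (inter-sentence dependency), and a paragraph is inherently hierarchical (words form sentences, sentences form the paragraph).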

III. Proposed Method

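  1. Framework: a hierarchical RNN made of two generators
    (A) sentence generator: an RNN that produces one sentence, word by word
    (B) paragraph generator: an RNN that carries context from one sentence to the next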
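
The two-level structure can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy vocabulary, the dimensions, the random (untrained) weights, the plain tanh recurrence, and the names `generate_sentence` and `generate_paragraph` are hypothetical and are not taken from the paper. The sketch only shows how a paragraph-level state can initialize each sentence and be updated after each sentence finishes.

```python
import math
import random

random.seed(0)

def rand_matrix(rows, cols):
    """Small random weight matrix (stands in for trained parameters)."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def rnn_step(w_in, w_rec, x, h):
    """One plain tanh RNN step: h' = tanh(W_in x + W_rec h)."""
    a = matvec(w_in, x)
    b = matvec(w_rec, h)
    return [math.tanh(p + q) for p, q in zip(a, b)]

# Hypothetical toy vocabulary; index 0 doubles as start and end token.
VOCAB = ["<eos>", "a", "person", "cuts", "washes", "the", "carrot", "knife"]
EMB, HID, FEAT = 4, 6, 5  # assumed embedding, hidden, and video feature sizes

embed = rand_matrix(len(VOCAB), EMB)   # one embedding row per word
# (A) sentence generator parameters
s_in = rand_matrix(HID, EMB + FEAT)    # input: previous word + video feature
s_rec = rand_matrix(HID, HID)
s_out = rand_matrix(len(VOCAB), HID)   # hidden state -> word scores
# (B) paragraph generator parameters
p_in = rand_matrix(HID, HID)           # input: final sentence state
p_rec = rand_matrix(HID, HID)

def generate_sentence(h0, video_feat, max_len=6):
    """(A) Greedy word-by-word decoding, starting from hidden state h0."""
    h, word, words = h0, 0, []
    for _ in range(max_len):
        # Concatenate the previous word's embedding with the video feature.
        h = rnn_step(s_in, s_rec, embed[word] + video_feat, h)
        scores = matvec(s_out, h)
        word = scores.index(max(scores))
        if word == 0:  # end-of-sentence token
            break
        words.append(VOCAB[word])
    return words, h

def generate_paragraph(video_feat, num_sentences=3):
    """(B) Carry a paragraph state across sentences."""
    p_state = [0.0] * HID
    paragraph = []
    for _ in range(num_sentences):
        # The paragraph state initializes the sentence generator.
        words, h_last = generate_sentence(p_state, video_feat)
        paragraph.append(words)
        # The finished sentence's final state updates the paragraph state.
        p_state = rnn_step(p_in, p_rec, h_last, p_state)
    return paragraph

video_feat = [random.uniform(-1, 1) for _ in range(FEAT)]
paragraph = generate_paragraph(video_feat)
for sentence in paragraph:
    print(" ".join(sentence))
```

Because the weights are random, the printed words are meaningless; the point is the control flow, in which each sentence starts from a state that summarizes the sentences before it.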

IV. Experiments

Datasets:

  • YouTube2Text

open-domain
1,970 videos, ~80k video-sentence pairs, 12k unique words; each description is a single sentence, so this is the special case of a one-sentence paragraph

  • TACoS-MultiLevel

closed-domain: cooking
173 videos, 16,145 intervals, ~40k interval-sentence pairs, 2k unique words; each video is described by several dependent sentences

Evaluation metrics:

  • BLEU
  • METEOR
  • CIDEr
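
To make the first metric concrete, here is a minimal sketch of sentence-level BLEU: clipped n-gram precision combined by a geometric mean, multiplied by a brevity penalty. It assumes uniform n-gram weights and no smoothing, and the function names `ngrams` and `bleu` are hypothetical. Reported scores normally come from a standard evaluation toolkit, and METEOR and CIDEr are not covered here.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU with uniform weights and no smoothing."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        # Clip each candidate n-gram count by its maximum count in any reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(count, max_ref[gram]) for gram, count in cand.items())
        total = sum(cand.values())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, one zero precision zeroes the score
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty, using the reference length closest to the candidate length.
    c = len(candidate)
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(sum(log_precisions) / max_n)

candidate = "the person cuts a carrot".split()
reference = "the person slices a carrot".split()
score = bleu(candidate, [reference], max_n=2)
print(round(score, 4))
```

In this example the candidate matches 4 of 5 unigrams and 2 of 4 bigrams and has the same length as the reference, so the score is the geometric mean of 0.8 and 0.5.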

V. Conclusions & Discussions
The hierarchical RNN improves paragraph generation.
Issues:

  1. Most errors occur when generating nouns; small objects are hard to recognize (on TACoS-MultiLevel).
  2. Information flows one way only: earlier sentences influence later ones, but not the reverse.
  3. The language model helps, but it sometimes wrongly overrides the computer vision result.