1. Basic Information
Title: Video Paragraph Captioning Using Hierarchical Recurrent Neural Networks
Year: 2016
Venue: CVPR
Field: video captioning
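2. Research Background
Problem definition: given a video, generate a paragraph (multiple sentences) that describes it.
Challenges: the sentences in a paragraph depend on one another (inter-sentence dependency), and a paragraph is inherently hierarchical (words form sentences, sentences form the paragraph).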
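3. Proposed Method
- Framework: two stacked RNNs
(A) sentence generator (RNN): produces one sentence, word by word
(B) paragraph generator (RNN): carries context from one sentence to the next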
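The two-level framework can be sketched in a few lines of plain Python. This is only a toy illustration under stated assumptions, not the authors' implementation: the weights are random and untrained, the recurrence is a plain tanh cell, the video is a single pre-pooled feature vector, and every name (`generate_sentence`, `generate_paragraph`, `HIDDEN`, `VOCAB`, and so on) is hypothetical. It shows only the control flow: the sentence generator emits words, and the paragraph generator turns each finished sentence into the starting state of the next one.

```python
import math
import random

random.seed(0)

HIDDEN = 8      # hidden size shared by both RNNs (assumed for this toy)
VOCAB = 12      # toy vocabulary; token 0 = begin-of-sentence, 1 = end-of-sentence
MAX_WORDS = 6   # cap on words per sentence
FEAT = 5        # size of the pre-pooled video feature vector (assumed)


def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rnn_step(w_in, w_rec, x, h):
    """Plain tanh recurrence: h' = tanh(W_in x + W_rec h)."""
    a = matvec(w_in, x)
    b = matvec(w_rec, h)
    return [math.tanh(p + q) for p, q in zip(a, b)]


# Untrained random parameters, for illustrating shapes and data flow only.
EMBED = rand_matrix(VOCAB, HIDDEN)           # word embeddings
S_IN = rand_matrix(HIDDEN, HIDDEN + FEAT)    # sentence RNN input weights
S_REC = rand_matrix(HIDDEN, HIDDEN)          # sentence RNN recurrent weights
S_OUT = rand_matrix(VOCAB, HIDDEN)           # hidden state -> word scores
P_IN = rand_matrix(HIDDEN, 2 * HIDDEN)       # paragraph RNN input weights
P_REC = rand_matrix(HIDDEN, HIDDEN)          # paragraph RNN recurrent weights


def generate_sentence(video_feat, h0):
    """(A) Sentence generator: greedy word-by-word decoding from state h0."""
    h, word, words = h0, 0, []
    for _ in range(MAX_WORDS):
        # Input is the previous word's embedding concatenated with the video feature.
        h = rnn_step(S_IN, S_REC, EMBED[word] + video_feat, h)
        scores = matvec(S_OUT, h)
        word = max(range(VOCAB), key=scores.__getitem__)
        if word == 1:  # end-of-sentence token stops decoding
            break
        words.append(word)
    return words, h


def generate_paragraph(video_feat, num_sentences):
    """(B) Paragraph generator: carries context from one sentence to the next."""
    p = [0.0] * HIDDEN    # paragraph-level state
    h0 = [0.0] * HIDDEN   # initial state for the first sentence
    paragraph = []
    for _ in range(num_sentences):
        words, h_last = generate_sentence(video_feat, h0)
        # Sentence summary: mean word embedding concatenated with the last hidden state.
        if words:
            mean = [sum(EMBED[w][i] for w in words) / len(words) for i in range(HIDDEN)]
        else:
            mean = [0.0] * HIDDEN
        p = rnn_step(P_IN, P_REC, mean + h_last, p)
        h0 = p            # the next sentence starts from the paragraph state
        paragraph.append(words)
    return paragraph


video = [0.1, -0.2, 0.3, 0.0, 0.5]   # made-up feature vector
paragraph = generate_paragraph(video, num_sentences=3)
print(paragraph)                      # three lists of token ids
```

Because the weights are random, the token ids are meaningless; the point is that the only link between consecutive sentences is the paragraph state `p`, which is how inter-sentence dependency enters the model in this sketch.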
4. Experiments
Datasets:
- YouTube2Text
  open-domain
  1,970 videos, ~80k video-sentence pairs, 12k unique words
  only one sentence for a video (a special case of a paragraph)
- TACoS-MultiLevel
  closed-domain: cooking
  173 videos, 16,145 intervals, ~40k interval-sentence pairs, 2k unique words
  several dependent sentences for a video
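A quick back-of-the-envelope calculation makes the scale of the two datasets easier to compare. The figures are copied from the notes above; the pair counts are rounded ("~80k", "~40k"), so the ratios are approximate, and the variable names are made up for this sketch.

```python
# Figures copied from the dataset notes; pair counts are rounded in the source.
youtube2text = {"videos": 1970, "pairs": 80_000}
tacos = {"videos": 173, "intervals": 16_145, "pairs": 40_000}

sentences_per_video = youtube2text["pairs"] / youtube2text["videos"]
intervals_per_video = tacos["intervals"] / tacos["videos"]
sentences_per_interval = tacos["pairs"] / tacos["intervals"]

print(f"YouTube2Text: about {sentences_per_video:.1f} sentences per video")
print(f"TACoS-MultiLevel: about {intervals_per_video:.1f} intervals per video")
print(f"TACoS-MultiLevel: about {sentences_per_interval:.1f} sentences per interval")
```

Read together with the one-sentence note for YouTube2Text, the first ratio suggests many alternative single-sentence descriptions per video, whereas the many annotated intervals per TACoS-MultiLevel video are what yield several dependent sentences.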