Ballas, Nicolas, et al. “Delving Deeper into Convolutional Networks for Learning Video Representations.” arXiv preprint arXiv:1511.06432 (2015). (Citations: 14).
1 Motivation
Previous work on recurrent CNNs (RCNs) has tended to focus on high-level features extracted from the top layers of a 2D CNN. While high-level features contain highly discriminative information, they tend to have low spatial resolution. Thus, we argue that current RCN architectures are not well suited for capturing fine motion information; instead, they are more likely to focus on global appearance changes.
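The alternative the paper explores is applying the recurrence to intermediate convolutional maps rather than top-layer features. A back-of-the-envelope count shows why fully-connected recurrent units are impractical there, and why convolutional gates (the paper's GRU-RCN idea) help; the specific map size (7×7×512) and kernel size (3×3) below are illustrative assumptions, not figures from the paper:

```python
# Hypothetical intermediate feature map: 7x7 spatial, 512 channels.
H, W, C = 7, 7, 512
n = H * W * C  # flattened map size used by a fully-connected recurrent unit

# Fully-connected GRU: 3 gates, each with an n x n input-to-hidden matrix.
fc_input_to_hidden = 3 * n * n

# Convolutional GRU (GRU-RCN style): 3 gates, each a kxk conv mapping C -> C channels,
# so parameters no longer depend on the spatial size of the map.
k = 3
conv_input_to_hidden = 3 * (k * k * C * C)

print(f"fully-connected input-to-hidden: {fc_input_to_hidden:,} parameters")
print(f"convolutional input-to-hidden:   {conv_input_to_hidden:,} parameters")
```

Even at this modest resolution the fully-connected variant needs on the order of 10^9 input-to-hidden parameters, versus a few million for the convolutional one, which is the "drastic number of parameters" problem discussed below.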
Applying the recurrence directly to intermediate convolutional maps, however, inevitably results in a drastic number of parameters characterizing the input-to-hidden transformation, due to the size of the convolutional maps. On the other hand,