[Deep Learning Paper Notes][Video Classification] Learning Spatiotemporal Features with 3D Convolutional Networks

Hao_Zhang_Vision

Category: CNN Papers. Tags: CNN, Computer Vision, Deep Learning, Papers, Video Classification

Article link: https://blog.csdn.net/Hao_Zhang_Vision/article/details/53184470

Tran, Du, et al. “Learning spatiotemporal features with 3d convolutional networks.” 2015 IEEE International Conference on Computer Vision (ICCV). IEEE, 2015. (Citations: 101).

1 Architecture

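This model is essentially a 3D version of VGGNet: it uses 3 × 3 × 3 convolutions and 2 × 2 × 2 pooling (in the paper, the first pooling layer is 1 × 2 × 2 so that the temporal signal is not merged too early). An illustration of 3D convolution can be seen in the figure. Unlike 2D convolution applied to a stack of frames, which collapses the temporal dimension into a single map, 3D convolution preserves the temporal information of the input signal, so its output is itself a volume.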
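The difference between the two cases can be checked on a toy clip. The following is a minimal pure-Python sketch, not the authors' implementation: the helper names (`conv3d_valid`, `conv2d_over_frames`, `shape`), the 8 × 6 × 6 toy clip, and the averaging kernels are all assumptions chosen for illustration, and the convolution is single-channel, stride 1, without padding.

```python
def conv3d_valid(volume, kernel):
    """Naive single-channel 3D convolution (cross-correlation), stride 1, no padding.

    volume is a nested list indexed [t][y][x]; kernel is indexed [dt][dy][dx].
    """
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += volume[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def conv2d_over_frames(volume, kernel):
    """2D convolution that treats the T frames as input channels.

    The kernel spans all T frames, so time is summed out and one 2D map remains.
    """
    full = conv3d_valid(volume, kernel)
    assert len(full) == 1, "kernel depth must equal the number of frames"
    return full[0]


def shape(nested):
    """Return the dimensions of a rectangular nested list as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


# Toy clip (hypothetical): 8 frames of 6 x 6 pixels, every pixel equal to its frame index.
T, H, W = 8, 6, 6
clip = [[[float(t) for _ in range(W)] for _ in range(H)] for t in range(T)]

# 3 x 3 x 3 averaging kernel, as in the 3D case.
k3d = [[[1.0 / 27 for _ in range(3)] for _ in range(3)] for _ in range(3)]
# 3 x 3 spatial kernel spanning all 8 frames, as in the 2D multi-frame case.
k2d = [[[1.0 / (T * 9) for _ in range(3)] for _ in range(3)] for _ in range(T)]

out3d = conv3d_valid(clip, k3d)
out2d = conv2d_over_frames(clip, k2d)

print(shape(out3d))  # (6, 4, 4): the temporal axis survives
print(shape(out2d))  # (4, 4): the temporal axis is gone
```

With padding 1, as in VGG-style networks, the 3 × 3 × 3 convolution would leave all three dimensions unchanged, and only the pooling layers would shrink the volume.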

2 Results
Using the deconvolution visualization approach, we observe that C3D starts by focusing on appearance in the first few frames and then tracks the salient motion in the subsequent frames. Thus a 3D CNN differs from a standard 2D CNN in that it selectively attends to both motion and appearance. As with a standard 2D CNN, we can extract video features from a 3D CNN. We use fc6 features in our experiments.
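As a rough sketch of how per-clip fc6 activations could be turned into one video-level feature: the notes above do not spell out the procedure, so the steps here (16-frame clips taken every 8 frames, averaging over clips, then L2 normalization) follow my reading of the paper and should be treated as an assumption. The function names are hypothetical, and the 2-dimensional toy vectors stand in for real fc6 activations, which would come from a trained network.

```python
import math


def clip_starts(num_frames, clip_len=16, stride=8):
    """Start indices of clips of clip_len frames, taken every stride frames."""
    return list(range(0, num_frames - clip_len + 1, stride))


def video_descriptor(clip_features):
    """Average per-clip feature vectors, then L2-normalize the result."""
    if not clip_features:
        raise ValueError("need at least one clip feature vector")
    dim = len(clip_features[0])
    n = len(clip_features)
    mean = [sum(f[i] for f in clip_features) / n for i in range(dim)]
    norm = math.sqrt(sum(v * v for v in mean))
    # Leave an all-zero vector untouched to avoid dividing by zero.
    return [v / norm for v in mean] if norm > 0 else mean


# A 40-frame video yields four overlapping 16-frame clips.
starts = clip_starts(40)
print(starts)  # [0, 8, 16, 24]

# Toy stand-ins for per-clip fc6 activations (hypothetical values).
toy_fc6 = [[3.0, 0.0], [3.0, 8.0]]
descriptor = video_descriptor(toy_fc6)
print(descriptor)  # approximately [0.6, 0.8]
```

The resulting vector can then be fed to a simple classifier such as a linear SVM.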

3 References
[1]. http://web.cs.hacettepe.edu.tr/~aykut/classes/spring2016/bil722/slides/w07-conv3d.pdf
