[深度学习论文笔记][Video Classification] Learning Spatiotemporal Features with 3D Convolutional Networks

Tran, Du, et al. “Learning spatiotemporal features with 3d convolutional networks.” 2015 IEEE International Conference on Computer Vision (ICCV). IEEE, 2015. (Citations: 101).


1 Architecture

This model is 3D VGGNet, basically. It contains 3 × 3 × 3 conv, 2 × 2 × 2 pool. An illustration of 3d convolution can be seen in Fig. 3D convolution preserves the temporal

information of the input signals resulting in an output volume.



2 Results
By using deconv approach, we observe that C3D starts by focusing on appearance in the first few frames and tracks the salient motion in the subsequent frames. Thus 3d CNN
differs from stadard 2d CNN in that it selectively attends to both motion and appearance. Like standard 2d CNN, we can extract video features from 3d CNN. We use fc6 features
in our experiments.


3 References
[1]. http://web.cs.hacettepe.edu.tr/ ̃aykut/classes/spring2016/bil722/slides/w07-conv3d.pdf.

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值