Action Recognition Paper Notes | ARTNet | Appearance-and-Relation Networks for Video Classification

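These notes on ARTNet introduce a new video classification architecture proposed by Wang et al. that combines an appearance branch and a relation branch to strengthen spatiotemporal representation. Compared with two-stream CNNs and 3D CNNs, ARTNet uses SMART blocks to improve accuracy while reducing computational cost, particularly when modeling local features. Experiments show that it performs well when trained on Kinetics and tested on UCF101 and HMDB, but it may be less efficient at temporal modeling than 3D CNNs, and the absence of a residual structure may weaken temporal information in deeper networks.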


Wang, Limin, et al. “Appearance-and-relation networks for video classification.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.

Motivation

Three kinds of architectures are used for video classification: (1) two-stream CNNs (time-consuming, because optical flow must be computed in advance); (2) 3D CNNs (less accurate than two-stream networks); and (3) 2D CNNs with temporal models on top, such as LSTM, temporal convolution, sparse sampling and aggregation, and attention modeling (weaker at local spatiotemporal representation).
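To make the third family concrete, here is a minimal sketch of sparse sampling and aggregation over a per-frame 2D model. It is an illustration only, not code from the paper: the function names (`sample_segment_indices`, `aggregate_scores`, `classify_video`), the choice of mean aggregation, and the `frame_model` callable standing in for a 2D CNN are all assumptions.

```python
import random


def sample_segment_indices(num_frames, num_segments, rng=None):
    """Pick one frame index from each of num_segments contiguous segments."""
    if num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    rng = rng or random.Random(0)
    indices = []
    for k in range(num_segments):
        # Integer arithmetic keeps segment boundaries exact.
        start = k * num_frames // num_segments
        end = (k + 1) * num_frames // num_segments  # exclusive
        indices.append(rng.randrange(start, end))
    return indices


def aggregate_scores(per_frame_scores):
    """Average class scores over the sampled frames (assumed consensus: mean)."""
    n = len(per_frame_scores)
    num_classes = len(per_frame_scores[0])
    return [sum(s[c] for s in per_frame_scores) / n for c in range(num_classes)]


def classify_video(frames, frame_model, num_segments=3, rng=None):
    """Score sparsely sampled frames independently, then aggregate.

    frame_model is a hypothetical stand-in for a 2D CNN: it maps a single
    frame to a list of class scores.
    """
    idx = sample_segment_indices(len(frames), num_segments, rng)
    scores = [frame_model(frames[i]) for i in idx]
    return aggregate_scores(scores)
```

In this sketch each sampled frame is scored on its own, so the per-frame model never sees motion between adjacent frames. That is one way to read the note's remark that this family is weaker at local spatiotemporal representation.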

Multiplicative interactions are used to model the relation between different views: Gated Boltzmann machines, Energy models, Independent Subspace Analysis (ISA) (similar to Energy mod
