Two-Stream SR-CNNs for Action Recognition in Videos

BojackHorseman

paper: http://www.bmva.org/bmvc/2016/papers/paper108/index.html
code: https://github.com/yifita/action.sr_cnn
Third author's homepage: http://wanglimin.github.io/

dataset     : UCF101    JHMDB (split 1)
accuracy (%): 92.6      53.77

Framework

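The input is still two-stream, but both the RGB frames and the optical flow are passed through Faster R-CNN. The detected regions are divided into three categories (scene, person, and object), and each category is fed into the network separately for training.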

The inputs are first passed through standard convolutional and pooling layers. We replace the last pooling layer with a RoiPooling [2] layer, which separates features for different semantic cues into parallel fully connected layers (called channels) using bounding boxes proposed from a Faster R-CNN [18] object detector (see subsection 3.2).
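To make the RoiPooling step more concrete, here is a minimal pure-Python sketch of max RoI pooling over a single 2-D feature map. It is not the authors' Caffe implementation: the function name `roi_max_pool`, the box format `(x0, y0, x1, y1)` in feature-map coordinates (end-exclusive), and the toy 4x4 map are all assumptions made for illustration. The idea is that each semantic cue (scene, person, object) gets its own box, and the pooled output of that box feeds that cue's channel.

```python
import math


def roi_max_pool(feature_map, box, out_h, out_w):
    """Max-pool the region `box` of a 2-D feature map into an out_h x out_w grid.

    `box` is (x0, y0, x1, y1) in feature-map cells, end-exclusive
    (an assumed convention for this sketch).
    """
    x0, y0, x1, y1 = box
    roi_h, roi_w = y1 - y0, x1 - x0
    if roi_h <= 0 or roi_w <= 0:
        raise ValueError("box must have positive height and width")

    pooled = []
    for i in range(out_h):
        # Row range of bin i; floor/ceil keeps every bin non-empty.
        r0 = y0 + math.floor(i * roi_h / out_h)
        r1 = y0 + math.ceil((i + 1) * roi_h / out_h)
        row = []
        for j in range(out_w):
            c0 = x0 + math.floor(j * roi_w / out_w)
            c1 = x0 + math.ceil((j + 1) * roi_w / out_w)
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled


# Toy 4x4 feature map (hypothetical values).
fmap = [[0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11],
        [12, 13, 14, 15]]

scene_feat = roi_max_pool(fmap, (0, 0, 4, 4), 2, 2)   # whole frame -> scene channel
person_feat = roi_max_pool(fmap, (1, 1, 3, 3), 1, 1)  # a detected box -> person channel
```

Because every box is pooled to the same fixed grid size, regions of different sizes can share the same fully connected layers within a channel.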

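Each channel produces its own scores. Because a frame can contain multiple objects, the authors use MIL (Multiple Instance Learning) to pick out the most useful information. Finally, all scores pass through a fusion layer to produce the final prediction.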
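As a rough illustration of the MIL step, the sketch below aggregates the class scores of several object boxes into a single score vector by taking the per-class maximum over instances. Max aggregation is a common MIL choice and is assumed here; the post does not spell out the exact layer the authors use, and the name `mil_max_pool` and the example scores are hypothetical.

```python
def mil_max_pool(instance_scores):
    """Collapse per-instance class scores into one vector.

    `instance_scores` is a list with one entry per detected object,
    each entry being a list of class scores. For every class, the
    highest score over all instances is kept.
    """
    if not instance_scores:
        raise ValueError("need at least one instance")
    num_classes = len(instance_scores[0])
    return [max(inst[c] for inst in instance_scores)
            for c in range(num_classes)]


# Two detected objects, three action classes (made-up numbers).
object_scores = [
    [0.1, 0.7, 0.2],
    [0.3, 0.2, 0.5],
]
object_channel_score = mil_max_pool(object_scores)
```

With this choice, an object that is irrelevant to the action cannot drag the channel score down; only the most supportive instance per class contributes.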

Fusion

The authors propose four fusion strategies:

Max fusing takes the maximum score value among all channels for each class, essentially picking the strongest channel.
Sum fusion directly adds up the scores from different channels, i.e. each channel is treated equally.
Category-wise weighted fusion (Weighted-1) combines channel scores via weighted sum, aiming to represent varied relative contribution of each channel for different classes using their corresponding weights.
As for correlation-wise weighted fusion (Weighted-2)
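The first three strategies can be sketched in a few lines of plain Python. This is only an illustration under assumptions: `channel_scores` is taken to be a list of per-channel score vectors (one score per class), and for Weighted-1 the weights are assumed to form one learned value per (channel, class) pair, which is one reading of "category-wise". The function names and numbers are made up. Weighted-2 is left out because its definition is not given above.

```python
def max_fusion(channel_scores):
    """Per class, keep the strongest channel's score."""
    return [max(col) for col in zip(*channel_scores)]


def sum_fusion(channel_scores):
    """Per class, add the scores of all channels with equal weight."""
    return [sum(col) for col in zip(*channel_scores)]


def category_weighted_fusion(channel_scores, weights):
    """Per class, take a weighted sum over channels.

    `weights[k][c]` is the (assumed) weight of channel k for class c.
    """
    return [sum(w * s for w, s in zip(w_col, s_col))
            for w_col, s_col in zip(zip(*weights), zip(*channel_scores))]


# Three channels (scene, person, object), two classes; made-up numbers.
scores = [
    [1.0, 2.0],   # scene
    [3.0, 0.5],   # person
    [0.0, 4.0],   # object
]
weights = [
    [1.0, 0.0],
    [0.5, 2.0],
    [0.0, 0.25],
]

fused_max = max_fusion(scores)
fused_sum = sum_fusion(scores)
fused_w1 = category_weighted_fusion(scores, weights)
```

In a trained network the Weighted-1 weights would be learned jointly with the rest of the model; here they are fixed constants just to show the arithmetic.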
