SlowFast Skim Read: SlowFast Networks for Video Recognition

weixin_47341656

Article link: https://blog.csdn.net/weixin_47341656/article/details/124286797

0. Preface

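In a skim read we focus on four parts of the paper: the title, the abstract, the conclusion, and the figures and tables. The goal is to answer three questions: what method is used, what problem it solves, and what results it achieves. For more posts on video understanding, see the index of the video understanding series, which tracks the latest updates.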

1. Title

SlowFast Networks for Video Recognition

That is, video recognition built on a pair of slow and fast networks.

2. Abstract

We present SlowFast networks for video recognition. Our model involves (i) a Slow pathway, operating at low frame rate, to capture spatial semantics, and (ii) a Fast pathway, operating at high frame rate, to capture motion at fine temporal resolution. The Fast pathway can be made very lightweight by reducing its channel capacity, yet can learn useful temporal information for video recognition.

Our models achieve strong performance for both action classification and detection in video, and large improvements are pin-pointed as contributions by our SlowFast concept. We report state-of-the-art accuracy on major video recognition benchmarks, Kinetics, Charades and AVA. Code has been made available at: https://github.com/facebookresearch/SlowFast

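In short, the paper proposes SlowFast networks for video recognition. The model consists of (i) a Slow pathway that runs at a low frame rate to capture spatial semantics, and (ii) a Fast pathway that runs at a high frame rate to capture motion at a finer temporal resolution. The Fast pathway can be made very lightweight by cutting its channel capacity, yet it still learns temporal information that is useful for video recognition.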

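The models perform strongly on both action classification and action detection in video, and the authors attribute the large improvements specifically to the SlowFast concept. They report state-of-the-art accuracy on the major video recognition benchmarks (Kinetics, Charades, and AVA). The code is available at https://github.com/facebookresearch/SlowFast.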

3. Conclusion

The time axis is a special dimension. This paper has investigated an architecture design that contrasts the speed along this axis. It achieves state-of-the-art accuracy for video action classification and detection. We hope that this SlowFast concept will foster further research in video recognition.

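In short, the time axis is treated as a special dimension. The paper studies an architecture that contrasts two speeds along this axis, and it reaches state-of-the-art accuracy on video action classification and detection. The authors hope the SlowFast concept will encourage further research in video recognition.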

4. Key Figures and Tables

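Figure 1: A SlowFast network has a Slow pathway with a low frame rate and low temporal resolution, and a Fast pathway with a high frame rate and α× higher temporal resolution. The Fast pathway is kept lightweight by using only a fraction β of the channels. Lateral connections fuse the two pathways.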
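The frame-rate contrast in Figure 1 can be illustrated with a small sampling sketch. This is only a hedged illustration, not the authors' data loader: the function names are hypothetical, the 64-frame clip length is an assumption, and τ = 16 and α = 8 are the values quoted in the Table 1 caption. The Slow pathway keeps one frame every τ frames, while the Fast pathway keeps one frame every τ/α frames.

```python
def sample_indices(num_frames, stride):
    """Indices of the frames kept when taking one frame every `stride` frames."""
    return list(range(0, num_frames, stride))


def slowfast_indices(num_frames, tau, alpha):
    """Frame indices for the Slow and Fast pathways (hypothetical helper).

    tau   : temporal stride of the Slow pathway
    alpha : speed ratio; the Fast pathway samples alpha times more densely
    """
    if tau % alpha != 0:
        raise ValueError("tau must be divisible by alpha")
    slow = sample_indices(num_frames, tau)           # sparse sampling
    fast = sample_indices(num_frames, tau // alpha)  # dense sampling
    return slow, fast


# Assumed 64-frame raw clip; tau and alpha as in the Table 1 caption.
slow, fast = slowfast_indices(num_frames=64, tau=16, alpha=8)
print(len(slow), slow)      # 4 Slow frames
print(len(fast), fast[:8])  # 32 Fast frames (first eight shown)
```

Under these assumptions the Slow pathway sees 4 frames and the Fast pathway sees 32, so the Fast input is α = 8 times denser in time, and every Slow frame is also a Fast frame.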

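Table 1: An example instantiation of a SlowFast network. Kernel dimensions are written as $\{T \times S^2, C\}$ for the temporal, spatial, and channel sizes, and strides use a similar format. Here the speed ratio is α = 8, the channel ratio is β = 1/8, and the temporal stride τ is 16. For the Fast pathway, green marks its higher temporal resolution and orange marks its smaller number of channels. Non-degenerate temporal filters are underlined. Residual blocks are shown in square brackets. The backbone is ResNet-50.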
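To make the ratios in Table 1 concrete, here is a minimal sketch that derives the temporal length and channel width of each pathway from α, β, and τ. It rests on several assumptions that are not stated in this post: a 64-frame raw clip, the standard ResNet-50 stage output widths, and no temporal downsampling inside the network. The helper name `pathway_shapes` is hypothetical.

```python
from fractions import Fraction


def pathway_shapes(raw_frames, tau, alpha, beta, slow_channels):
    """Return (stage, (T_slow, C_slow), (T_fast, C_fast)) for each stage.

    Assumes the temporal length stays constant through the network.
    """
    slow_t = raw_frames // tau   # frames seen by the Slow pathway
    fast_t = alpha * slow_t      # alpha times more frames for the Fast pathway
    rows = []
    for stage, channels in slow_channels.items():
        fast_c = int(channels * beta)  # Fast pathway uses a beta fraction of channels
        rows.append((stage, (slow_t, channels), (fast_t, fast_c)))
    return rows


# Assumed: standard ResNet-50 stage output widths for the Slow pathway.
SLOW_CHANNELS = {"conv1": 64, "res2": 256, "res3": 512, "res4": 1024, "res5": 2048}

rows = pathway_shapes(
    raw_frames=64, tau=16, alpha=8, beta=Fraction(1, 8), slow_channels=SLOW_CHANNELS
)
for stage, slow, fast in rows:
    print(f"{stage}: Slow T={slow[0]}, C={slow[1]} | Fast T={fast[0]}, C={fast[1]}")
```

With these assumed inputs the Slow pathway carries 4 frames with 64 to 2048 channels, while the Fast pathway carries 32 frames with 8 to 256 channels, which is the "more frames, fewer channels" trade-off that the green and orange markings in the table highlight.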