Reading notes: Flow-Guided Feature Aggregation for Video Object Detection

我是家家

Category column: 菜鸟从零开始学习Deep learning (Learning Deep Learning from Scratch as a Beginner)

Article link: https://blog.csdn.net/yihaizhiyan/article/details/78662434

Main contributions of the paper:

Flow-guided feature aggregation (FGFA), an end-to-end framework for video object detection.

It improves the per-frame features by aggregating the features of nearby frames along the motion paths, and thus improves video recognition accuracy.

Put differently, it improves per-frame feature learning through temporal aggregation.

Dataset: ImageNet VID

3,862 video snippets from the training set

555 snippets from the validation set

Fully annotated

30 object categories (a subset of the categories in the ImageNet DET dataset)

Related work:

This paper's approach:

1. The feature extraction network is applied to individual frames to produce the per-frame feature maps.

2. To enhance the features at a reference frame, an optical flow network [flownet] estimates the motions between the nearby frames and the reference frame.

3. The feature maps from the nearby frames are warped to the reference frame according to the estimated flow. The warped feature maps, together with the reference frame's own feature maps, are aggregated according to an adaptive weighting network.

4. The resulting aggregated feature maps are then fed to the detection network to produce the detection result on the reference frame.
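Steps 2 and 3 can be written compactly. The notation below is my own shorthand rather than something stated in these notes: $I_i$ is the reference frame, $f_j$ the feature maps of a nearby frame $I_j$, $\mathcal{F}$ the flow network, $\mathcal{W}$ the warping operation, $w_{j \to i}$ the adaptive weights, and $K$ an assumed half-width of the temporal window.

```latex
\begin{aligned}
f_{j \to i} &= \mathcal{W}\bigl(f_j,\ \mathcal{F}(I_i, I_j)\bigr), \\
\bar{f}_i   &= \sum_{j=i-K}^{i+K} w_{j \to i}\, f_{j \to i},
\qquad \sum_{j=i-K}^{i+K} w_{j \to i}(p) = 1 \ \text{at every location } p .
\end{aligned}
```

The $j = i$ term is the reference frame's own feature map, which needs no warping. The aggregated $\bar{f}_i$ is what step 4 passes to the detection network.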

System: Feature extraction + flow estimation + feature aggregation + detection
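The warping and aggregation stages of this pipeline can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `warp_bilinear` and `aggregate` are hypothetical, the flow is assumed to be given as a `(2, H, W)` array of per-pixel `(dx, dy)` offsets, and the adaptive weights are taken as a per-location softmax over cosine similarities computed on the raw features, standing in for the learned weighting network. Feature extraction, flow estimation, and detection are left out.

```python
import numpy as np

def warp_bilinear(feat, flow):
    """Warp feat (C, H, W) of a nearby frame onto the reference frame.

    flow (2, H, W) is assumed to hold (dx, dy): reference location p
    samples feat at p + flow(p), with bilinear interpolation and
    clamping at the border.
    """
    _, H, W = feat.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    x = np.clip(xs + flow[0], 0, W - 1)
    y = np.clip(ys + flow[1], 0, H - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, W - 1)
    y1 = np.minimum(y0 + 1, H - 1)
    wx, wy = x - x0, y - y0
    top = feat[:, y0, x0] * (1 - wx) + feat[:, y0, x1] * wx
    bot = feat[:, y1, x0] * (1 - wx) + feat[:, y1, x1] * wx
    return top * (1 - wy) + bot * wy

def aggregate(ref_feat, warped_feats, eps=1e-8):
    """Weighted sum of warped feature maps (the list includes the reference).

    Assumption: weights are a softmax over frames of the cosine similarity
    between each warped feature and the reference feature at every location.
    """
    stack = np.stack(warped_feats)                        # (N, C, H, W)
    ref_n = ref_feat / (np.linalg.norm(ref_feat, axis=0, keepdims=True) + eps)
    stk_n = stack / (np.linalg.norm(stack, axis=1, keepdims=True) + eps)
    sim = (stk_n * ref_n[None]).sum(axis=1)               # (N, H, W)
    sim = sim - sim.max(axis=0, keepdims=True)            # numerical stability
    weights = np.exp(sim)
    weights /= weights.sum(axis=0, keepdims=True)         # sums to 1 over frames
    return (weights[:, None] * stack).sum(axis=0), weights

# Usage with random stand-ins for feature maps and flow fields.
rng = np.random.default_rng(0)
ref = rng.standard_normal((4, 5, 6))
nearby = [rng.standard_normal((4, 5, 6)) for _ in range(2)]
flows = [rng.uniform(-1, 1, size=(2, 5, 6)) for _ in range(2)]
warped = [warp_bilinear(f, fl) for f, fl in zip(nearby, flows)]
aggregated, weights = aggregate(ref, warped + [ref])
```

Because the weights are normalized per spatial location, a nearby frame whose warped features disagree with the reference at some location contributes less there, which is the intuition behind the adaptive weighting.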