Reading Fast R-CNN (1)

Niuip

Category: paper. Tags: fast rcnn

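Copyright notice: This is an original article by the author, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.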

Article link: https://blog.csdn.net/qq_39732684/article/details/80975359


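As most readers know, Fast R-CNN is used for object detection in images. For example, if an image contains a cat, the algorithm can detect that a cat is present and draw a red box around it.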

Object detection, like depth estimation and semantic segmentation, belongs to the area of image understanding.

Starting with the Abstract, we learn that:

1, This paper proposes a Fast Region-based Convolutional Network method (Fast R-CNN) for object detection.

2, Fast R-CNN builds on previous work to efficiently classify object proposals using deep convolutional networks.

Fast R-CNN also really is fast, which I will not go into here. The author has released the source code, which deserves praise.

Next comes the Introduction, which covers some basic concepts:

1, Compared to image classification, object detection is a more challenging task that requires more complex methods to solve

Object detection is complex because:

1, Complexity arises because detection requires the accurate localization of objects

2, Numerous candidate object locations must be processed

3, These candidates provide only rough localization that must be refined to achieve precise localization

4, Solutions to these problems often compromise speed, accuracy, or simplicity
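To make point 3 concrete, here is a minimal sketch of how a rough proposal box might be refined with predicted offsets. It assumes the (dx, dy, dw, dh) parameterization commonly used in the R-CNN family; the function name `refine_box` and the example numbers are my own, chosen only for illustration.

```python
import math

def refine_box(box, deltas):
    """Apply (dx, dy, dw, dh) offsets to a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    dx, dy, dw, dh = deltas
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h
    # Shift the centre in proportion to the box size.
    new_cx = cx + dx * w
    new_cy = cy + dy * h
    # Scale width and height in log space so they stay positive.
    new_w = w * math.exp(dw)
    new_h = h * math.exp(dh)
    return (new_cx - 0.5 * new_w, new_cy - 0.5 * new_h,
            new_cx + 0.5 * new_w, new_cy + 0.5 * new_h)

# Hypothetical rough proposal, nudged right by 10% of its width.
rough = (10.0, 20.0, 110.0, 70.0)
refined = refine_box(rough, (0.1, 0.0, 0.0, 0.0))
print(refined)
```

With all-zero offsets the box is returned unchanged, so the regressor only has to learn small corrections on top of the proposal.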

The contribution of this paper is:

We propose a single-stage training algorithm that jointly learns to classify object proposals and refine their spatial locations

The author reviews earlier classic methods such as R-CNN and SPPnet. First, an overview:

The Region-based Convolutional Network (R-CNN) achieves excellent object detection accuracy by using a deep ConvNet to classify object proposals

R-CNN has some problems of its own:

1, Training is a multi-stage pipeline

2, Training is expensive in space and time

3, Object detection is slow

In short, R-CNN is slow. The reason is:

1, R-CNN is slow because it performs a ConvNet forward pass for each object proposal without sharing computation
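A toy count of network invocations shows the cost difference. This is only a sketch: `CountingBackbone` is a hypothetical stand-in that counts calls instead of running a real ConvNet, and 2000 proposals per image is an assumed figure used for illustration.

```python
class CountingBackbone:
    """Hypothetical stand-in for a ConvNet; it only counts forward passes."""

    def __init__(self):
        self.calls = 0

    def forward(self, pixels):
        self.calls += 1
        return pixels  # a real network would return a feature map


def per_proposal(backbone, image, proposals):
    # One forward pass per proposal, with no sharing between proposals.
    return [backbone.forward((image, box)) for box in proposals]


def shared(backbone, image, proposals):
    # One forward pass per image; each proposal then reads from the
    # same feature map.
    feature_map = backbone.forward(image)
    return [(feature_map, box) for box in proposals]


proposals = [(i, i, i + 10, i + 10) for i in range(2000)]
slow, fast = CountingBackbone(), CountingBackbone()
per_proposal(slow, "image", proposals)
shared(fast, "image", proposals)
print(slow.calls, fast.calls)  # 2000 1
```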

This leads to SPPnet, which can be seen as an improvement on R-CNN that focuses on sharing computation:

Spatial pyramid pooling networks (SPPnets) were proposed to speed up R-CNN by sharing computation

A brief description of SPPnet:

1, The SPPnet method computes a convolutional feature map for the entire input image and then classifies each object proposal using a feature vector extracted from the shared feature map

2, Features are extracted for a proposal by max-pooling the portion of the feature map inside the proposal into a fixed-size output

3, Multiple output sizes are pooled and then concatenated as in spatial pyramid pooling
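The three steps above can be sketched in plain Python on a tiny single-channel feature map. The function names, the bin-boundary rule (floor for the start, ceiling for the end), and the pyramid levels `(1, 2)` are my own assumptions for illustration, not necessarily what SPPnet uses.

```python
def ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return -((-a) // b)


def pool_region(fmap, box, n):
    """Max-pool the part of a 2-D feature map inside box into an n x n grid."""
    r0, c0, r1, c1 = box  # rows/cols on the feature map, end-exclusive
    h, w = r1 - r0, c1 - c0
    out = []
    for i in range(n):
        rs, re = r0 + (i * h) // n, r0 + ceil_div((i + 1) * h, n)
        for j in range(n):
            cs, ce = c0 + (j * w) // n, c0 + ceil_div((j + 1) * w, n)
            out.append(max(fmap[r][c]
                           for r in range(rs, re)
                           for c in range(cs, ce)))
    return out


def spp_features(fmap, box, levels=(1, 2)):
    """Concatenate pooled grids of several sizes into one fixed-length vector."""
    vec = []
    for n in levels:
        vec.extend(pool_region(fmap, box, n))
    return vec


fmap = [[0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11],
        [12, 13, 14, 15]]
print(spp_features(fmap, (0, 0, 4, 4)))  # [15, 5, 7, 13, 15]
print(spp_features(fmap, (1, 1, 4, 3)))  # [14, 9, 10, 13, 14]
```

Both proposals produce a vector of length 1 + 4 = 5 even though their sizes differ, which is what lets a fixed-size classifier sit on top of the shared feature map.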

I have not read the SPPnet paper before, so I am not entirely clear on its purpose. Take this as a passing overview.

SPPnet also has some drawbacks. Here is the author's account:

1, Like R-CNN, training is a multi-stage pipeline that involves extracting features, fine-tuning a network with log loss, training SVMs, and finally fitting bounding-box regressors.

2, But unlike R-CNN, the fine-tuning algorithm cannot update the convolutional layers that precede the spatial pyramid pooling.

3, Features are also written to disk

As a result:

Unsurprisingly, this limitation (fixed convolutional layers) limits the accuracy of very deep networks

Contributions

The author says the proposed method overcomes these difficulties:

1, Higher detection quality than R-CNN, SPPnet

2, Training is single-stage, using a multi-task loss

3, Training can update all network layers

4, No disk storage is required for feature caching
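Point 2 mentions a multi-task loss. As a rough sketch for a single proposal, the version below adds a classification log loss to a smooth L1 box-regression loss, following the form given in the Fast R-CNN paper; this post has not covered that section yet, so treat the details as a preview. The function names and the `lam` weight argument are my own.

```python
import math


def smooth_l1(x):
    """Quadratic near zero, linear elsewhere."""
    return 0.5 * x * x if abs(x) < 1.0 else abs(x) - 0.5


def multi_task_loss(class_probs, true_class, pred_deltas, target_deltas, lam=1.0):
    """Classification log loss plus box-regression loss for one proposal.

    Class 0 is treated as background, which has no box target, so only
    the classification term is used for it.
    """
    cls_loss = -math.log(class_probs[true_class])
    if true_class == 0:
        return cls_loss
    loc_loss = sum(smooth_l1(p - t) for p, t in zip(pred_deltas, target_deltas))
    return cls_loss + lam * loc_loss


# Hypothetical two-class example (background, cat).
loss = multi_task_loss([0.2, 0.8], 1, (0.5, 0.0, 0.0, 2.0), (0.0, 0.0, 0.0, 0.0))
print(round(loss, 4))
```

Because both terms are summed into one scalar, a single backward pass can update the classifier and the box regressor together, which is what makes single-stage training possible.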

To be continued next time!
