[Neural Networks] Object Detection: RCNN

Latest recommended article published 2022-10-21 14:54:49

liuyanlin0102


Category: Images

Article link: https://blog.csdn.net/liuyanlin0102/article/details/80507470


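The RCNN object detection method obtains region proposals with Selective Search, extracts features with a pre-trained AlexNet, and then scores them with one SVM per class. Non-maximum suppression reduces redundant proposals. Training consists of pre-training and fine-tuning, using mini-batches of 128 samples with a positive-to-negative ratio of 1:3. Finally, bounding box regression improves localization accuracy.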

Summary automatically generated by CSDN.

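Test phase:

Use Selective Search to find roughly 2000 region proposals, add 16 pixels of context around each one, and warp the result to the CNN input size (227*227*3), which is the AlexNet input. Doing it this way makes it possible to reuse the AlexNet model directly.
Extract features with the CNN (2000*4096). The network is AlexNet: the input is 227*227*3, layer 5 outputs 6*6*256, layer 6 outputs 4096, and layer 7 outputs 4096. The model uses the layer 7 output as the feature vector.
For AlexNet, see:
https://blog.csdn.net/zyqdragon/article/details/72353420
Train a separate SVM for each class and use these SVMs to score the features (2000*20). Greedy non-maximum suppression is then used to select among the region proposals.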
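The shapes in the steps above can be checked with a small sketch. It assumes the per-class SVMs are linear, so scoring all proposals is a single matrix product. The random features and the names `svm_weights`, `svm_bias`, and `score_proposals` are placeholders for illustration, not taken from the RCNN code.

```python
import numpy as np

rng = np.random.default_rng(0)
num_proposals, feat_dim, num_classes = 2000, 4096, 20

# Stand-in for the layer 7 features of the warped proposals (random here).
features = rng.standard_normal((num_proposals, feat_dim)).astype(np.float32)

# Hypothetical linear SVM parameters: one weight column and one bias per class.
svm_weights = rng.standard_normal((feat_dim, num_classes)).astype(np.float32)
svm_bias = np.zeros(num_classes, dtype=np.float32)


def score_proposals(features, weights, bias):
    """Return an (N, C) score matrix for N proposals and C classes."""
    return features @ weights + bias


scores = score_proposals(features, svm_weights, svm_bias)
print(scores.shape)  # (2000, 20)
```

Each column of `scores` holds one class's scores for all proposals, which is the input to the per-class suppression step.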

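Greedy non-maximum suppression:
For each class, start from the highest-scoring region proposal and remove every other proposal whose IoU (intersection over union) with it exceeds a threshold. Then repeat the selection on the remaining proposals until every region proposal with a score above a given threshold has been visited.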
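A minimal sketch of this procedure for a single class follows. It assumes boxes in `(x1, y1, x2, y2)` format. The defaults `iou_thresh=0.3` and `score_thresh=0.0` are arbitrary illustrative values, since the text does not state the thresholds.

```python
import numpy as np


def iou_one_to_many(box, boxes):
    """IoU between one box and an (N, 4) array of boxes, all as (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def greedy_nms(boxes, scores, iou_thresh=0.3, score_thresh=0.0):
    """Return the indices of the boxes kept for one class, highest score first."""
    order = np.argsort(-scores)                  # descending by score
    order = order[scores[order] > score_thresh]  # drop low-scoring proposals
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        overlaps = iou_one_to_many(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_thresh]     # remove boxes that overlap too much
    return keep


boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
kept = greedy_nms(boxes, scores)
print(kept)  # [0, 2]
```

In the example, the second box has an IoU of 81/119 with the first and is suppressed, while the third box does not overlap and is kept.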

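Training procedure:

pre-training: train the AlexNet model on the ILSVRC 2012 dataset, which has no bounding box labels, with a learning rate of 0.01.
fine-tuning: train on warped region proposals; here bounding box annotations are available.
positive: all region proposals whose IoU with a ground-truth box is at least 0.5
negative: the rest
The learning rate is set to 0.001, smaller than in pre-training, because the goal is to fine-tune without changing the pre-trained weights much.
Each iteration uses 128 samples with a positive-to-negative ratio of 1:3, i.e. 32 positives and 96 negatives. We bias the sampling towards positive windows because they are extremely rare compared to background.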
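The mini-batch composition can be sketched as below. This is a simplification that draws from a single pool of proposals, assuming the maximum IoU of each proposal with any ground-truth box has already been computed. The function name `sample_minibatch` and the synthetic `max_iou` values are illustrative only.

```python
import numpy as np


def sample_minibatch(max_iou, rng, batch_size=128, num_pos=32, pos_thresh=0.5):
    """Pick proposal indices for one fine-tuning mini-batch.

    max_iou[i] is the highest IoU between proposal i and any ground-truth box.
    """
    pos_idx = np.flatnonzero(max_iou >= pos_thresh)  # positives: IoU of at least 0.5
    neg_idx = np.flatnonzero(max_iou < pos_thresh)   # negatives: everything else
    num_neg = batch_size - num_pos
    # Sample with replacement only when a pool is smaller than its quota.
    pos = rng.choice(pos_idx, size=num_pos, replace=pos_idx.size < num_pos)
    neg = rng.choice(neg_idx, size=num_neg, replace=neg_idx.size < num_neg)
    return pos, neg


rng = np.random.default_rng(0)
max_iou = rng.uniform(0.0, 1.0, size=2000) ** 4  # skewed so that positives are rare
pos, neg = sample_minibatch(max_iou, rng)
print(len(pos), len(neg))  # 32 96
```

Fixing the number of positives per batch is what biases the sampling towards positive windows: without the quota, a uniformly drawn batch would consist almost entirely of background.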
When training the SVMs:
positive: only the ground-truth boxes
negative: proposals with less than 0.3 IoU with all instances of a class (all ground-truth boxes of that class)
Proposals that fall into the grey zone (more than 0.3 IoU overlap, but are not ground truth) are ignored.
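The three-way labelling for one class can be written as a short sketch. The encoding (1 for positive, 0 for negative, -1 for ignored) and the function name `svm_labels` are choices made here for illustration.

```python
import numpy as np


def svm_labels(is_ground_truth, max_iou_with_class, neg_thresh=0.3):
    """Label boxes for one class: 1 = positive, 0 = negative, -1 = ignored."""
    labels = np.full(is_ground_truth.shape, -1, dtype=int)  # grey zone by default
    labels[max_iou_with_class < neg_thresh] = 0             # clear background
    labels[is_ground_truth] = 1                             # only ground-truth boxes
    return labels


is_gt = np.array([True, False, False, False])
max_iou = np.array([1.0, 0.1, 0.45, 0.8])
labels = svm_labels(is_gt, max_iou)
print(labels.tolist())  # [1, 0, -1, -1]
```

Note that the last box overlaps a ground-truth box heavily (IoU 0.8) but is still ignored, because only the ground-truth boxes themselves count as positives for the SVMs.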

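The paper compares using the outputs of layers 5, 6, and 7 as features. Without fine-tuning, the resulting models reach similar mAP (mean average precision), which suggests that most of the CNN's representational power comes from the convolutional layers, even though they hold few of the parameters. With fine-tuning, fc7 and fc6 perform much better than pool5, which suggests that the fully connected layers learn features specific to the samples of the target task.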

bounding box regression
To improve localization performance, a class-specific bounding-box regression is needed.
After each proposal has been scored, the features from the CNN are used to adjust the position and size of its bounding box.
we only learn from a proposal P if it is nearby at least one ground-truth box. We implement “nearness” by assigning P to the ground-truth box G with which it has maximum IoU overlap (in case it overlaps more than one) if and only if the overlap is greater than a threshold (which we set to 0.6 using a validation set).
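For proposals that are too far from every ground-truth box, regression is pointless, so only proposals whose IoU with some ground-truth box is greater than 0.6 are used for training.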
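The "nearness" rule can be sketched as follows: each proposal is matched to the ground-truth box with which it has maximum IoU, and the pair is kept only if that IoU exceeds 0.6. Boxes are assumed to be in `(x1, y1, x2, y2)` format, and the function names are illustrative.

```python
import numpy as np


def pairwise_iou(proposals, gt_boxes):
    """IoU matrix of shape (num_proposals, num_gt); boxes are (x1, y1, x2, y2)."""
    x1 = np.maximum(proposals[:, None, 0], gt_boxes[None, :, 0])
    y1 = np.maximum(proposals[:, None, 1], gt_boxes[None, :, 1])
    x2 = np.minimum(proposals[:, None, 2], gt_boxes[None, :, 2])
    y2 = np.minimum(proposals[:, None, 3], gt_boxes[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_p = (proposals[:, 2] - proposals[:, 0]) * (proposals[:, 3] - proposals[:, 1])
    area_g = (gt_boxes[:, 2] - gt_boxes[:, 0]) * (gt_boxes[:, 3] - gt_boxes[:, 1])
    return inter / (area_p[:, None] + area_g[None, :] - inter)


def match_for_regression(proposals, gt_boxes, thresh=0.6):
    """Return (kept proposal indices, index of the matched ground-truth box for each)."""
    iou = pairwise_iou(proposals, gt_boxes)
    best_gt = iou.argmax(axis=1)   # ground-truth box with maximum overlap
    best_iou = iou.max(axis=1)
    keep = np.flatnonzero(best_iou > thresh)
    return keep, best_gt[keep]


gt_boxes = np.array([[0, 0, 10, 10], [20, 20, 40, 40]], dtype=float)
proposals = np.array([
    [1, 1, 10, 10],    # IoU 0.81 with the first ground-truth box: kept
    [0, 0, 5, 5],      # IoU 0.25 with the first ground-truth box: dropped
    [22, 20, 40, 40],  # IoU 0.90 with the second ground-truth box: kept
], dtype=float)
keep, matched = match_for_regression(proposals, gt_boxes)
print(keep.tolist(), matched.tolist())  # [0, 2] [0, 1]
```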
Input: the features produced by the CNN
Output:

$d_x(P),d_y(P),d_w(P),d_h(P)$

These outputs give the position of the adjusted proposal:

$\hat{G}_x = P_w \dots$
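The formula above is cut off. As a sketch, assuming the parameterization given in the RCNN paper (a scale-invariant shift of the box center and a log-space scaling of its width and height), the outputs could be applied to a proposal as follows. Proposals are assumed to be in `(center_x, center_y, width, height)` format, and the name `apply_box_deltas` is illustrative.

```python
import numpy as np


def apply_box_deltas(P, d):
    """Apply regression outputs to proposals.

    P: (N, 4) proposals as (center_x, center_y, width, height).
    d: (N, 4) regression outputs (d_x, d_y, d_w, d_h).
    Assumed parameterization (from the RCNN paper, not from the text above):
        G_x = P_w * d_x + P_x      G_y = P_h * d_y + P_y
        G_w = P_w * exp(d_w)       G_h = P_h * exp(d_h)
    """
    px, py, pw, ph = P.T
    dx, dy, dw, dh = d.T
    gx = pw * dx + px
    gy = ph * dy + py
    gw = pw * np.exp(dw)
    gh = ph * np.exp(dh)
    return np.stack([gx, gy, gw, gh], axis=1)


P = np.array([[50.0, 40.0, 20.0, 10.0]])
d = np.array([[0.1, -0.2, 0.0, np.log(2.0)]])
G_hat = apply_box_deltas(P, d)
# Center moves by (20 * 0.1, 10 * -0.2) to (52, 38); width stays 20; height doubles to 20.
```

Under this assumption, an all-zero output leaves the proposal unchanged.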
