Object Detection: R-CNN Notes (1)

Rich feature hierarchies for accurate object detection and semantic segmentation

Summary: R-CNN extracts a fixed-length feature vector from each region proposal using a CNN, and then classifies each region with category-specific linear SVMs. The CNN is trained with supervised pre-training (on ImageNet classification) followed by domain-specific fine-tuning.

Object detection with R-CNN

1. The first module generates category-independent region proposals.
2. The second module is a large convolutional neural network that extracts a fixed-length feature vector from each region.
3. The third module is a set of class-specific linear SVMs.
Region proposals: selective search is used to extract around 2000 region proposals per image.
Feature extraction: a 4096-dimensional feature vector is extracted from each region proposal. Features are computed by forward propagating a mean-subtracted 227 × 227 RGB image through five convolutional layers and two fully connected layers.
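
To make the shapes concrete, here is a minimal sketch of how the three modules fit together once features exist. It does not run a CNN: the feature matrix and SVM weights are random stand-ins, and the names `score_proposals`, `features`, `svm_weights` and `svm_bias` are made up for illustration.

```python
import numpy as np


def score_proposals(features, svm_weights, svm_bias):
    """Score every proposal against every class with linear SVMs.

    features:    (num_proposals, feat_dim) CNN feature vectors
    svm_weights: (feat_dim, num_classes), one weight column per class
    svm_bias:    (num_classes,)
    Returns a (num_proposals, num_classes) score matrix.
    """
    return features @ svm_weights + svm_bias


rng = np.random.default_rng(0)
num_proposals = 2000  # roughly what selective search yields per image
feat_dim = 4096       # length of the CNN feature vector
num_classes = 20      # PASCAL VOC object classes

# Random placeholders standing in for real CNN features and trained SVMs.
features = rng.standard_normal((num_proposals, feat_dim)).astype(np.float32)
svm_weights = rng.standard_normal((feat_dim, num_classes)).astype(np.float32)
svm_bias = np.zeros(num_classes, dtype=np.float32)

scores = score_proposals(features, svm_weights, svm_bias)
print(scores.shape)  # (2000, 20)
```

Because the SVMs are linear, scoring all proposals for all classes reduces to a single matrix product, which is why the per-class cost is small compared with the CNN forward passes.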
Test-time detection
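1. Prior to warping, we dilate the tight bounding box so that at the warped size there are exactly p pixels of warped image context around the original box (we use p = 16). [In other words, each proposal is enlarged to include a border of surrounding image context that is 16 pixels wide after warping, and the enlarged box is then warped directly to 227×227; where the enlarged box extends beyond the image, the missing pixels are filled with the image mean.]
2. Greedy non-maximum suppression, applied to each class independently, rejects a region if it has an intersection-over-union (IoU) overlap with a higher scoring selected region larger than a learned threshold. [For each column (that is, each class) of the 2000×20 score matrix (2000 proposals × 20 classes), non-maximum suppression removes overlapping proposals and keeps only the highest-scoring proposals for that class.]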
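
The dilation in step 1 can be worked out with a little arithmetic. The sketch below is one reading of that sentence, not code from the paper: if the original box must occupy the inner 227 − 2p pixels after warping, then the padding in source pixels is p · w / (227 − 2p) horizontally and p · h / (227 − 2p) vertically. The function name `dilate_box` and the (x1, y1, x2, y2) box convention are assumptions.

```python
def dilate_box(box, p=16, out_size=227):
    """Enlarge a tight box so it gets p pixels of context after warping.

    box is (x1, y1, x2, y2) in source-image pixels (assumed convention).
    After the dilated box is warped to out_size x out_size, the original
    box covers the inner (out_size - 2p) pixels in each dimension.
    """
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    inner = out_size - 2 * p   # 195 for p = 16 and out_size = 227
    pad_x = p * w / inner      # horizontal context in source pixels
    pad_y = p * h / inner      # vertical context in source pixels
    return (x1 - pad_x, y1 - pad_y, x2 + pad_x, y2 + pad_y)


# A 195 x 195 box is warped at scale 1, so it gains exactly 16 px per side.
print(dilate_box((100, 100, 295, 295)))  # (84.0, 84.0, 311.0, 311.0)
```

The returned box can extend past the image border (negative coordinates, for instance); that is the case where the missing pixels have to be filled before warping.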
Training
Domain-specific fine-tuning: the CNN's ImageNet-specific 1000-way classification layer is replaced with a randomly initialized 21-way classification layer (for the 20 VOC classes plus background); the rest of the CNN architecture is unchanged. We treat all region proposals with ≥ 0.5 IoU overlap with a ground-truth box as positives for that box's class and the rest as negatives. We start SGD at a learning rate of 0.001 (1/10th of the initial pre-training rate), which allows fine-tuning to make progress while not clobbering the initialization. In each SGD iteration, we uniformly sample 32 positive windows (over all classes) and 96 background windows to construct a mini-batch of size 128.
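
The labeling rule and the 32/96 sampling above can be sketched as follows. This is an illustration under assumptions, not the authors' training code: the helper names `label_proposals` and `sample_minibatch` are hypothetical, the IoU matrix is assumed to be precomputed, class ids are assumed to run from 1 to 20 with 0 reserved for background, and both pools are assumed to be non-empty.

```python
import numpy as np


def label_proposals(ious, gt_classes, pos_thresh=0.5):
    """Assign each proposal a class label, with 0 meaning background.

    ious:       (num_proposals, num_gt) IoU of each proposal with each GT box
    gt_classes: (num_gt,) integer class ids in 1..20
    """
    best_gt = ious.argmax(axis=1)   # best-matching ground-truth box
    best_iou = ious.max(axis=1)
    return np.where(best_iou >= pos_thresh, gt_classes[best_gt], 0)


def sample_minibatch(labels, rng, num_pos=32, num_bg=96):
    """Return indices of 32 positive and 96 background windows (128 total)."""
    pos_idx = np.flatnonzero(labels > 0)
    bg_idx = np.flatnonzero(labels == 0)
    # Sample with replacement only when a pool is too small (a choice made
    # for this sketch; the notes do not say how that case is handled).
    pos = rng.choice(pos_idx, size=num_pos, replace=len(pos_idx) < num_pos)
    bg = rng.choice(bg_idx, size=num_bg, replace=len(bg_idx) < num_bg)
    return np.concatenate([pos, bg])


rng = np.random.default_rng(0)
ious = rng.random((500, 3)) ** 3   # fake IoUs, skewed toward low overlap
gt_classes = np.array([3, 7, 15])  # fake ground-truth class ids
labels = label_proposals(ious, gt_classes)
batch = sample_minibatch(labels, rng)
print(batch.shape)  # (128,)
```

Fixing the ratio at 32:96 biases each mini-batch toward positives, which are rare compared with background windows.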

Related concepts

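1. mAP (mean Average Precision): compute AP separately for each class, then take the mean over classes. AP is the area under the precision-recall curve. Precision = TP/(TP+FP); recall = TP/(TP+FN).
2. IoU = area(A∩B) / area(A∪B), where A and B are two boxes.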
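
Both definitions are short enough to write out. The sketch below is illustrative: boxes are assumed to be (x1, y1, x2, y2) tuples, and AP is computed as the area under the precision-recall curve after making precision monotonically non-increasing, which is one common convention (PASCAL VOC 2007 used an 11-point interpolation instead). The names `iou` and `average_precision` are made up here.

```python
import numpy as np


def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    ih = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = iw * ih
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


def average_precision(scores, is_tp, num_gt):
    """AP for one class.

    scores: detection confidences
    is_tp:  1 if the detection matched a ground-truth box, else 0
    num_gt: number of ground-truth boxes of this class (TP + FN)
    """
    order = np.argsort(-np.asarray(scores, dtype=float))
    tp = np.asarray(is_tp, dtype=float)[order]
    cum_tp = np.cumsum(tp)
    cum_fp = np.cumsum(1.0 - tp)
    recall = cum_tp / num_gt                 # TP / (TP + FN)
    precision = cum_tp / (cum_tp + cum_fp)   # TP / (TP + FP)
    # Pad the curve, then make precision non-increasing from right to left.
    mrec = np.concatenate([[0.0], recall, [1.0]])
    mpre = np.concatenate([[0.0], precision, [0.0]])
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    # Sum rectangle areas wherever recall changes.
    idx = np.flatnonzero(mrec[1:] != mrec[:-1])
    return float(np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1]))


print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1/7, about 0.143
print(average_precision([0.9, 0.8, 0.7], [1, 0, 1], num_gt=2))  # 5/6, about 0.833
```

mAP is then the plain mean of `average_precision` over the 20 classes.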
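Once the test procedure has scored every proposal, the result is a 2000×20 matrix holding each proposal's score for each object class. At this point the same object (for example, a single car) is typically covered by several proposals, so non-maximum suppression is applied to discard the lower-scoring candidates and reduce the number of overlapping boxes.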
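
A minimal sketch of greedy non-maximum suppression run column by column over such a score matrix is shown below. The function names `nms` and `nms_per_class` and the (x1, y1, x2, y2) box layout are assumptions, and the default threshold of 0.3 is only a placeholder: the paper describes the threshold as learned.

```python
import numpy as np


def nms(boxes, scores, iou_thresh=0.3):
    """Greedy NMS for one class.

    boxes:  (N, 4) array of (x1, y1, x2, y2)
    scores: (N,) scores for this class
    Returns the indices of kept boxes, highest score first.
    """
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = np.argsort(-scores)   # best-scoring box first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # IoU between the selected box and every remaining box.
        iw = np.maximum(0.0, np.minimum(x2[i], x2[rest]) - np.maximum(x1[i], x1[rest]))
        ih = np.maximum(0.0, np.minimum(y2[i], y2[rest]) - np.maximum(y1[i], y1[rest]))
        inter = iw * ih
        overlap = inter / (areas[i] + areas[rest] - inter)
        # Drop boxes that overlap the selected one by more than the threshold.
        order = rest[overlap <= iou_thresh]
    return keep


def nms_per_class(boxes, score_matrix, iou_thresh=0.3):
    """Run NMS independently on each column (class) of the score matrix."""
    return {c: nms(boxes, score_matrix[:, c], iou_thresh)
            for c in range(score_matrix.shape[1])}


boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
score_matrix = np.array([[0.9, 0.1], [0.8, 0.95], [0.7, 0.2]])
print(nms_per_class(boxes, score_matrix))  # {0: [0, 2], 1: [1, 2]}
```

In the example, the first two boxes overlap heavily (IoU of about 0.68), so each class keeps only the higher-scoring of the two plus the distant third box.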

Known problems

1. Training takes a long time (84 hours).
2. Test-time detection is slow.
3. Training is a complex multi-stage pipeline.

R-CNN overview

[Figure: R-CNN overview]
