caffe样例_R-CNN detection基于Ubuntu_Caffe

本文链接：https://blog.csdn.net/dawin_2008/article/details/52402115

R-CNN detection

1.背景

R-CNN是通过Caffe微调模型拟分类区域的检测器，详细内容可以参考项目网址和论文：

Rich feature hierarchies for accurate object detection and semantic segmentation. Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik. CVPR 2014. Arxiv 2013.

2.实现

2.1.Selective Search

Selective Search是R-CNN的区域拟提。selective_search_ijcv_with_pythonPython模块负责通过selective search MATLAB实现来提取拟提。先下载模块放置入caffe的根目录中（否则找不到模块中的函数）并重命名为selective_search_ijcv_with_python
$ cd ~/selectiveselective_search_ijcv_with_python
$ matlab
在MATLAB中运行demo.m编译必要的函数，并添加PYTHONPATH
$ export PYTHONPATH=/path/to/caffe/selectiveselective_search_ijcv_with_python

2.2.下载Caffe R-CNN ImageNet model

$ ./scripts/download_model_binary.py models/bvlc_reference_rcnn_ilsvrc13

2.3.产生h5文件(包含DataFrame , selected windows, and their detection scores)

$ mkdir -p _temp #建立临时文件
$ echo /path/to/caffe/images/fish-bike.jpg > _temp/det_input.txt #图像路径
$ ./python/detect.py --crop_mode=selective_search --pretrained_model=./models/bvlc_reference_rcnn_ilsvrc13/bvlc_reference_rcnn_ilsvrc13.caffemodel --model_def=./models/bvlc_reference_rcnn_ilsvrc13/deploy.prototxt --gpu --raw_scale=255 _temp/det_input.txt _temp/det_output.h5

2.4.数据处理

$ jupyter notebook

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

df = pd.read_hdf('_temp/det_output.h5', 'df')
print(df.shape)
print(df.iloc[0])

输出：

(1570, 5)
prediction    [-2.62247, -2.84579, -2.85122, -3.20838, -1.94...
ymin                                                     79.846
xmin                                                       9.62
ymax                                                     246.31
xmax                                                    339.624
Name: /home/dawin/caffe/examples/images/fish-bike.jpg, dtype: object

注释：1570个区域由selective search的R-CNN配置产生，区域的数量是基于每张图片的内容和大小变化的，因为selective search并不是尺度不变的。

接着，加载ILSVRC13检测类名录和建立一个预测数据表，要使用ILSVRC12辅佐数据，下载ILSVRC12辅佐数据:
$ sudo ./data/ilsvrc12/get_ilsvrc12_aux.sh

with open('./data/ilsvrc12/det_synset_words.txt') as f:
    labels_df = pd.DataFrame([
        {
            'synset_id': l.strip().split(' ')[0],
            'name': ' '.join(l.strip().split(' ')[1:]).split(',')[0]
        }
        for l in f.readlines()
    ])
labels_df.sort_values('synset_id')
predictions_df = pd.DataFrame(np.vstack(df.prediction.values), columns=labels_df['name'])
print(predictions_df.iloc[0])

输出：

name
accordion         -2.622472
airplane          -2.845789
ant               -2.851220
antelope          -3.208377
apple             -1.949951
armadillo         -2.472936
artichoke         -2.201685
axe               -2.327404
baby bed          -2.737925
backpack          -2.176763
bagel             -2.681062
balance beam      -2.722539
banana            -2.390628
band aid          -1.598909
banjo             -2.298197
baseball          -2.311024
basketball        -2.733875
bathing cap       -2.800533
beaker            -2.405005
bear              -3.286877
bee               -2.990223
bell pepper       -2.543806
bench             -1.843543
bicycle           -2.775216
binder            -2.523010
bird              -3.543336
bookshelf         -2.767997
bow tie           -2.532697
bow               -2.003349
bowl              -2.873867
                     ...   
strawberry        -2.559138
stretcher         -2.048682
sunglasses        -2.313832
swimming trunks   -2.062061
swine             -2.795033
syringe           -2.287062
table             -2.234039
tape player       -3.081044
tennis ball       -2.099031
tick              -2.808466
tie               -2.337841
tiger             -2.615820
toaster           -2.382439
traffic light     -2.645303
train             -3.441731
trombone          -2.582362
trumpet           -2.352854
turtle            -2.360860
tv or monitor     -2.761044
unicycle          -2.218469
vacuum            -1.907717
violin            -2.757080
volleyball        -2.723690
waffle iron       -2.418540
washer            -2.408994
water bottle      -2.174899
watercraft        -2.837426
whale             -3.120339
wine bottle       -2.772961
zebra             -2.742914
Name: 0, dtype: float32

查看激活：

plt.gray()
plt.matshow(predictions_df.values)
plt.xlabel('Classes')
plt.ylabel('Windows')

输出:

<matplotlib.text.Text at 0x114f15f90>
<matplotlib.figure.Figure at 0x114254b50>

灰度图
在所有窗口取最大值和绘制顶级类：

max_s = predictions_df.max(0)
max_s.sort(ascending=False)
print(max_s[:10])

输出：

name
person          1.835771
bicycle         0.866110
unicycle        0.057080
motorcycle     -0.006123
banjo          -0.028209
turtle         -0.189833
electric fan   -0.206787
cart           -0.214236
lizard         -0.393519
helmet         -0.477942
dtype: float32

注释：实际上顶部的检测是人和自行车。选取好定位是一项进行中的工作，我们选取顶层得分的人和自行车检测。

# Find, print, and display the top detections: person and bicycle.
i = predictions_df['person'].argmax()
j = predictions_df['bicycle'].argmax()

# Show top predictions for top detection.
f = pd.Series(df['prediction'].iloc[i], index=labels_df['name'])
print('Top detection:')
print(f.order(ascending=False)[:5])
print('')

# Show top predictions for second-best detection.
f = pd.Series(df['prediction'].iloc[j], index=labels_df['name'])
print('Second-best detection:')
print(f.order(ascending=False)[:5])

# Show top detection in red, second-best top detection in blue.
im = plt.imread('examples/images/fish-bike.jpg')
plt.imshow(im)
currentAxis = plt.gca()

det = df.iloc[i]
coords = (det['xmin'], det['ymin']), det['xmax'] - det['xmin'], det['ymax'] - det['ymin']
currentAxis.add_patch(plt.Rectangle(*coords, fill=False, edgecolor='r', linewidth=5))

det = df.iloc[j]
coords = (det['xmin'], det['ymin']), det['xmax'] - det['xmin'], det['ymax'] - det['ymin']
currentAxis.add_patch(plt.Rectangle(*coords, fill=False, edgecolor='b', linewidth=5))

输出：

Top detection:
name
person             1.835771
swimming trunks   -1.150371
rubber eraser     -1.231106
turtle            -1.266038
plastic bag       -1.303266
dtype: float32

Second-best detection:
name
bicycle     0.866110
unicycle   -0.359140
scorpion   -0.811621
lobster    -0.982891
lamp       -1.096809
dtype: float32

<matplotlib.patches.Rectangle at 0x7f02e8099d10>

效果图

让我们检测所有的 ‘自行车’ 和用NMS使得他们摆脱重叠窗口：

def nms_detections(dets, overlap=0.3):
    """
    Non-maximum suppression: Greedily select high-scoring detections and
    skip detections that are significantly covered by a previously
    selected detection.

    This version is translated from Matlab code by Tomasz Malisiewicz,
    who sped up Pedro Felzenszwalb's code.

    Parameters
    ----------
    dets: ndarray
        each row is ['xmin', 'ymin', 'xmax', 'ymax', 'score']
    overlap: float
        minimum overlap ratio (0.3 default)

    Output
    ------
    dets: ndarray
        remaining after suppression.
    """
    x1 = dets[:, 0]
    y1 = dets[:, 1]
    x2 = dets[:, 2]
    y2 = dets[:, 3]
    ind = np.argsort(dets[:, 4])

    w = x2 - x1
    h = y2 - y1
    area = (w * h).astype(float)

    pick = []
    while len(ind) > 0:
        i = ind[-1]
        pick.append(i)
        ind = ind[:-1]

        xx1 = np.maximum(x1[i], x1[ind])
        yy1 = np.maximum(y1[i], y1[ind])
        xx2 = np.minimum(x2[i], x2[ind])
        yy2 = np.minimum(y2[i], y2[ind])

        w = np.maximum(0., xx2 - xx1)
        h = np.maximum(0., yy2 - yy1)

        wh = w * h
        o = wh / (area[i] + area[ind] - wh)

        ind = ind[np.nonzero(o <= overlap)[0]]

    return dets[pick, :]

scores = predictions_df['bicycle']
windows = df[['xmin', 'ymin', 'xmax', 'ymax']].values
dets = np.hstack((windows, scores[:, np.newaxis]))
nms_dets = nms_detections(dets)

显示顶层 3 个图像中NMS检测到的 ‘自行车’ ，并注意顶部的得分框（红色）和其余的框之间的差距。

plt.imshow(im)
currentAxis = plt.gca()
colors = ['r', 'b', 'y']
for c, det in zip(colors, nms_dets[:3]):
    currentAxis.add_patch(
        plt.Rectangle((det[0], det[1]), det[2]-det[0], det[3]-det[1],
        fill=False, edgecolor=c, linewidth=5)
    )
print 'scores:', nms_dets[:3, 4]

输出：

scores: [ 0.86610961 -0.70051467 -1.34796405]

这里写图片描述