基于深度学习的人脸识别系统系列（Caffe+OpenCV+Dlib）——【三】使用Caffe的MemoryData层与VGG网络模型提取Mat的特征

最新推荐文章于 2024-06-21 16:08:59 发布

IT远征军

最新推荐文章于 2024-06-21 16:08:59 发布

阅读量3k

点赞数

分类专栏： CNN

CNN 专栏收录该内容

12 篇文章 0 订阅

订阅专栏

原文地址：http://m.blog.csdn.net/article/details?id=52456548

前言

基于深度学习的人脸识别系统，一共用到了5个开源库：OpenCV（计算机视觉库）、Caffe（深度学习库）、Dlib（机器学习库）、libfacedetection（人脸检测库）、cudnn（gpu加速库）。
用到了一个开源的深度学习模型：VGG model。
最终的效果是很赞的，识别一张人脸的速度是0.039秒，而且最重要的是：精度高啊！！！
CPU：intel i5-4590
GPU：GTX 980
系统：Win 10
OpenCV版本：3.1（这个无所谓）
Caffe版本：Microsoft caffe （微软编译的Caffe，安装方便，在这里安利一波）
Dlib版本：19.0（也无所谓
CUDA版本：7.5
cudnn版本：4
libfacedetection：6月份之后的（这个有所谓，6月后出了64位版本的）
这个系列纯C++构成，有问题的各位朋同学可以直接在博客下留言，我们互相交流学习。
====================================================================

本篇是该系列的第三篇博客，介绍如何使用VGG网络模型与Caffe的 MemoryData层去提取一个OpenCV矩阵类型Mat的特征。

思路

VGG网络模型是牛津大学视觉几何组提出的一种深度模型，在LFW数据库上取得了97%的准确率。VGG网络由5个卷积层，两层fc图像特征，一层fc分类特征组成，具体我们可以去读它的prototxt文件。这里是模型与配置文件的下载地址。
http://www.robots.ox.ac.uk/~vgg/software/vgg_face/
话题回到Caffe。在Caffe中提取图片的特征是很容易的，其提供了extract_feature.exe让我们来实现，提取格式为lmdb与leveldb。关于这个的做法，可以看我的这篇博客：
http://blog.csdn.net/mr_curry/article/details/52097529
显然，我们在程序中肯定是希望能够灵活利用的，使用这种方法不太可行。Caffe的Data层提供了type：MemoryData，我们可以使用它来进行Mat类型特征的提取。
注：你需要先按照本系列第一篇博客的方法去配置好Caffe的属性表。
http://blog.csdn.net/mr_curry/article/details/52443126

实现

首先我们打开VGG_FACE_deploy.prototxt，观察VGG的网络结构。
这里写图片描述
有意思的是，MemoryData层需要图像均值，但是官方网站上并没有给出mean文件。我们可以通过这种方式进行输入：

    mean_value:129.1863
    mean_value:104.7624
    mean_value:93.5940
 
 1
2
3

我们还需要修改它的data层：（你可以用下面这部分的代码去替换下载下来的prototxt文件的data层）

   layer {
  name: "data"
  type: "MemoryData"
  top: "data"
  top: "label"
  transform_param {
    mirror: false
    crop_size: 224
    mean_value:129.1863
    mean_value:104.7624
    mean_value:93.5940
  }
  memory_data_param {
    batch_size: 1
    channels:3
    height:224
    width:224
  }
}

 
 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20

为了不破坏原来的文件，把它另存为vgg_extract_feature_memorydata.prototxt。
这里写图片描述
好的，然后我们开始编写。添加好这个属性表：

然后，新建caffe_net_memorylayer.h、ExtractFeature_.h、ExtractFeature_.cpp开始编写。
caffe_net_memorylayer.h：

#include "caffe/layers/input_layer.hpp"  
#include "caffe/layers/inner_product_layer.hpp"  
#include "caffe/layers/dropout_layer.hpp"  
#include "caffe/layers/conv_layer.hpp"  
#include "caffe/layers/relu_layer.hpp"  
#include <iostream> 
#include "caffe/caffe.hpp"
#include <opencv.hpp>
#include <caffe/layers/memory_data_layer.hpp>
#include "caffe/layers/pooling_layer.hpp"  
#include "caffe/layers/lrn_layer.hpp"  
#include "caffe/layers/softmax_layer.hpp"  
// must predefined
caffe::MemoryDataLayer<float> *memory_layer;
caffe::Net<float>* net;
 
 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15

ExtractFeature_.h

#include <opencv.hpp>
using namespace cv;
using namespace std;

std::vector<float> ExtractFeature(Mat FaceROI);//给一个图片 返回一个vector<float>容器
void Caffe_Predefine();
 
 1
2
3
4
5
6

ExtractFeature_.cpp：

#include <ExtractFeature_.h>
#include <caffe_net_memorylayer.h>
namespace caffe
{
    extern INSTANTIATE_CLASS(InputLayer);
    extern INSTANTIATE_CLASS(InnerProductLayer);
    extern INSTANTIATE_CLASS(DropoutLayer);
    extern INSTANTIATE_CLASS(ConvolutionLayer);
    REGISTER_LAYER_CLASS(Convolution);
    extern INSTANTIATE_CLASS(ReLULayer);
    REGISTER_LAYER_CLASS(ReLU);
    extern INSTANTIATE_CLASS(PoolingLayer);
    REGISTER_LAYER_CLASS(Pooling);
    extern INSTANTIATE_CLASS(LRNLayer);
    REGISTER_LAYER_CLASS(LRN);
    extern INSTANTIATE_CLASS(SoftmaxLayer);
    REGISTER_LAYER_CLASS(Softmax);
    extern INSTANTIATE_CLASS(MemoryDataLayer);
}
template <typename Dtype>
caffe::Net<Dtype>* Net_Init_Load(std::string param_file, std::string pretrained_param_file, caffe::Phase phase)
{
    caffe::Net<Dtype>* net(new caffe::Net<Dtype>("vgg_extract_feature_memorydata.prototxt", caffe::TEST));
    net->CopyTrainedLayersFrom("VGG_FACE.caffemodel");
    return net;
}

void Caffe_Predefine()//when our code begining run must add it
{
    caffe::Caffe::set_mode(caffe::Caffe::GPU);
    net = Net_Init_Load<float>("vgg_extract_feature_memorydata.prototxt", "VGG_FACE.caffemodel", caffe::TEST);
    memory_layer = (caffe::MemoryDataLayer<float> *)net->layers()[0].get();
}

std::vector<float> ExtractFeature(Mat FaceROI)
{
    caffe::Caffe::set_mode(caffe::Caffe::GPU);
    std::vector<Mat> test;
    std::vector<int> testLabel;
    std::vector<float> test_vector;
    test.push_back(FaceROI);
    testLabel.push_back(0);
    memory_layer->AddMatVector(test, testLabel);// memory_layer and net , must be define be a global variable.
    test.clear(); testLabel.clear();
    std::vector<caffe::Blob<float>*> input_vec;
    net->Forward(input_vec);
    boost::shared_ptr<caffe::Blob<float>> fc8 = net->blob_by_name("fc8");
    int test_num = 0;
    while (test_num < 2622)
    {
        test_vector.push_back(fc8->data_at(0, test_num++, 1, 1));
    }
    return test_vector;
}
 
 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54

=============注意上面这个地方可以这么改：==============
（直接可以知道这个向量的首地址、尾地址，我们直接用其来定义vector）

        float* begin = nullptr;
        float* end = nullptr;
        begin = fc8->mutable_cpu_data();
        end = begin + fc8->channels();
        CHECK(begin != nullptr);
        CHECK(end != nullptr);
        std::vector<float> FaceVector{ begin,end };
        return std::move(FaceVector);
 
 1
2
3
4
5
6
7
8

请特别注意这个地方：

namespace caffe
{
    extern INSTANTIATE_CLASS(InputLayer);
    extern INSTANTIATE_CLASS(InnerProductLayer);
    extern INSTANTIATE_CLASS(DropoutLayer);
    extern INSTANTIATE_CLASS(ConvolutionLayer);
    REGISTER_LAYER_CLASS(Convolution);
    extern INSTANTIATE_CLASS(ReLULayer);
    REGISTER_LAYER_CLASS(ReLU);
    extern INSTANTIATE_CLASS(PoolingLayer);
    REGISTER_LAYER_CLASS(Pooling);
    extern INSTANTIATE_CLASS(LRNLayer);
    REGISTER_LAYER_CLASS(LRN);
    extern INSTANTIATE_CLASS(SoftmaxLayer);
    REGISTER_LAYER_CLASS(Softmax);
    extern INSTANTIATE_CLASS(MemoryDataLayer);
}
 
 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17

为什么要加这些？因为在提取过程中发现，如果不加，会导致有一些层没有注册的情况。我在Github的Microsoft/Caffe上帮一外国哥们解决了这个问题。我把问题展现一下：
这里写图片描述
如果我们加了上述代码，就相当于注册了这些层，自然就不会有这样的问题。
在提取过程中，我提取的是fc8层的特征，2622维。当然，最后一层都已经是分类特征了，最好还是提取fc7层的4096维特征。
在这个地方：

void Caffe_Predefine()//when our code begining run must add it
{
    caffe::Caffe::set_mode(caffe::Caffe::GPU);
    net = Net_Init_Load<float>("vgg_extract_feature_memorydata.prototxt", "VGG_FACE.caffemodel", caffe::TEST);
    memory_layer = (caffe::MemoryDataLayer<float> *)net->layers()[0].get();
}
 
 1
2
3
4
5
6

是一个初始化的函数，用于将VGG网络模型与提取特征的配置文件进行传入，所以很明显地，在提取特征之前，需要先：

Caffe_Predefine()；
 
 1

进行了这个之后，这些全局量我们就能一直用了。
我们可以试试提取特征的这个接口。新建一个main.cpp，调用之：

#include <ExtractFeature_.h>
int main()
{
    Caffe_Predefine();
    Mat lena = imread("lena.jpg");
    if (!lena.empty())
    {
        ExtractFeature(lena);
    }

}
 
 1
2
3
4
5
6
7
8
9
10
11

因为我们得到的是一个vector< float>类型，所以我们可以把它逐一输出出来看看。当然，在ExtractFeature()的函数中你就可以这么做了。我们还是在main()函数里这么做。
来看看：

#include <ExtractFeature_.h>
int main()
{
    Caffe_Predefine();
    Mat lena = imread("lena.jpg");
    if (!lena.empty())
    {
        int i = 0;
        vector<float> print=ExtractFeature(lena);
        while (i<print.size())
        {
            cout << print[i++] << endl;
        }
    }
    imshow("Extract feature",lena);
    waitKey(0);
}
 
 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17

那么对于这张图片，提取出的特征，就是很多的这些数字：
这里写图片描述
提取一张224*224图片特征的时间为：0.019s。我们可以看到，GPU加速的效果是非常明显的。而且我这块显卡也就是GTX980。不知道泰坦X的提取速度如何（泪）。

附：net结构（prototxt），注意layer和layers的区别：

name: "VGG_FACE_16_layer"
layer {
  name: "data"
  type: "MemoryData"
  top: "data"
  top: "label"
  transform_param {
    mirror: false
    crop_size: 224
    mean_value:129.1863
    mean_value:104.7624
    mean_value:93.5940
  }
  memory_data_param {
    batch_size: 1
    channels:3
    height:224
    width:224
  }
}
layer {
  bottom: "data"
  top: "conv1_1"
  name: "conv1_1"
  type: "Convolution"
  convolution_param {
    num_output: 64
    pad: 1
    kernel_size: 3
  }
}
layer {
  bottom: "conv1_1"
  top: "conv1_1"
  name: "relu1_1"
  type: "ReLU"
}
layer {
  bottom: "conv1_1"
  top: "conv1_2"
  name: "conv1_2"
  type: "Convolution"
  convolution_param {
    num_output: 64
    pad: 1
    kernel_size: 3
  }
}
layer {
  bottom: "conv1_2"
  top: "conv1_2"
  name: "relu1_2"
  type: "ReLU"
}
layer {
  bottom: "conv1_2"
  top: "pool1"
  name: "pool1"
  type: "Pooling"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  bottom: "pool1"
  top: "conv2_1"
  name: "conv2_1"
  type: "Convolution"
  convolution_param {
    num_output: 128
    pad: 1
    kernel_size: 3
  }
}
layer {
  bottom: "conv2_1"
  top: "conv2_1"
  name: "relu2_1"
  type: "ReLU"
}
layer {
  bottom: "conv2_1"
  top: "conv2_2"
  name: "conv2_2"
  type: "Convolution"
  convolution_param {
    num_output: 128
    pad: 1
    kernel_size: 3
  }
}
layer {
  bottom: "conv2_2"
  top: "conv2_2"
  name: "relu2_2"
  type: "ReLU"
}
layer {
  bottom: "conv2_2"
  top: "pool2"
  name: "pool2"
  type: "Pooling"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  bottom: "pool2"
  top: "conv3_1"
  name: "conv3_1"
  type: "Convolution"
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
  }
}
layer {
  bottom: "conv3_1"
  top: "conv3_1"
  name: "relu3_1"
  type: "ReLU"
}
layer {
  bottom: "conv3_1"
  top: "conv3_2"
  name: "conv3_2"
  type: "Convolution"
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
  }
}
layer {
  bottom: "conv3_2"
  top: "conv3_2"
  name: "relu3_2"
  type: "ReLU"
}
layer {
  bottom: "conv3_2"
  top: "conv3_3"
  name: "conv3_3"
  type: "Convolution"
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
  }
}
layer {
  bottom: "conv3_3"
  top: "conv3_3"
  name: "relu3_3"
  type: "ReLU"
}
layer {
  bottom: "conv3_3"
  top: "pool3"
  name: "pool3"
  type: "Pooling"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  bottom: "pool3"
  top: "conv4_1"
  name: "conv4_1"
  type: "Convolution"
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
  }
}
layer {
  bottom: "conv4_1"
  top: "conv4_1"
  name: "relu4_1"
  type: "ReLU"
}
layer {
  bottom: "conv4_1"
  top: "conv4_2"
  name: "conv4_2"
  type: "Convolution"
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
  }
}
layer {
  bottom: "conv4_2"
  top: "conv4_2"
  name: "relu4_2"
  type: "ReLU"
}
layer {
  bottom: "conv4_2"
  top: "conv4_3"
  name: "conv4_3"
  type: "Convolution"
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
  }
}
layer {
  bottom: "conv4_3"
  top: "conv4_3"
  name: "relu4_3"
  type: "ReLU"
}
layer {
  bottom: "conv4_3"
  top: "pool4"
  name: "pool4"
  type: "Pooling"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  bottom: "pool4"
  top: "conv5_1"
  name: "conv5_1"
  type: "Convolution"
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
  }
}
layer {
  bottom: "conv5_1"
  top: "conv5_1"
  name: "relu5_1"
  type: "ReLU"
}
layer {
  bottom: "conv5_1"
  top: "conv5_2"
  name: "conv5_2"
  type: "Convolution"
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
  }
}
layer {
  bottom: "conv5_2"
  top: "conv5_2"
  name: "relu5_2"
  type: "ReLU"
}
layer {
  bottom: "conv5_2"
  top: "conv5_3"
  name: "conv5_3"
  type: "Convolution"
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
  }
}
layer {
  bottom: "conv5_3"
  top: "conv5_3"
  name: "relu5_3"
  type: "ReLU"
}
layer {
  bottom: "conv5_3"
  top: "pool5"
  name: "pool5"
  type: "Pooling"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  bottom: "pool5"
  top: "fc6"
  name: "fc6"
  type: "InnerProduct"
  inner_product_param {
    num_output: 4096
  }
}
layer {
  bottom: "fc6"
  top: "fc6"
  name: "relu6"
  type: "ReLU"
}
layer {
  bottom: "fc6"
  top: "fc6"
  name: "drop6"
  type: "Dropout"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  bottom: "fc6"
  top: "fc7"
  name: "fc7"
  type: "InnerProduct"
  inner_product_param {
    num_output: 4096
  }
}
layer {
  bottom: "fc7"
  top: "fc7"
  name: "relu7"
  type: "ReLU"
}
layer {
  bottom: "fc7"
  top: "fc7"
  name: "drop7"
  type: "Dropout"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  bottom: "fc7"
  top: "fc8"
  name: "fc8"
  type: "InnerProduct"
  inner_product_param {
    num_output: 2622
  }
}
layer {
  bottom: "fc8"
  top: "prob"
  name: "prob"
  type: "Softmax"
}

 
 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360