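Earlier posts covered training LeNet on the MNIST dataset and loading the resulting caffemodel with OpenCV for prediction. The next step, and the real goal, is to train a caffemodel on your own dataset and use it for prediction. This post walks through that process, using LeNet as the example.
1. Data format conversion (images to LMDB)
As the earlier posts explained, Caffe can train directly from data stored as either LMDB or LevelDB. LMDB is the more efficient of the two, so the raw images are first converted to LMDB.
1.1 Create a new folder in the working directory, here named images_to_lmdb; the conversion is done inside it. Copy the classified training data into this folder as well. Here the data consists of foreground (vehicle) and background images, for which a ratio of about 1:3 works well, with one part used for training and one part for testing. Then create two text files, train.txt and test.txt, that list the path of each source image (if the images are not already cropped, the region containing the target must be specified too) and its class label (here foreground is 0 and background is 1), as shown in the figure below.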
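The list files are plain text with one image per line: a path relative to the image root folder passed to convert_imageset (images\ in the commands below), a space, and an integer label. The sketch below builds that text. The helper name buildImageList and the file names in the usage note are hypothetical, chosen only for illustration, and paths containing spaces are assumed not to occur.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: builds the text of a list file such as train.txt.
// Each entry is (image path relative to the root folder, integer class label).
std::string buildImageList(const std::vector<std::pair<std::string, int>> &entries)
{
    std::ostringstream out;
    for (const auto &entry : entries)
    {
        assert(entry.second >= 0); // class labels are non-negative integers
        out << entry.first << ' ' << entry.second << '\n';
    }
    return out.str();
}
```

For example, passing the entries ("car/0001.png", 0) and ("bg/0001.png", 1) yields two lines, "car/0001.png 0" and "bg/0001.png 1", which can be written to train.txt with std::ofstream.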
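1.2 Create two text files in the current directory and rename them images_to_train_lmdb.bat and images_to_test_lmdb.bat. Enter the following commands, one per file (make sure the paths are correct):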
D:\Libraries\caffe\msvc2013_64\bin\convert_imageset.exe --shuffle --gray --resize_width=28 --resize_height=28 images\ train.txt train_lmdb --backend=lmdb
pause
D:\Libraries\caffe\msvc2013_64\bin\convert_imageset.exe --shuffle --gray --resize_width=28 --resize_height=28 images\ test.txt test_lmdb --backend=lmdb
pause
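1.3 Double-click images_to_train_lmdb.bat to run it. This produces the training data, saved in the folder train_lmdb. The run looks like this:
Likewise, double-click images_to_test_lmdb.bat to produce the test data, saved in the folder test_lmdb.
2. Computing the mean
Create a new text file in the current folder, name it computer_image_mean.bat, and enter the following: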
D:\Libraries\caffe\msvc2013_64\bin\compute_image_mean.exe train_lmdb mean.binaryproto --backend=lmdb
pause
Save the file and double-click it to run. This produces the file mean.binaryproto.
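Conceptually, the mean file holds the per-pixel average of all images in the training database. The sketch below shows that computation on plain vectors, assuming equally sized grayscale images stored as flat pixel arrays; meanImage is a hypothetical name and this is not Caffe's own implementation. Note that in the network definition later in this post the mean_file line is commented out, so the mean is computed here but not actually subtracted during training.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Per-pixel mean over a set of equally sized grayscale images,
// each stored as a flat vector of 8-bit pixel values.
std::vector<float> meanImage(const std::vector<std::vector<unsigned char>> &images)
{
    assert(!images.empty());
    const std::size_t pixelCount = images[0].size();
    std::vector<double> sum(pixelCount, 0.0);
    for (const auto &img : images)
    {
        assert(img.size() == pixelCount); // all images must share one size
        for (std::size_t i = 0; i < pixelCount; ++i)
            sum[i] += img[i];
    }
    std::vector<float> mean(pixelCount);
    for (std::size_t i = 0; i < pixelCount; ++i)
        mean[i] = static_cast<float>(sum[i] / images.size());
    return mean;
}
```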
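The training data is now ready. Next comes training.
3. Setting the training parameters
3.1 Create a new folder in the project directory and name it glnet_train_test; training is done inside it. Copy the output of the previous two steps, namely the two folders train_lmdb and test_lmdb and the file mean.binaryproto, into this folder.
3.2 Create two new text files, named lenet_solver.prototxt and lenet_train_test.prototxt.
Enter the following into lenet_solver.prototxt and save it.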
# The train/test net protocol buffer definition
net: "lenet_train_test.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# The values here are the MNIST defaults: test batch size 100 and 100 test
# iterations cover 10,000 testing images. Adjust test_iter to your own test set.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 10000
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "lenet"
# solver mode: CPU or GPU
solver_mode: GPU
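With lr_policy set to "inv", Caffe decays the learning rate as base_lr * (1 + gamma * iter) ^ (-power). The sketch below evaluates that formula so the effect of the solver values above can be checked by hand; the function name invLearningRate is an assumption made for illustration.

```cpp
#include <cassert>
#include <cmath>

// Learning rate under the "inv" policy:
// base_lr * (1 + gamma * iter) ^ (-power)
double invLearningRate(double baseLr, double gamma, double power, int iter)
{
    assert(iter >= 0);
    return baseLr * std::pow(1.0 + gamma * iter, -power);
}
```

With base_lr 0.01, gamma 0.0001 and power 0.75, the rate starts at 0.01 and has dropped to roughly 0.0059 by iteration 10000, the max_iter set above.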
Enter the following into lenet_train_test.prototxt and save it.
name: "LeNet"
layer {
name: "myNet"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
#mean_file:"mean.binaryproto"
scale: 0.00390625
}
data_param {
source: "train_lmdb"
batch_size: 64
backend: LMDB
}
}
layer {
name: "myNet"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
scale: 0.00390625
#mean_file:"mean.binaryproto"
}
data_param {
source: "test_lmdb"
batch_size: 100
backend: LMDB
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 20
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 50
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "ip1"
type: "InnerProduct"
bottom: "pool2"
top: "ip1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 500
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "ip1"
top: "ip1"
}
layer {
name: "ip2"
type: "InnerProduct"
bottom: "ip1"
top: "ip2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 2 # set to the number of target classes; 2 here
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "ip2"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "ip2"
bottom: "label"
top: "loss"
}
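To see why a 28x28 input fits this network, it helps to trace the feature map size through the layers: 28 -> conv1 (5x5) -> 24 -> pool1 -> 12 -> conv2 (5x5) -> 8 -> pool2 -> 4, so ip1 receives 50 * 4 * 4 = 800 values. The sketch below reproduces that arithmetic, assuming square inputs, no padding, and pooling sizes that round up as Caffe's pooling layer does; all function names are hypothetical.

```cpp
#include <cassert>

// Output side length of a square convolution without padding.
int convOut(int in, int kernel, int stride)
{
    assert(in >= kernel && stride > 0);
    return (in - kernel) / stride + 1;
}

// Output side length of a square pooling window; the division rounds up.
int poolOut(int in, int kernel, int stride)
{
    assert(in >= kernel && stride > 0);
    return (in - kernel + stride - 1) / stride + 1;
}

// Number of values fed into ip1 for the LeNet layout defined above
// (conv 5x5 stride 1, pool 2x2 stride 2, twice; 50 channels after conv2).
int lenetIp1Inputs(int inputSide)
{
    int side = convOut(inputSide, 5, 1);
    side = poolOut(side, 2, 2);
    side = convOut(side, 5, 1);
    side = poolOut(side, 2, 2);
    return 50 * side * side;
}
```

If the images were resized to something other than 28x28 in the conversion step, the same functions show how the size reaching ip1 would change.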
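The parameters are now set. Next comes training and testing.
4. Training and testing
4.1 Create a new text file in the current folder, name it train.bat, enter the following, and save it.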
D:\Libraries\caffe\msvc2013_64\bin\caffe.exe train --solver=lenet_solver.prototxt
pause
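Double-click it to run. The process looks like this:
Training finally produces the following model files:
4.2 Next, test the model. Create a new text file named test.bat, enter the following, and save it.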
D:\Libraries\caffe\msvc2013_64\bin\caffe.exe test --model=lenet_train_test.prototxt --weights=lenet_iter_10000.caffemodel
pause
Double-click it to run. The test results are as follows:
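The test command above does not pass an iteration count, so, as far as I know, caffe test falls back to its default of 50 batches; with the TEST batch_size of 100 that means 5,000 images are evaluated. To cover a test set exactly once, the number of iterations should be the test set size divided by the batch size, rounded up. The same reasoning applies to test_iter in the solver file. The sketch below does that calculation; iterationsToCover is a hypothetical helper.

```cpp
#include <cassert>

// Number of batches needed to see every test image at least once.
int iterationsToCover(int numImages, int batchSize)
{
    assert(numImages > 0 && batchSize > 0);
    return (numImages + batchSize - 1) / batchSize; // ceiling division
}
```

For a hypothetical test set of 2,500 images and a batch size of 100, this gives 25, which could be passed to caffe test through its iterations flag.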
5. Loading the caffemodel with OpenCV
5.1 The caffemodel is now available. Before using OpenCV for prediction, create a new text file named lenet.prototxt, enter the following, and save it.
name: "LeNet"
input: "data"
input_dim: 64 # batch size used in training
input_dim: 1 # number of channels
input_dim: 28 # height
input_dim: 28 # width
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 20
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 50
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "ip1"
type: "InnerProduct"
bottom: "pool2"
top: "ip1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 500
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "ip1"
top: "ip1"
}
layer {
name: "ip2"
type: "InnerProduct"
bottom: "ip1"
top: "ip2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 2 # keep consistent with lenet_train_test.prototxt
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "prob"
type: "Softmax"
bottom: "ip2"
top: "prob"
}
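The deploy file replaces the training-time loss and accuracy layers with a Softmax layer named prob, which turns the two raw scores from ip2 into probabilities that sum to 1. The sketch below shows the computation in plain C++, using the common trick of subtracting the maximum score for numerical stability; it illustrates the math only and is not Caffe's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Softmax over a vector of raw class scores.
std::vector<double> softmax(const std::vector<double> &scores)
{
    assert(!scores.empty());
    // Subtracting the maximum keeps exp() from overflowing.
    const double maxScore = *std::max_element(scores.begin(), scores.end());
    std::vector<double> probs(scores.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < scores.size(); ++i)
    {
        probs[i] = std::exp(scores[i] - maxScore);
        sum += probs[i];
    }
    for (double &p : probs)
        p /= sum;
    return probs;
}
```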
5.2 Load the model with OpenCV and run prediction (for details, see the earlier post "OpenCV Load caffe model")
The code is given directly below:
// Written against the cv::dnn::Blob API of the dnn module available at the
// time of writing; later OpenCV releases removed the Blob class.
#include <opencv2/dnn.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
#include <fstream>
#include <iostream>
#include <cstdlib>
#include <string>
#include <vector>

/* Find best class for the blob (i. e. class with maximal probability) */
void getMaxClass(cv::dnn::Blob &probBlob, int *classId, double *classProb)
{
    cv::Mat probMat = probBlob.matRefConst().reshape(1, 1); //reshape the blob to a 1xN matrix (N = number of classes, 2 here)
    cv::Point classNumber;
    cv::minMaxLoc(probMat, NULL, classProb, NULL, &classNumber);
    *classId = classNumber.x;
}

/* Read class names from a text file; each line is "<id> <name>" */
std::vector<cv::String> readClassNames(const char *filename = "label.txt")
{
    std::vector<cv::String> classNames;
    std::ifstream fp(filename);
    if (!fp.is_open())
    {
        std::cerr << "File with classes labels not found: " << filename << std::endl;
        exit(-1);
    }
    std::string name;
    while (std::getline(fp, name)) //stops cleanly at end of file
    {
        if (name.length())
            classNames.push_back(name.substr(name.find(' ') + 1));
    }
    fp.close();
    return classNames;
}

int main(int argc, char **argv)
{
    cv::dnn::initModule(); //initialize the dnn module and its built-in layers
    cv::String modelTxt = "lenet.prototxt";
    cv::String modelBin = "lenet_iter_10000.caffemodel";
    cv::String imageFile = "0001.png";
    cv::dnn::Net net = cv::dnn::readNetFromCaffe(modelTxt, modelBin);
    if (net.empty())
    {
        std::cerr << "Can't load network by using the following files: " << std::endl;
        std::cerr << "prototxt: " << modelTxt << std::endl;
        std::cerr << "caffemodel: " << modelBin << std::endl;
        exit(-1);
    }
    //! [Prepare blob]
    cv::Mat img = cv::imread(imageFile, cv::IMREAD_GRAYSCALE);
    if (img.empty())
    {
        std::cerr << "Can't read image from the file: " << imageFile << std::endl;
        exit(-1);
    }
    cv::resize(img, img, cv::Size(28, 28));
    //Note: training scaled the pixels by 0.00390625 (1/256) in transform_param.
    //If predictions look wrong, the input here likely needs the same scaling.
    //cv::dnn::Blob inputBlob = cv::dnn::Blob(img); //Convert Mat to dnn::Blob image batch
    cv::dnn::Blob inputBlob = cv::dnn::Blob::fromImages(img);
    //! [Prepare blob]
    //! [Set input blob]
    net.setBlob(".data", inputBlob); //set the network input
    //! [Set input blob]
    //! [Make forward pass]
    net.forward(); //compute output
    //! [Make forward pass]
    //! [Gather output]
    cv::dnn::Blob prob = net.getBlob("prob"); //gather output of "prob" layer
    int classId;
    double classProb;
    getMaxClass(prob, &classId, &classProb); //find the best class
    //! [Gather output]
    //! [Print results]
    std::vector<cv::String> classNames = readClassNames();
    std::cout << "Best class: #" << classId << " '" << classNames.at(classId) << "'" << std::endl;
    std::cout << "Probability: " << classProb * 100 << "%" << std::endl;
    //! [Print results]
    return 0;
} //main
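One detail worth checking when moving from training to prediction is input scaling. The training network multiplies every pixel by 0.00390625 (1/256) through transform_param, while the deploy prototxt has no such step, so the caller is responsible for feeding values in the same range. The sketch below shows the scaling on a plain pixel buffer; scalePixels is a hypothetical helper, and how the scaled data is handed to OpenCV depends on the dnn API version in use.

```cpp
#include <cassert>
#include <vector>

// Scale 8-bit pixel values the same way the training data layer did.
std::vector<float> scalePixels(const std::vector<unsigned char> &pixels,
                               float scale = 0.00390625f)
{
    assert(scale > 0.0f);
    std::vector<float> scaled;
    scaled.reserve(pixels.size());
    for (unsigned char p : pixels)
        scaled.push_back(static_cast<float>(p) * scale);
    return scaled;
}
```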
The test result is as follows:
DONE!
2017.08.01