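The previous post used caffe to train lenet_iter_10000.caffemodel on the MNIST dataset. Because the model is meant to be applied to a real task of my own, this post records how to use the trained lenet_iter_10000.caffemodel to classify my own handwritten-character images.
Preparing the test images
The test images used here are the character images bundled with the "big rune" (大神符) generator program from the 2017 RoboMasters robotics competition. To make later testing easier, the images are renamed uniformly as 1.png, 2.png, ..., N.png. The first image in the folder for the character 5 is used as the test image here.
deploy.prototxt model description file
deploy.prototxt is similar to lenet_train_test.prototxt and can be obtained by editing the latter, as follows: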
name: "LeNet"
input: "data"
input_dim: 1
input_dim: 1
input_dim: 28
input_dim: 28
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "ip2"
  top: "prob"
}
num_output: 10 in the second fully connected layer (ip2) means there are 10 classes.
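To see how the 28x28 input declared by input_dim shrinks on its way to the fully connected layers, the layer sizes can be worked out by hand. The sketch below does that arithmetic in C++ for the kernel_size and stride values in deploy.prototxt, with no padding. The function names are made up for illustration and are not part of caffe; the formulas are the usual ones (convolution rounds down, caffe's pooling rounds up).

```cpp
#include <cassert>

// Output size of a convolution along one spatial axis (no dilation).
int ConvOutputSize(int input, int kernel, int stride, int pad = 0) {
  assert(stride > 0);
  return (input + 2 * pad - kernel) / stride + 1;
}

// Output size of a pooling layer along one axis; the division rounds up.
int PoolOutputSize(int input, int kernel, int stride, int pad = 0) {
  assert(stride > 0);
  return (input + 2 * pad - kernel + stride - 1) / stride + 1;
}

// Number of values that reach ip1 for a 28x28 single-channel input.
int LenetIp1Inputs() {
  int size = 28;
  size = ConvOutputSize(size, 5, 1);  // conv1: 28 -> 24
  size = PoolOutputSize(size, 2, 2);  // pool1: 24 -> 12
  size = ConvOutputSize(size, 5, 1);  // conv2: 12 -> 8
  size = PoolOutputSize(size, 2, 2);  // pool2: 8 -> 4
  return 50 * size * size;            // 50 feature maps from conv2
}
```

So ip1 maps 50 x 4 x 4 = 800 values to 500, and ip2 maps those 500 to the 10 class scores that the Softmax layer turns into probabilities.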
At first the data layer was written like this:
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 1 dim: 1 dim: 28 dim: 28 } }
}
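Running the classifier with this layer produced an error.
It only ran correctly after switching to the input / input_dim form used at the top of deploy.prototxt above. This may be a caffe version compatibility issue (the Input layer type is a later addition to caffe, so an older build may not recognize it); I have not confirmed the exact cause.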
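lenet_iter_10000.caffemodel model weights file
This is the model file generated during training; it can be found in the corresponding folder.
synset_words.txt label file
Note: do not press Enter after the final 9 (that is, leave no empty line at the end of the file); otherwise an error occurs.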
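The classifier below reads the label file with std::getline and pushes every line into labels_, so an empty line at the end of the file would presumably become an eleventh, empty label and trip the check that compares the label count with the output layer size. One way to make loading tolerant of that is to skip empty lines. This is a minimal sketch on a plain std::istream; LoadLabels is a hypothetical helper, not part of caffe or of the classifier in this post.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads one label per line and drops empty lines, so a trailing blank line
// does not turn into an extra, empty label.
std::vector<std::string> LoadLabels(std::istream& in) {
  std::vector<std::string> labels;
  std::string line;
  while (std::getline(in, line)) {
    // Strip a trailing '\r' left behind by Windows line endings.
    if (!line.empty() && line.back() == '\r') line.pop_back();
    if (!line.empty()) labels.push_back(line);
  }
  return labels;
}
```

Passing a std::ifstream opened on synset_words.txt works the same way, since std::ifstream is a std::istream.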
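mean.binaryproto binary mean file
caffe's bundled MNIST example does not use a binary mean file during training, so none is needed at test time either, and this post does not go into it.
classification.bin classifier
Start from examples/cpp_classification/classification.cpp and rewrite it into the .cpp file you need. Because no binary mean file is used, the parts that deal with the mean file have to be rewritten. Then compile and link to produce classification.bin, which is the classifier we need.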
#include <caffe/caffe.hpp>
#define USE_OPENCV
#ifdef USE_OPENCV
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#endif
#include <algorithm>
#include <fstream>   // std::ifstream for the label file
#include <iomanip>   // std::setprecision
#include <iostream>  // std::cout, std::cerr
#include <memory>
#include <string>
#include <utility>
#include <vector>
#ifdef USE_OPENCV
using namespace caffe;
using std::string;

// A prediction is a (label, probability) pair.
typedef std::pair<string, float> Prediction;

class Classifier {
 public:
  // mean_file is unused here; it is kept so the signature matches the stock example.
  Classifier(const string& model_file, const string& trained_file,
             const string& mean_file, const string& label_file);
  std::vector<Prediction> Classify(const cv::Mat& img, int N = 5);

 private:
  std::vector<float> Predict(const cv::Mat& img);
  void WrapInputLayer(std::vector<cv::Mat>* input_channels);
  void Preprocess(const cv::Mat& img, std::vector<cv::Mat>* input_channels);

 private:
  shared_ptr<Net<float> > net_;
  cv::Size input_geometry_;
  int num_channels_;
  std::vector<string> labels_;
};
Classifier::Classifier(const string& model_file, const string& trained_file,
                       const string& mean_file, const string& label_file) {
#ifdef CPU_ONLY
  Caffe::set_mode(Caffe::CPU);
#else
  Caffe::set_mode(Caffe::GPU);
#endif
  // Load the network structure and the trained weights.
  net_.reset(new Net<float>(model_file, TEST));
  net_->CopyTrainedLayersFrom(trained_file);
  CHECK_EQ(net_->num_inputs(), 1) << "Network should have exactly one input.";
  CHECK_EQ(net_->num_outputs(), 1) << "Network should have exactly one output.";
  Blob<float>* input_layer = net_->input_blobs()[0];
  num_channels_ = input_layer->channels();
  CHECK(num_channels_ == 3 || num_channels_ == 1)
      << "Input layer should have 1 or 3 channels.";
  // LeNet expects 28x28 inputs.
  input_geometry_ = cv::Size(28, 28);
  // Load the labels, one per line.
  std::ifstream labels(label_file.c_str());
  CHECK(labels) << "Unable to open labels file " << label_file;
  string line;
  while (std::getline(labels, line))
    labels_.push_back(string(line));
  Blob<float>* output_layer = net_->output_blobs()[0];
  CHECK_EQ(labels_.size(), output_layer->channels())
      << "Number of labels is different from the output layer dimension.";
}
static bool PairCompare(const std::pair<float, int>& lhs,
                        const std::pair<float, int>& rhs) {
  return lhs.first > rhs.first;
}

// Returns the indices of the top N values of vector v.
static std::vector<int> Argmax(const std::vector<float>& v, int N) {
  std::vector<std::pair<float, int> > pairs;
  for (size_t i = 0; i < v.size(); i++)
    pairs.push_back(std::make_pair(v[i], i));
  std::partial_sort(pairs.begin(), pairs.begin() + N, pairs.end(), PairCompare);
  std::vector<int> result;
  for (int i = 0; i < N; i++)
    result.push_back(pairs[i].second);
  return result;
}
// Returns the top N predictions as (label, probability) pairs.
std::vector<Prediction> Classifier::Classify(const cv::Mat& img, int N) {
  std::vector<float> output = Predict(img);
  N = std::min<int>(labels_.size(), N);
  std::vector<int> maxN = Argmax(output, N);
  std::vector<Prediction> predictions;
  for (int i = 0; i < N; i++) {
    int idx = maxN[i];
    predictions.push_back(std::make_pair(labels_[idx], output[idx]));
  }
  return predictions;
}
std::vector<float> Classifier::Predict(const cv::Mat& img) {
  Blob<float>* input_layer = net_->input_blobs()[0];
  input_layer->Reshape(1, num_channels_, input_geometry_.height,
                       input_geometry_.width);
  // Forward the dimension change to all layers.
  net_->Reshape();
  std::vector<cv::Mat> input_channels;
  WrapInputLayer(&input_channels);
  Preprocess(img, &input_channels);
  // Newer caffe releases use net_->Forward() here instead.
  net_->ForwardPrefilled();
  // Copy the output layer into a std::vector.
  Blob<float>* output_layer = net_->output_blobs()[0];
  const float* begin = output_layer->cpu_data();
  const float* end = begin + output_layer->channels();
  return std::vector<float>(begin, end);
}
// Wraps each channel of the input blob in a cv::Mat header, so that writing
// to input_channels writes directly into the network input.
void Classifier::WrapInputLayer(std::vector<cv::Mat>* input_channels) {
  Blob<float>* input_layer = net_->input_blobs()[0];
  int width = input_layer->width();
  int height = input_layer->height();
  float* input_data = input_layer->mutable_cpu_data();
  for (int i = 0; i < input_layer->channels(); i++) {
    cv::Mat channel(height, width, CV_32FC1, input_data);
    input_channels->push_back(channel);
    input_data += width * height;
  }
}
void Classifier::Preprocess(const cv::Mat& img,
                            std::vector<cv::Mat>* input_channels) {
  // Convert the input image to the channel count the network expects.
  cv::Mat sample;
  if (img.channels() == 3 && num_channels_ == 1)
    cv::cvtColor(img, sample, cv::COLOR_BGR2GRAY);
  else if (img.channels() == 4 && num_channels_ == 1)
    cv::cvtColor(img, sample, cv::COLOR_BGRA2GRAY);
  else if (img.channels() == 4 && num_channels_ == 3)
    cv::cvtColor(img, sample, cv::COLOR_BGRA2BGR);
  else if (img.channels() == 1 && num_channels_ == 3)
    cv::cvtColor(img, sample, cv::COLOR_GRAY2BGR);
  else
    sample = img;
  cv::Mat sample_resized;
  if (sample.size() != input_geometry_)
    cv::resize(sample, sample_resized, input_geometry_);
  else
    sample_resized = sample;
  cv::Mat sample_float;
  if (num_channels_ == 3)
    sample_resized.convertTo(sample_float, CV_32FC3);
  else
    sample_resized.convertTo(sample_float, CV_32FC1);
  // No mean subtraction here: split writes the channels straight into the
  // input blob, because input_channels wraps its memory.
  cv::split(sample_float, *input_channels);
  CHECK(reinterpret_cast<float*>(input_channels->at(0).data) ==
        net_->input_blobs()[0]->cpu_data())
      << "Input channels not wrapping the input layer of the network.";
}
int main(int argc, char** argv) {
  if (argc != 5) {
    std::cerr << "Usage: " << argv[0]
              << " deploy.prototxt network.caffemodel"
              << " labels.txt img.jpg" << std::endl;
    return 1;
  }
  ::google::InitGoogleLogging(argv[0]);
  string model_file = argv[1];
  string trained_file = argv[2];
  string mean_file = "";  // no mean file is used for MNIST
  string label_file = argv[3];
  Classifier classifier(model_file, trained_file, mean_file, label_file);
  string file = argv[4];
  std::cout << "-------------------- Prediction for " << file
            << " -----------------" << std::endl;
  // -1 loads the image as is, without forcing a channel count.
  cv::Mat img = cv::imread(file, -1);
  CHECK(!img.empty()) << "Unable to decode image " << file;
  std::vector<Prediction> predictions = classifier.Classify(img);
  // Print the top N predictions.
  for (size_t i = 0; i < predictions.size(); i++) {
    Prediction p = predictions[i];
    std::cout << std::fixed << std::setprecision(4) << p.second << " - \""
              << p.first << "\"" << std::endl;
  }
}
#else
int main(int argc, char** argv) {
  LOG(FATAL) << "this example requires opencv; compile with USE_OPENCV.";
}
#endif
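One preprocessing detail may be worth checking against the training setup. The lenet_train_test.prototxt that ships with caffe applies transform_param { scale: 0.00390625 } in its data layers, which multiplies every pixel by 1/256. If the model from the previous post was trained with that setting unchanged (an assumption; the post does not say), the test-time input would need the same scaling, whereas Preprocess above feeds raw 0 to 255 values. The sketch below shows the scaling on a plain pixel buffer; ScalePixels is a hypothetical helper, not part of the classifier.

```cpp
#include <cassert>
#include <vector>

// Multiplies every 8-bit pixel by `scale`, mirroring what
// transform_param { scale: 0.00390625 } does at training time.
std::vector<float> ScalePixels(const std::vector<unsigned char>& pixels,
                               float scale = 0.00390625f) {
  std::vector<float> scaled;
  scaled.reserve(pixels.size());
  for (unsigned char p : pixels) scaled.push_back(p * scale);
  return scaled;
}
```

Inside Preprocess, the equivalent would be to pass the factor as the third argument of cv::Mat::convertTo, for example sample_resized.convertTo(sample_float, CV_32FC1, 0.00390625).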
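Testing
Testing was done on a server cluster using the CPU (the GPUs were in use by someone else). Note that Makefile.config has to be edited first: remove the comment marker # in front of CPU_ONLY := 1 and add one in front of USE_CUDNN := 1. The command is: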
$ srun -p K15G12 -J MNIST -c 4 /lustre1/hw/yingjia/caffe-test/build/examples/cpp_classification/classification.bin /lustre1/hw/yingjia/caffe-test/examples/mnist/deploy.prototxt /lustre1/hw/yingjia/caffe-test/examples/mnist/lenet_iter_10000.caffemodel /lustre1/hw/yingjia/caffe-test/examples/mnist/synset_words.txt /lustre1/hw/yingjia/caffe-test/examples/mnist/5/1.png
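Running it prints the top predictions for the image, each as a probability followed by its label.
Other related notes
1. Since all images in the test set eventually have to be tested, the script piliang.sh was written for batch testing: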
#!/bin/bash
# Classify every image in the folder for character 5, one srun job per image.
echo "this script tests the model"
for numberFile in /lustre1/hw/yingjia/caffe-test/examples/mnist/image/5/*
do
    srun -p K15G12 -J MNIST -c 4 /lustre1/hw/yingjia/caffe-test/build/examples/cpp_classification/classification.bin /lustre1/hw/yingjia/caffe-test/examples/mnist/deploy.prototxt /lustre1/hw/yingjia/caffe-test/examples/mnist/lenet_iter_10000.caffemodel /lustre1/hw/yingjia/caffe-test/examples/mnist/synset_words.txt "$numberFile"
done
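2. After testing, the results have to be tallied. classification.cpp is made to print "error" whenever a prediction is wrong, so the output can be redirected to a .log file and the occurrences of "error" in that file counted.
The redirection command: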
$ ./piliang.sh >> stderr.log 2>&1
The command that counts the occurrences of "error" in stderr.log:
$ awk -v RS='error' 'END {print --NR}' stderr.log
3. Converting the label string of Prediction p1=predictions[0] to an integer tempt (requires #include <sstream>):
std::stringstream sstr;
sstr<<p1.first;
int tempt;
sstr>>tempt;
4. Converting an integer to a string:
int i=1;
std::stringstream sstr1;
sstr1<<i;
string str1;
sstr1>>str1;
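The listing of classification.cpp above does not include the step that prints "error", so here is one way notes 2 to 4 could fit together, written without caffe so it can be tried on its own. It assumes C++11 (std::stoi in place of the stringstream conversion) and that the expected digit is known to the caller, for example from the folder name. LabelToInt, ErrorMarker and CountOccurrences are hypothetical names, and CountOccurrences is only meant to mirror what the awk command counts.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Converts a label such as "5" to an int (C++11 alternative to stringstream).
int LabelToInt(const std::string& label) { return std::stoi(label); }

// Returns "error" when the top-1 label differs from the expected digit and
// an empty string otherwise; the caller would print this marker to stdout.
std::string ErrorMarker(const std::string& top_label, int expected) {
  return LabelToInt(top_label) == expected ? "" : "error";
}

// Counts non-overlapping occurrences of `needle` in `text`.
std::size_t CountOccurrences(const std::string& text,
                             const std::string& needle) {
  assert(!needle.empty());
  std::size_t count = 0;
  for (std::size_t pos = text.find(needle); pos != std::string::npos;
       pos = text.find(needle, pos + needle.size())) {
    ++count;
  }
  return count;
}
```

The reverse conversion in note 4 has a C++11 counterpart as well: std::to_string(1) yields the string "1".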