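With the Caffe environment set up, train LeNet on the MNIST dataset and test it
I. Train and test the model
1. Download the MNIST dataset and decompress it
From the Caffe root directory, run:
./data/mnist/get_mnist.sh
2. Convert it to the LMDB database format
./examples/mnist/create_mnist.sh
3. Train the network
./examples/mnist/train_lenet.sh
The log shows the training progress and the final result: accuracy = 0.9915, loss = 0.0284793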
I0513 14:15:09.155326 7387 solver.cpp:239] Iteration 9800 (440.818 iter/s, 0.226851s/100 iters), loss = 0.0112586
I0513 14:15:09.155359 7387 solver.cpp:258] Train net output #0: loss = 0.0112588 (* 1 = 0.0112588 loss)
I0513 14:15:09.155383 7387 sgd_solver.cpp:112] Iteration 9800, lr = 0.00599102
I0513 14:15:09.384090 7387 solver.cpp:239] Iteration 9900 (437.211 iter/s, 0.228723s/100 iters), loss = 0.00521673
I0513 14:15:09.384140 7387 solver.cpp:258] Train net output #0: loss = 0.00521693 (* 1 = 0.00521693 loss)
I0513 14:15:09.384147 7387 sgd_solver.cpp:112] Iteration 9900, lr = 0.00596843
I0513 14:15:09.610247 7387 solver.cpp:468] Snapshotting to binary proto file examples/mnist/model/lenet_iter_10000.caffemodel
I0513 14:15:09.615514 7387 sgd_solver.cpp:280] Snapshotting solver state to binary proto file examples/mnist/model/lenet_iter_10000.solverstate
I0513 14:15:09.618412 7387 solver.cpp:331] Iteration 10000, loss = 0.00303709
I0513 14:15:09.618430 7387 solver.cpp:351] Iteration 10000, Testing net (#0)
I0513 14:15:09.695632 7394 data_layer.cpp:73] Restarting data prefetching from start.
I0513 14:15:09.697692 7387 solver.cpp:418] Test net output #0: accuracy = 0.9915
I0513 14:15:09.697715 7387 solver.cpp:418] Test net output #1: loss = 0.0284793 (* 1 = 0.0284793 loss)
I0513 14:15:09.697739 7387 solver.cpp:336] Optimization Done.
I0513 14:15:09.697743 7387 caffe.cpp:250] Optimization Done.
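The solver output above follows a regular pattern, so the loss and learning-rate curves can be recovered from a saved log. The following is a minimal sketch, assuming the log has exactly the format shown here; the function name `parse_train_log` and the regular expressions are illustrative and not part of Caffe.

```python
import re

# Patterns written against the log lines shown above (assumption: your
# Caffe build prints the same format).
LOSS_RE = re.compile(r"Iteration (\d+) \(.*\), loss = ([\d.eE+-]+)")
LR_RE = re.compile(r"Iteration (\d+), lr = ([\d.eE+-]+)")


def parse_train_log(text):
    """Return {iteration: {'loss': float, 'lr': float}} from solver output."""
    records = {}
    for line in text.splitlines():
        m = LOSS_RE.search(line)
        if m:
            records.setdefault(int(m.group(1)), {})["loss"] = float(m.group(2))
            continue
        m = LR_RE.search(line)
        if m:
            records.setdefault(int(m.group(1)), {})["lr"] = float(m.group(2))
    return records


# Four lines copied from the training log above.
sample = """\
I0513 14:15:09.155326 7387 solver.cpp:239] Iteration 9800 (440.818 iter/s, 0.226851s/100 iters), loss = 0.0112586
I0513 14:15:09.155383 7387 sgd_solver.cpp:112] Iteration 9800, lr = 0.00599102
I0513 14:15:09.384090 7387 solver.cpp:239] Iteration 9900 (437.211 iter/s, 0.228723s/100 iters), loss = 0.00521673
I0513 14:15:09.384147 7387 sgd_solver.cpp:112] Iteration 9900, lr = 0.00596843
"""
records = parse_train_log(sample)
for iteration in sorted(records):
    print(iteration, records[iteration])
```

To use it on a real run, redirect the training script's output to a file and pass the file contents to `parse_train_log`.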
At this point the examples/mnist directory looks like this:
:~/caffe/examples/mnist$ tree ./
./
├── convert_mnist_data.cpp
├── create_mnist.sh
├── lenet_adadelta_solver.prototxt
├── lenet_auto_solver.prototxt
├── lenet_consolidated_solver.prototxt
├── lenet_iter_10000.caffemodel
├── lenet_iter_10000.solverstate
├── lenet_iter_5000.caffemodel
├── lenet_iter_5000.solverstate
├── lenet_multistep_solver.prototxt
├── lenet.prototxt
├── lenet_solver_adam.prototxt
├── lenet_solver.prototxt
├── lenet_solver_rmsprop.prototxt
├── lenet_train_test.jpeg
├── lenet_train_test.prototxt
├── mnist_autoencoder.prototxt
├── mnist_autoencoder_solver_adadelta.prototxt
├── mnist_autoencoder_solver_adagrad.prototxt
├── mnist_autoencoder_solver_nesterov.prototxt
├── mnist_autoencoder_solver.prototxt
├── mnist_test_lmdb
│ ├── data.mdb
│ └── lock.mdb
├── mnist_train_lmdb
│ ├── data.mdb
│ └── lock.mdb
├── model
│ ├── lenet_iter_10000.caffemodel
│ ├── lenet_iter_10000.solverstate
│ ├── lenet_iter_5000.caffemodel
│ └── lenet_iter_5000.solverstate
├── readme.md
├── train_lenet_adam.sh
├── train_lenet_consolidated.sh
├── train_lenet_docker.sh
├── train_lenet_rmsprop.sh
├── train_lenet.sh
├── train_mnist_autoencoder_adadelta.sh
├── train_mnist_autoencoder_adagrad.sh
├── train_mnist_autoencoder_nesterov.sh
└── train_mnist_autoencoder.sh
3 directories, 39 files
:~/caffe/examples/mnist$
4. Validate with the test dataset
:~/caffe$ ./build/tools/caffe.bin test -model=examples/mnist/lenet_train_test.prototxt -weights=examples/mnist/model/lenet_iter_10000.caffemodel -gpu=0
I0513 14:23:48.456948 9312 caffe.cpp:266] Use GPU with device ID 0
I0513 14:23:48.470187 9312 caffe.cpp:270] GPU device name: GeForce GTX 1060
I0513 14:23:48.722349 9312 net.cpp:294] The NetState phase (1) differed from the phase (0) specified by a rule in layer mnist
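Test result: accuracy = 0.9874 (the log ends at batch 49, so this run evaluated 50 batches of 100 images, i.e. 5,000 of the 10,000 test images)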
I0513 14:23:48.984410 9312 caffe.cpp:304] Batch 48, accuracy = 0.98
I0513 14:23:48.984421 9312 caffe.cpp:304] Batch 48, loss = 0.0415336
I0513 14:23:48.985477 9312 caffe.cpp:304] Batch 49, accuracy = 1
I0513 14:23:48.985488 9312 caffe.cpp:304] Batch 49, loss = 0.004546
I0513 14:23:48.985494 9312 caffe.cpp:309] Loss: 0.0429299
I0513 14:23:48.985507 9312 caffe.cpp:321] accuracy = 0.9874
I0513 14:23:48.985515 9312 caffe.cpp:321] loss = 0.0429299 (* 1 = 0.0429299 loss)
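The final `accuracy` and `loss` lines are averages of the per-batch values printed before them. The sketch below recomputes such averages from the `Batch N, name = value` lines; it assumes the log format shown above, and `mean_batch_scores` is a hypothetical helper name, not a Caffe function.

```python
import re
from collections import defaultdict

# Matches lines such as "Batch 48, accuracy = 0.98".
BATCH_RE = re.compile(r"Batch (\d+), (\w+) = ([\d.eE+-]+)")


def mean_batch_scores(text):
    """Average every 'Batch N, <name> = <value>' score, grouped by name."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for line in text.splitlines():
        m = BATCH_RE.search(line)
        if m:
            name = m.group(2)
            totals[name] += float(m.group(3))
            counts[name] += 1
    return {name: totals[name] / counts[name] for name in totals}


# Only the last two batches appear in the log excerpt above, so this
# average covers those two batches, not the whole test run.
sample = """\
I0513 14:23:48.984410 9312 caffe.cpp:304] Batch 48, accuracy = 0.98
I0513 14:23:48.984421 9312 caffe.cpp:304] Batch 48, loss = 0.0415336
I0513 14:23:48.985477 9312 caffe.cpp:304] Batch 49, accuracy = 1
I0513 14:23:48.985488 9312 caffe.cpp:304] Batch 49, loss = 0.004546
"""
scores = mean_batch_scores(sample)
print(scores)
```

Fed the complete log of a test run, the same function should reproduce the summary values printed at the end.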
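II. The files used in training and testing, in detail
1. The MNIST dataset
A reference page for MNIST: http://yann.lecun.com/exdb/mnist/
To look at the contents of the dataset, the following code can be used: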
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import matplotlib.pyplot as plt
import os
import struct
import numpy as np


def load_mnist(path, kind='train'):
    # Load MNIST data from 'path'; kind is the file-name prefix, e.g. 'train'.
    # Note: get_mnist.sh leaves files named like train-labels-idx1-ubyte,
    # so adjust the two patterns below if your file names differ.
    labels_path = os.path.join(path,
                               '%s-labels.idx1-ubyte' % kind)
    images_path = os.path.join(path,
                               '%s-images.idx3-ubyte' % kind)
    # labels holds the target variable: the class label of each
    # handwritten digit (an integer 0-9)
    with open(labels_path, 'rb') as lbpath:
        magic, n = struct.unpack('>II', lbpath.read(8))
        labels = np.fromfile(lbpath, dtype=np.uint8)
    # images is an n x m NumPy array: n samples (rows), m features (columns)
    with open(images_path, 'rb') as imgpath:
        magic, num, rows, cols = struct.unpack('>IIII', imgpath.read(16))
        images = np.fromfile(imgpath,
                             dtype=np.uint8).reshape(len(labels), rows * cols)
    return images, labels


def get_labelimage():
    # Show the first training image of each digit 0-9 in a 2x5 grid
    images, labels = load_mnist('', kind='train')
    fig, ax = plt.subplots(
        nrows=2,
        ncols=5,
        sharex=True,
        sharey=True)
    ax = ax.flatten()
    for i in range(10):
        img = images[labels == i][0].reshape(28, 28)
        ax[i].imshow(img, cmap='Greys', interpolation='nearest')
    ax[0].set_xticks([])
    ax[0].set_yticks([])
    plt.tight_layout()
    plt.show()


def get_trainimage():
    # Show the first 25 training images of the digit 7 in a 5x5 grid
    images, labels = load_mnist('', kind='train')
    fig, ax1 = plt.subplots(
        nrows=5,
        ncols=5,
        sharex=True,
        sharey=True)
    ax1 = ax1.flatten()
    for i in range(25):
        img = images[labels == 7][i].reshape(28, 28)
        ax1[i].imshow(img, cmap='Greys', interpolation='nearest')
    ax1[0].set_xticks([])
    ax1[0].set_yticks([])
    plt.tight_layout()
    plt.show()


if __name__ == '__main__':
    get_labelimage()
    get_trainimage()
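The loader above depends on the IDX layout: a big-endian header followed by raw unsigned bytes. To see that layout without the real files, this sketch builds a tiny label file and image file in memory and parses them the same way. The magic numbers 2049 (labels) and 2051 (images) are the ones given on the MNIST page; the helper names `read_idx_labels` and `read_idx_images` are made up for this illustration.

```python
import io
import struct

import numpy as np


def read_idx_labels(stream):
    """Parse an IDX1 label stream: '>II' header, then one uint8 per label."""
    magic, n = struct.unpack(">II", stream.read(8))
    if magic != 2049:
        raise ValueError("unexpected label magic number: %d" % magic)
    return np.frombuffer(stream.read(n), dtype=np.uint8)


def read_idx_images(stream):
    """Parse an IDX3 image stream: '>IIII' header, then rows*cols bytes per image."""
    magic, n, rows, cols = struct.unpack(">IIII", stream.read(16))
    if magic != 2051:
        raise ValueError("unexpected image magic number: %d" % magic)
    data = np.frombuffer(stream.read(n * rows * cols), dtype=np.uint8)
    return data.reshape(n, rows * cols)


# Two fake 2x2 "images" with labels 7 and 3, written in the IDX layout.
label_bytes = struct.pack(">II", 2049, 2) + bytes([7, 3])
image_bytes = struct.pack(">IIII", 2051, 2, 2, 2) + bytes(range(8))

labels = read_idx_labels(io.BytesIO(label_bytes))
images = read_idx_images(io.BytesIO(image_bytes))
print(labels, images.shape)
```

Checking the magic number first is a cheap way to catch a wrong file (for example, passing the label file where the image file is expected) before the reshape fails with a less helpful error.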
2. Training the model involves two files: caffe/examples/mnist/lenet_solver.prototxt and examples/mnist/lenet_train_test.prototxt
:~/CaptureImage/mnist$ cat ../../caffe/examples/mnist/lenet_solver.prototxt
# The train/test net protocol buffer definition
net: "examples/mnist/lenet_train_test.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 10000
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "examples/mnist/model/lenet"
# solver mode: CPU or GPU
solver_mode: GPU
:~/CaptureImage/mnist$
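With lr_policy "inv", Caffe computes the learning rate as base_lr * (1 + gamma * iter) ^ (-power). As a quick check, this sketch plugs in the values from the solver file above and compares the result with the lr values printed in the training log. The function name `inv_lr` is chosen here for illustration, and the figure of 60,000 training images is the standard MNIST training-set size, not something stated in the solver file.

```python
def inv_lr(iteration, base_lr=0.01, gamma=0.0001, power=0.75):
    """Learning rate under the "inv" policy, with defaults from lenet_solver.prototxt."""
    return base_lr * (1.0 + gamma * iteration) ** (-power)


# The training log above prints lr = 0.00599102 at iteration 9800
# and lr = 0.00596843 at iteration 9900.
for iteration in (0, 5000, 9800, 9900):
    print(iteration, inv_lr(iteration))

# test_iter (100) times the TEST batch size (100) covers the whole test set.
test_images_per_pass = 100 * 100

# max_iter (10000) times the TRAIN batch size (64), divided by the
# 60,000 MNIST training images (assumption: standard MNIST split).
train_epochs = 10000 * 64 / 60000.0
print(test_images_per_pass, train_epochs)
```

The rate decays smoothly from 0.01 and never reaches zero, which is why the log still shows a rate near 0.006 in the final iterations.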
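The contents of examples/mnist/lenet_train_test.prototxt (the network definition file) are as follows:
name: "LeNet" # network name: LeNet
# Layer 1: Data layer (training data); each image is 28*28
layer {
name: "mnist"
type: "Data"
top: "data"
top: "label" # the data layer outputs two blobs, data and label (see the figure above)
include {
phase: TRAIN # this layer is active only in the training phase
}
transform_param { # data preprocessing
scale: 0.00390625 # feature scaling: 0.00390625 = 1/256, which maps MNIST pixel values from [0,255] into [0,1)
}
data_param {
source: "examples/mnist/mnist_train_lmdb" # path to the LMDB database used for training
batch_size: 64 # batch size: the number of images Caffe reads from the LMDB at a time
backend: LMDB # as opposed to the LevelDB backend
}
}
# Layer 2: Data layer (test data)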
layer {
name: "mnist"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST # this layer is active only in the test phase
}
transform_param {
scale: 0.00390625
}
data_param {
source: "examples/mnist/mnist_test_lmdb"
batch_size: 100
backend: LMDB
}
}
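# Layer 3: Convolution layer (first convolutional layer)
# A convolution layer convolves the input image with a set of trainable kernels (the counterpart of spatial filters).
# Each kernel produces one feature map in the output: here 20 different kernels are applied to the input image,
# yielding 20 filtered feature maps.
# Output size: (img_h + 2*pad_h - kernel_h)/stride_h + 1 = (28 + 2*0 - 5)/1 + 1 = 24, i.e. 24*24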
layer {
name: "conv1"
type: "Convolution"
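bottom: "data" # the convolution layer's input is data
top: "conv1" # the convolution layer's output is conv1
param { # learning-rate settings
lr_mult: 1 # learning-rate multiplier for the weights; 1 means the same rate as the global setting in the solver
}
param {
lr_mult: 2 # learning-rate multiplier for the biases: twice the global setting in the solver
}
convolution_param { # convolution parameters
num_output: 20 # 20 output feature maps, hence 20 convolution kernels
kernel_size: 5 # the kernel size is 5*5
stride: 1 # the kernel slides over the input with a stride of 1
weight_filler {
type: "xavier" # xavier initialization: picks the weight scale automatically from the layer's size (by default its number of inputs)
}
bias_filler {
type: "constant" # initialize the biases to a constant, 0 by default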
}
}
}
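# Layer 4: Pooling layer (first pooling layer)
# Pooling gives some invariance to small shifts and reduces the number of inputs to the next layer
# Output size: 12*12
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1" # the pooling layer's input is conv1
top: "pool1" # the pooling layer's output is pool1
pooling_param { # pooling parameters
pool: MAX # three pooling methods are available: max (MAX), average (AVE), and stochastic (STOCHASTIC)
kernel_size: 2 # width and height of the pooling window: 2*2
stride: 2 # the pooling window slides over the input with a stride of 2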
}
}
# Layer 5: Convolution layer (second convolutional layer)
# This layer outputs 50 feature maps
# Output size: (12 + 2*0 - 5)/1 + 1 = 8, i.e. 8*8
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 50
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
# Layer 6: Pooling layer (second pooling layer); output size: (8 - 2)/2 + 1 = 4, i.e. 4*4
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
# Layer 7: InnerProduct layer (first fully connected layer)
# The number of output nodes (num_output = 500) can be thought of as the number of filters,
# each of which produces one output feature
layer {
name: "ip1"
type: "InnerProduct"
bottom: "pool2"
top: "ip1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 500
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
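# Layer 8: ReLU layer (non-linearity); the activation function is the Rectified Linear Unit (ReLU)
# Its input blob is ip1 and its output blob is also ip1, so it is computed in place
# For each output x of the fully connected layer: when x > 0, ReLU outputs x,
# and the size of x indicates how strongly the unit is activated; when x <= 0, the output is 0 and the signal is fully suppressed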
layer {
name: "relu1"
type: "ReLU"
bottom: "ip1"
top: "ip1"
}
# Layer 9: InnerProduct layer (second fully connected layer)
layer {
name: "ip2"
type: "InnerProduct"
bottom: "ip1"
top: "ip2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 10 # 10 outputs, one for each digit class 0-9
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
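# Layer 10: Accuracy layer (classification accuracy)
# Computes the accuracy of the network output against the target labels
# Its input blobs are ip2 and label; its output blob is accuracy
# It is active only in the TEST phase, and because it is not a loss layer it has no backward (backpropagation, BP) step
# Backpropagation has been central to the development of deep learning; for details see:
# https://blog.csdn.net/earl211/article/details/52300480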
layer {
name: "accuracy"
type: "Accuracy"
bottom: "ip2"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
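# Layer 11: Loss layer
# Layer type: SoftmaxWithLoss. The softmax loss is generally used for multi-class classification problems;
# it is conceptually the same as a softmax layer followed by a multinomial logistic loss layer, but provides a more numerically stable gradient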
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "ip2"
bottom: "label"
top: "loss"
}
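The output sizes quoted in the annotations can be checked by walking the layers in order. This sketch applies the (size + 2*pad - kernel)/stride + 1 formula to the kernel sizes, strides and num_output values from the listing above, and also counts the learnable parameters per layer. The names `out_size`, `LAYERS` and `trace` are illustrative only, and the 1-channel 28*28 input is the MNIST image shape.

```python
def out_size(size, kernel, stride=1, pad=0):
    """Spatial output size: (size + 2*pad - kernel) // stride + 1.

    Caffe's pooling layer rounds up rather than down; both agree here
    because every division in this network is exact.
    """
    return (size + 2 * pad - kernel) // stride + 1


# (name, kind, num_output, kernel, stride), copied from lenet_train_test.prototxt
LAYERS = [
    ("conv1", "conv", 20, 5, 1),
    ("pool1", "pool", None, 2, 2),
    ("conv2", "conv", 50, 5, 1),
    ("pool2", "pool", None, 2, 2),
    ("ip1", "fc", 500, None, None),
    ("ip2", "fc", 10, None, None),
]


def trace(channels=1, size=28):
    """Return per-layer output shapes (C, H, W) and parameter counts."""
    shapes, params = {}, {}
    for name, kind, num_output, kernel, stride in LAYERS:
        if kind == "conv":
            # weights: num_output * in_channels * k * k, plus one bias per output
            params[name] = num_output * channels * kernel * kernel + num_output
            channels, size = num_output, out_size(size, kernel, stride)
        elif kind == "pool":
            size = out_size(size, kernel, stride)
        else:
            # fully connected: flattens channels * size * size inputs
            params[name] = channels * size * size * num_output + num_output
            channels, size = num_output, 1
        shapes[name] = (channels, size, size)
    return shapes, params


shapes, params = trace()
for name, _, _, _, _ in LAYERS:
    print(name, shapes[name], params.get(name, 0))
print("total parameters:", sum(params.values()))
```

Most of the parameters sit in ip1, which connects the 50*4*4 = 800 values coming out of pool2 to 500 outputs.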