Caffe Example: Training and Testing on the MNIST Handwritten Digits (Windows + CPU Only)

1. Original English tutorial: Training LeNet on MNIST with Caffe


Basic environment: Windows, CPU only.


2. Program Preparation

2.1 Preparing Caffe

Download the prebuilt release and the source code from https://github.com/BVLC/caffe/tree/windows. After unpacking, copy the files from the bin directory of the prebuilt release into the downloaded Caffe source directory (this just makes the later commands simpler; you can also skip the copy and adjust the paths instead).

caffe.exe and convert_mnist_data.exe, which we need to run, come from the prebuilt release, while the examples directory comes from the source code, so both downloads are required.


2.2 Installing Cygwin

Install Cygwin (the Windows version); it is only needed to run the .sh scripts.

3. Data Preparation

3.1 Training Data

Download the four files from http://yann.lecun.com/exdb/mnist/ and decompress them. Note that decompression tools may rename the files automatically; rename them by hand so that they match the names below exactly (a quick check script is sketched after the list).


train-images-idx3-ubyte:  training set images (9912422 bytes)

train-labels-idx1-ubyte:  training set labels (28881 bytes)

t10k-images-idx3-ubyte:   test set images (1648877 bytes)

t10k-labels-idx1-ubyte:   test set labels (4542 bytes)
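
If you want to confirm that decompression and renaming went well, a small header check like the following can help. It is only a sketch, assuming the four files sit under data/mnist/ (the location create_mnist.sh below expects); it just reads the IDX magic numbers and item counts.

# Sanity-check the decompressed MNIST IDX files: open each one and read its header.
import struct

FILES = {
    'data/mnist/train-images-idx3-ubyte': 2051,  # IDX image files start with magic 2051
    'data/mnist/train-labels-idx1-ubyte': 2049,  # IDX label files start with magic 2049
    'data/mnist/t10k-images-idx3-ubyte': 2051,
    'data/mnist/t10k-labels-idx1-ubyte': 2049,
}

for path, expected_magic in FILES.items():
    with open(path, 'rb') as f:
        magic, count = struct.unpack('>ii', f.read(8))  # big-endian int32 magic + item count
    status = 'OK' if magic == expected_magic else 'WRONG MAGIC'
    print('%-45s magic=%d items=%d %s' % (path, magic, count, status))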

 

3.2 create_mnist.sh

create_mnist.sh converts the downloaded data into LMDB format (the reasons for using LMDB are easy to find online). Before running it, edit examples\mnist\create_mnist.sh to set the input and output directories. The modified script looks like this:

#!/usr/bin/env sh
# This script converts the mnist data into lmdb/leveldb format,
# depending on the value assigned to $BACKEND.
set -e

EXAMPLE=examples/mnist
DATA=data/mnist
BUILD=bin   # directory containing convert_mnist_data.exe

BACKEND="lmdb"

echo "Creating ${BACKEND}..."

rm -rf $EXAMPLE/mnist_train_${BACKEND}
rm -rf $EXAMPLE/mnist_test_${BACKEND}

# .bin changed to .exe (the prebuilt release ships .exe files, not .bin)
$BUILD/convert_mnist_data.exe $DATA/train-images-idx3-ubyte \
  $DATA/train-labels-idx1-ubyte $EXAMPLE/mnist_train_${BACKEND} --backend=${BACKEND}
$BUILD/convert_mnist_data.exe $DATA/t10k-images-idx3-ubyte \
  $DATA/t10k-labels-idx1-ubyte $EXAMPLE/mnist_test_${BACKEND} --backend=${BACKEND}

echo "Done."

3.3 Generating the LMDB databases

In Cygwin, run bash examples/mnist/create_mnist.sh to generate the LMDB databases under examples/mnist/.
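
To double-check the conversion, the new databases can be opened from Python. This is only a sketch and assumes the lmdb Python package and pycaffe (for caffe.proto) are importable.

# Peek into the freshly generated training LMDB: entry count and the first record.
import lmdb
from caffe.proto import caffe_pb2

env = lmdb.open('examples/mnist/mnist_train_lmdb', readonly=True)
print('entries: %d' % env.stat()['entries'])   # 60000 for the MNIST training set

with env.begin() as txn:
    key, value = next(txn.cursor().iternext())
    datum = caffe_pb2.Datum()
    datum.ParseFromString(value)
    print('first key=%s label=%d shape=%dx%dx%d'
          % (key, datum.label, datum.channels, datum.height, datum.width))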



4. Training

Training is done with train_lenet.sh.

4.1 train_lenet.sh

#!/usr/bin/env sh
set -e
./bin/caffe train --solver=examples/mnist/lenet_solver.prototxt $@


First make sure the path to caffe.exe in this script matches where you actually put it.
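
If you would rather not go through Cygwin for this step, roughly the same thing can be driven from pycaffe. The snippet below is only a sketch, assuming pycaffe is on PYTHONPATH; also note that with a CPU-only build the solver_mode setting inside lenet_solver.prototxt must be CPU (the stock example file sets GPU).

# Rough pycaffe equivalent of train_lenet.sh for a CPU-only setup.
import caffe

caffe.set_mode_cpu()
solver = caffe.SGDSolver('examples/mnist/lenet_solver.prototxt')
solver.solve()   # trains up to max_iter and writes the same snapshot files as the shell script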

 




4.2 Running train_lenet.sh

In Cygwin, run

bash examples/mnist/train_lenet.sh

Training produces four files; a quick way to inspect the snapshots is sketched after the list.

lenet_iter_5000.caffemodel

lenet_iter_5000.solverstate

lenet_iter_10000.caffemodel

lenet_iter_10000.solverstate
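
The .caffemodel files hold the learned weights at iterations 5000 and 10000, and the .solverstate files allow training to be resumed from those points. As a sanity check, a snapshot can be inspected from Python; the sketch below assumes pycaffe is importable and that the snapshot uses the current (non-V1) layer format.

# List the layers in a snapshot that carry learned parameters, with their blob shapes.
from caffe.proto import caffe_pb2

net_param = caffe_pb2.NetParameter()
with open('examples/mnist/lenet_iter_10000.caffemodel', 'rb') as f:
    net_param.ParseFromString(f.read())

for layer in net_param.layer:
    if layer.blobs:
        shapes = [list(blob.shape.dim) for blob in layer.blobs]
        print('%-6s %-14s %s' % (layer.name, layer.type, shapes))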



5. Testing an Image

With the trained model you can now test which class a given image belongs to. (This section follows the blog post "caffe训练好的lenet_iter_10000.caffemodel测试单张mnist图片".)

5.1 Preparing the test parameters and support files

5.1.1 The deploy.prototxt file

Testing a single image with the trained caffemodel requires a deploy.prototxt file that describes the network structure. In fact deploy.prototxt is very similar to lenet_train_test.prototxt; only the beginning and the end differ. Following the tutorial at http://www.cnblogs.com/denny402/p/5685818.html, you can generate deploy.prototxt with a deploy.py script.

Using a ready-made file from someone else also works. Make sure deploy.prototxt declares the correct number of channels for your test image.

The resulting deploy.prototxt can be used directly:

name: "LeNet"
/*原来训练与测试两层数据层*/
/*layer {
 name: "mnist"
 type: "Data"
 top: "data"
 top: "label"
 include {
   phase: TRAIN
 }
 transform_param {
   scale: 0.00390625
 }
 data_param {
   source: "examples/mnist/mnist_train_lmdb"
   batch_size: 64
   backend: LMDB
 }
}
layer {
 name: "mnist"
 type: "Data"
 top: "data"
 top: "label"
 include {
   phase: TEST
 }
 transform_param {
   scale: 0.00390625
 }
 data_param {
   source: "examples/mnist/mnist_test_lmdb"
   batch_size: 100
   backend: LMDB
 }
}*/
 
/*被替换成如下*/
 
layer {
  name:"data"
 type: "Input"
 top: "data"
 input_param { shape: { dim: 1 dim: 1 dim: 28 dim: 28 } }
}
 
/*卷积层与全连接层中的权值学习率,偏移值学习率,偏移值初始化方式,因为这些值在caffemodel文件中已经提供*/
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
  }
}
 
# The accuracy layer used during testing is removed.

# The output layer type changes from SoftmaxWithLoss to Softmax: training outputs a loss,
# deployment outputs class probabilities (prob).
layer {
  name: "prob"
  type: "Softmax"
  bottom: "ip2"
  top: "prob"
}
 


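As an alternative to the classify.py route described in the rest of this section, the deploy file can be exercised directly through caffe.Net. This is only a sketch and assumes deploy.prototxt, lenet_iter_10000.caffemodel, and a 28x28 grayscale 1.jpg (white digit on black) sit in the current directory.

# Minimal forward pass through the deploy network (file names are assumptions).
import caffe

caffe.set_mode_cpu()
net = caffe.Net('deploy.prototxt', 'lenet_iter_10000.caffemodel', caffe.TEST)

img = caffe.io.load_image('1.jpg', color=False)        # HxWx1 float image in [0, 1]
net.blobs['data'].data[...] = img.transpose(2, 0, 1) * 255 * 0.00390625  # match training-time scale (pixels/256)
out = net.forward()
print('predicted digit: %d' % out['prob'].argmax())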


5.1.2 Preparing a mean file

The caffe.Classifier interface used by classify.py expects a mean file computed over the training images as an input parameter, but the LeNet-5 training here never computed one, so we feed it an all-zero mean file. (With --force_grayscale the modified script below skips mean subtraction anyway, so the file mainly satisfies the default argument.) Create a zeronp.py file as follows:

import numpy as np

# All-zero mean, shape (height, width, channels); LeNet-5 training never subtracted a mean.
zeros = np.zeros((28, 28, 1), dtype=np.float32)
np.save('meanfile.npy', zeros)
 


Run python zeronp.py

to generate the mean file meanfile.npy. Note that its width and height must match the test images you feed in. Reference: https://github.com/BVLC/caffe/issues/320

5.2 Preparing the classification script classifymnist.py

Modify classify.py and save the result as classifymnist.py (make sure the paths inside the file are correct relative to the directory you run it from):

#!/usr/bin/env python
"""
classify.py is an out-of-the-box image classifier callable from the command line.

By default it configures and runs the Caffe reference ImageNet model.
"""
import numpy as np
import os
import sys
import argparse
import glob
import time
import pandas as pd  # added: pandas is used to format the printed results

import caffe
 
def main(argv):
    pycaffe_dir = os.path.dirname(__file__)
 
    parser = argparse.ArgumentParser()
    # Required arguments: input and output files.
    parser.add_argument(
        "input_file",
        help="Input image, directory, or npy."
    )
    parser.add_argument(
        "output_file",
        help="Output npy filename."
    )
    # Optional arguments.
    parser.add_argument(
        "--model_def",
        default=os.path.join(pycaffe_dir,
                "deploy.prototxt"),  # location of the LeNet-5 deploy.prototxt
        help="Model definition file."
    )
    parser.add_argument(
        "--pretrained_model",
        default=os.path.join(pycaffe_dir,
                "lenet_iter_10000.caffemodel"),  # location of the trained LeNet-5 caffemodel
        help="Trained model weights file."
    )
####### added (begin) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    parser.add_argument(
        "--labels_file",
        default=os.path.join(pycaffe_dir,
                "synset_words.txt"),  # file mapping output indices to class names
        help="mnist result words file"
    )
    parser.add_argument(
        "--force_grayscale",
        action='store_true',  # force the input image to grayscale, since LeNet-5 was trained on grayscale images
        help="Converts RGB images down to single-channel grayscale versions, " +
             "useful for single-channel networks like MNIST."
    )
    parser.add_argument(
        "--print_results",
        action='store_true',  # print the classification results to the terminal
        help="Write output text to stdout rather than serializing to a file."
    )
####### added (end) vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
    parser.add_argument(
        "--gpu",
        action='store_true',
        help="Switch for gpu computation."
    )
    parser.add_argument(
        "--center_only",
        action='store_true',
        help="Switch for prediction from center crop alone instead of " +
             "averaging predictions across crops (default)."
    )
    parser.add_argument(
        "--images_dim",
        default='28,28',    # height and width of the input images
        help="Canonical 'height,width' dimensions of input images."
    )
    parser.add_argument(
        "--mean_file",
        default=os.path.join(pycaffe_dir,
                             'meanfile.npy'),  # mean file created by zeronp.py
        help="Data set image mean of [Channels x Height x Width] dimensions " +
             "(numpy array). Set to '' for no mean subtraction."
    )
    parser.add_argument(
        "--input_scale",
        type=float,
        help="Multiply input features by this scale to finish preprocessing."
    )
    parser.add_argument(
        "--raw_scale",
        type=float,
        default=255.0,
        help="Multiply raw input by this scale before preprocessing."
    )
    parser.add_argument(
        "--channel_swap",
        default='2,1,0',
        help="Order to permute input channels. The default converts " +
             "RGB -> BGR since BGR is the Caffe default by way of OpenCV."
    )
    parser.add_argument(
        "--ext",
        default='jpg',
        help="Image file extension to take as input when a directory " +
             "is given as the input file."
    )
    args = parser.parse_args()
 
    image_dims = [int(s) for s in args.images_dim.split(',')]

    mean, channel_swap = None, None
    if not args.force_grayscale:
        if args.mean_file:
            mean = np.load(args.mean_file).mean(1).mean(1)
        if args.channel_swap:
            channel_swap = [int(s) for s in args.channel_swap.split(',')]
 
    if args.gpu:
        caffe.set_mode_gpu()
        print("GPU mode")
    else:
        caffe.set_mode_cpu()
        print("CPU mode")
 
    # Make classifier.
    classifier = caffe.Classifier(args.model_def, args.pretrained_model,
            image_dims=image_dims, mean=mean,
            input_scale=args.input_scale, raw_scale=args.raw_scale,
            channel_swap=channel_swap)

    # Load numpy array (.npy), directory glob (*.jpg), or image file.
    args.input_file = os.path.expanduser(args.input_file)
    if args.input_file.endswith('npy'):
        print("Loading file: %s" % args.input_file)
        inputs = np.load(args.input_file)
    elif os.path.isdir(args.input_file):
        print("Loading folder: %s" % args.input_file)
        inputs = [caffe.io.load_image(im_f)
                  for im_f in glob.glob(args.input_file + '/*.' + args.ext)]
    else:
        print("Loading image file: %s" % args.input_file)
        inputs = [caffe.io.load_image(args.input_file, not args.force_grayscale)]  # load as grayscale when forced
 
    print("Classifying %d inputs." % len(inputs))
 
    # Classify.
    start = time.time()
    scores = classifier.predict(inputs, not args.center_only).flatten()
    print("Done in %.2f s." % (time.time() - start))

    # added: print the results to the terminal ^^^^^^^^
    if args.print_results:
        with open(args.labels_file) as f:
            labels_df = pd.DataFrame([
                {'synset_id': l.strip().split(' ')[0],
                 'name': ' '.join(l.strip().split(' ')[1:]).split(',')[0]}
                for l in f.readlines()])
            labels = labels_df.sort('synset_id')['name'].values  # on newer pandas use sort_values('synset_id')

            indices = (-scores).argsort()[:5]
            predictions = labels[indices]
            print(predictions)
            print(scores)

            meta = [(p, '%.5f' % scores[i]) for i, p in zip(indices, predictions)]
            print(meta)
    # added: print the results to the terminal vvvvvvvvvvv

    # Save
    print("Saving results into %s" % args.output_file)
    np.save(args.output_file, scores)  # save the raw class scores
 
 
if __name__ =='__main__':
    main(sys.argv)



Run it with python classifymnist.py plus the arguments shown in 5.5.

(numpy and pandas need to be installed in addition to pycaffe.)


5.3 Preparing the label file synset_words.txt

synset_words.txt maps each output class index to a name:

0 Zero
1 One
2 Two
3 Three
4 Four
5 Five
6 Six
7 Seven
8 Eight
9 Nine

5.4 Preparing a test image


1.jpg is the 28x28 image to be recognized. Note that it must be a white digit on a black background (as in MNIST); otherwise it will not be recognized. If you do not have a suitable image at hand, one can be cut straight out of the downloaded test set, as sketched below.
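
This is a minimal sketch using Pillow (an extra dependency not otherwise used in this post); it writes the first digit of t10k-images-idx3-ubyte to 1.jpg, which already has the white-on-black polarity the network expects.

# Save the first test-set digit as a 28x28 grayscale JPEG named 1.jpg.
import struct
import numpy as np
from PIL import Image

with open('data/mnist/t10k-images-idx3-ubyte', 'rb') as f:
    magic, count, rows, cols = struct.unpack('>iiii', f.read(16))
    first = np.frombuffer(f.read(rows * cols), dtype=np.uint8).reshape(rows, cols).copy()

Image.fromarray(first, mode='L').save('1.jpg')   # white digit on a black background, as in MNIST
print('wrote 1.jpg (%dx%d), one of %d test images' % (cols, rows, count))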


5.5 Preparing a batch file

For convenience, write a batch file runtest.bat containing:

python classifymnist.py --print_results --force_grayscale --center_only --labels_file synset_words.txt 1.jpg resultsfile


Here 1.jpg is the 28x28 test image from 5.4, and resultsfile is the file the scores are saved to.

6. Output

Running runtest.bat prints the top-5 predicted labels, the raw score vector, and the (label, score) pairs to the terminal, and saves the scores to the output file given on the command line.
