Caffe in Practice, Part 2: Preparing Data, Training, and Testing

cv.exp

Article link: https://blog.csdn.net/forest_world/article/details/51376554

1. Preparing the Sample Data

Download the MNIST data files.
This version of the dataset consists of four files.

learning@learning-virtual-machine:~/caffe/data/mnist$ ./get_mnist.sh

learning@learning-virtual-machine:~/caffe/data/mnist$ ls
get_mnist.sh
learning@learning-virtual-machine:~/caffe/data/mnist$ ./get_mnist.sh 
Downloading...
--2016-05-11 17:35:10--  http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Resolving yann.lecun.com (yann.lecun.com)... 128.122.47.89
Connecting to yann.lecun.com (yann.lecun.com)|128.122.47.89|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 9912422 (9.5M) [application/x-gzip]
Saving to: ‘train-images-idx3-ubyte.gz’

train-images-idx 100%[===========>]   9.45M   225KB/s   in 73s    

2016-05-11 17:36:24 (133 KB/s) - ‘train-images-idx3-ubyte.gz’ saved [9912422/9912422]

--2016-05-11 17:36:38--  http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Resolving yann.lecun.com (yann.lecun.com)... 128.122.47.89
Connecting to yann.lecun.com (yann.lecun.com)|128.122.47.89|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 28881 (28K) [application/x-gzip]
Saving to: ‘train-labels-idx1-ubyte.gz’

train-labels-idx 100%[===========>]  28.20K  1.41KB/s   in 7.5s   

2016-05-11 17:36:49 (3.75 KB/s) - ‘train-labels-idx1-ubyte.gz’ saved [28881/28881]

--2016-05-11 17:36:49--  http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Resolving yann.lecun.com (yann.lecun.com)... 128.122.47.89
Connecting to yann.lecun.com (yann.lecun.com)|128.122.47.89|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1648877 (1.6M) [application/x-gzip]
Saving to: ‘t10k-images-idx3-ubyte.gz’

t10k-images-idx3 100%[===========>]   1.57M  71.9KB/s   in 19s    

2016-05-11 17:37:08 (85.5 KB/s) - ‘t10k-images-idx3-ubyte.gz’ saved [1648877/1648877]

--2016-05-11 17:37:09--  http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Resolving yann.lecun.com (yann.lecun.com)... 128.122.47.89
Connecting to yann.lecun.com (yann.lecun.com)|128.122.47.89|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4542 (4.4K) [application/x-gzip]
Saving to: ‘t10k-labels-idx1-ubyte.gz’

t10k-labels-idx1 100%[===========>]   4.44K  --.-KB/s   in 0s     

2016-05-11 17:37:09 (31.5 MB/s) - ‘t10k-labels-idx1-ubyte.gz’ saved [4542/4542]

learning@learning-virtual-machine:~/caffe/data/mnist$

The get_mnist.sh script:

#!/usr/bin/env sh
# This script downloads the mnist data and unzips it.

DIR="$( cd "$(dirname "$0")" ; pwd -P )"
cd "$DIR"

echo "Downloading..."

for fname in train-images-idx3-ubyte train-labels-idx1-ubyte t10k-images-idx3-ubyte t10k-labels-idx1-ubyte
do
    if [ ! -e $fname ]; then
        wget --no-check-certificate http://yann.lecun.com/exdb/mnist/${fname}.gz
        gunzip ${fname}.gz
    fi
done
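
After the download, it can be useful to confirm that each unzipped file really is an IDX file with the expected number of items. The sketch below assumes the standard IDX layout (a 4-byte big-endian magic number, 2051 for images and 2049 for labels, followed by a 4-byte big-endian item count). The helper name idx_header is made up for illustration and is not part of Caffe.

```sh
#!/usr/bin/env sh
# idx_header FILE: print "<magic> <item count>" read from an IDX file header.
# Hypothetical helper, not part of Caffe.
idx_header() {
    # Read the first 8 bytes as unsigned decimal values.
    set -- $(od -An -tu1 -N8 "$1")
    # Both header fields are stored big-endian.
    magic=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    count=$(( ($5 << 24) | ($6 << 16) | ($7 << 8) | $8 ))
    echo "$magic $count"
}

# Report on whichever of the four files are present (paths relative to the Caffe root).
for f in data/mnist/train-images-idx3-ubyte data/mnist/train-labels-idx1-ubyte \
         data/mnist/t10k-images-idx3-ubyte data/mnist/t10k-labels-idx1-ubyte
do
    if [ -f "$f" ]; then
        echo "$f: $(idx_header "$f")"
    fi
done
```

For the training images the expected result is magic 2051 with 60000 items, and for the test images 2051 with 10000 items.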

Run ./examples/mnist/create_mnist.sh from the Caffe root directory (the script uses paths relative to it).

The create_mnist.sh script:

#!/usr/bin/env sh
# This script converts the mnist data into lmdb/leveldb format,
# depending on the value assigned to $BACKEND.

EXAMPLE=examples/mnist
DATA=data/mnist
BUILD=build/examples/mnist

BACKEND="lmdb"

echo "Creating ${BACKEND}..."

rm -rf $EXAMPLE/mnist_train_${BACKEND}
rm -rf $EXAMPLE/mnist_test_${BACKEND}

$BUILD/convert_mnist_data.bin $DATA/train-images-idx3-ubyte \
  $DATA/train-labels-idx1-ubyte $EXAMPLE/mnist_train_${BACKEND} --backend=${BACKEND}
$BUILD/convert_mnist_data.bin $DATA/t10k-images-idx3-ubyte \
  $DATA/t10k-labels-idx1-ubyte $EXAMPLE/mnist_test_${BACKEND} --backend=${BACKEND}

echo "Done."

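create_mnist.sh uses the convert_mnist_data.bin tool in caffe/build/examples/mnist/
to convert the MNIST data into LMDB format, which Caffe can read directly.
The two resulting databases, mnist_train_lmdb and mnist_test_lmdb (each one a directory), are written to examples/mnist, the same directory as create_mnist.sh.
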
learning@learning-virtual-machine:~/caffe$ ./examples/mnist/create_mnist.sh
Creating lmdb...
I0511 17:50:53.378334 63891 db_lmdb.cpp:35] Opened lmdb examples/mnist/mnist_train_lmdb
I0511 17:50:53.382064 63891 convert_mnist_data.cpp:88] A total of 60000 items.
I0511 17:50:53.382319 63891 convert_mnist_data.cpp:89] Rows: 28 Cols: 28
I0511 17:51:26.376051 63891 convert_mnist_data.cpp:108] Processed 60000 files.
I0511 17:51:26.533220 63894 db_lmdb.cpp:35] Opened lmdb examples/mnist/mnist_test_lmdb
I0511 17:51:26.534319 63894 convert_mnist_data.cpp:88] A total of 10000 items.
I0511 17:51:26.534453 63894 convert_mnist_data.cpp:89] Rows: 28 Cols: 28
I0511 17:51:31.699584 63894 convert_mnist_data.cpp:108] Processed 10000 files.
Done.
learning@learning-virtual-machine:~/caffe$
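
A quick sanity check before training is to confirm that both LMDB directories exist and contain data. This is a minimal sketch, assuming the usual LMDB layout in which the database directory holds a data.mdb file. The function name lmdb_ready is hypothetical.

```sh
#!/usr/bin/env sh
# lmdb_ready DIR: succeed if DIR is a directory holding a non-empty data.mdb.
# Hypothetical helper, not part of Caffe.
lmdb_ready() {
    [ -d "$1" ] && [ -s "$1/data.mdb" ]
}

# Check the two databases that create_mnist.sh writes (run from the Caffe root).
for db in examples/mnist/mnist_train_lmdb examples/mnist/mnist_test_lmdb
do
    if lmdb_ready "$db"; then
        echo "$db: ok"
    else
        echo "$db: missing or empty"
    fi
done
```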

2. Training

learning@learning-virtual-machine:~/caffe$ ./examples/mnist/train_lenet.sh
This fails with the following error:
I0511 17:52:25.115056 63914 caffe.cpp:185] Using GPUs 0
F0511 17:52:25.116345 63914 common.cpp:66] Cannot use GPU in CPU-only Caffe: check mode.
*** Check failure stack trace: ***
@ 0x7f6c824c65cd google::LogMessage::Fail()
@ 0x7f6c824c8433 google::LogMessage::SendToLog()
@ 0x7f6c824c615b google::LogMessage::Flush()
@ 0x7f6c824c8e1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7f6c8284c7f0 caffe::Caffe::SetDevice()
@ 0x40a1f3 train()
@ 0x406e80 main
@ 0x7f6c81753a40 __libc_start_main
@ 0x407539 _start
@ (nil) (unknown)
Aborted (core dumped)
learning@learning-virtual-machine:~/caffe$

Solution: this build of Caffe is CPU-only, so open lenet_solver.prototxt and change solver_mode from GPU to CPU:

learning@learning-virtual-machine:~/caffe/examples/mnist$ sudo gedit lenet_solver.prototxt
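
The same edit can be made without a GUI editor. The sketch below rewrites the solver_mode line with sed. The function name set_solver_mode is invented for illustration, and it assumes the solver file has a single line that starts with "solver_mode:".

```sh
#!/usr/bin/env sh
# set_solver_mode FILE MODE: set the solver_mode line of FILE to MODE (CPU or GPU).
# Hypothetical helper, not part of Caffe.
set_solver_mode() {
    tmp="$1.tmp.$$"
    # Write to a temporary file first so a failed sed cannot truncate the original.
    sed "s/^solver_mode:.*/solver_mode: $2/" "$1" > "$tmp" && mv "$tmp" "$1"
}
```

From the Caffe root this would be called as: set_solver_mode examples/mnist/lenet_solver.prototxt CPU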

learning@learning-virtual-machine:~/caffe$ ./examples/mnist/train_lenet.sh

The train_lenet.sh file:

#!/usr/bin/env sh

./build/tools/caffe train --solver=examples/mnist/lenet_solver.prototxt

The lenet_solver.prototxt file (annotated):

# The train/test net protocol buffer definition
# (the file that defines the network itself)
net: "examples/mnist/lenet_train_test.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
# (100 images per batch x 100 test iterations = 10,000 images)
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy. Options include "fixed" (constant rate) and
# "step" (rate lowered in steps); "inv" is used here.
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 10000
# snapshot intermediate results: save every 5000 iterations,
# using snapshot_prefix as the path prefix
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet"
# solver mode: CPU or GPU
solver_mode: CPU
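
With lr_policy "inv", Caffe computes the learning rate as base_lr * (1 + gamma * iter) ^ (-power). The sketch below evaluates that formula with awk so the values in the training log can be checked by hand. The function name inv_lr is hypothetical. Caffe itself works in single precision, so the last digits may differ slightly.

```sh
#!/usr/bin/env sh
# inv_lr BASE_LR GAMMA POWER ITER: learning rate under the "inv" policy.
# Hypothetical helper, not part of Caffe.
inv_lr() {
    awk -v b="$1" -v g="$2" -v p="$3" -v i="$4" \
        'BEGIN { printf "%.8f\n", b * exp(-p * log(1 + g * i)) }'
}

# Values from lenet_solver.prototxt, evaluated at iteration 9900.
inv_lr 0.01 0.0001 0.75 9900   # prints 0.00596843
```

This agrees with the "Iteration 9900, lr = 0.00596843" line in the training log.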

Training finished:

... loss = 0.0069052 (* 1 = 0.0069052 loss)
I0512 08:48:30.908164  2545 sgd_solver.cpp:106] Iteration 9900, lr = 0.00596843
I0512 08:48:50.392801  2545 solver.cpp:454] Snapshotting to binary proto file examples/mnist/lenet_iter_10000.caffemodel
I0512 08:48:50.468799  2545 sgd_solver.cpp:273] Snapshotting solver state to binary proto file examples/mnist/lenet_iter_10000.solverstate
I0512 08:48:50.574422  2545 solver.cpp:317] Iteration 10000, loss = 0.00475874
I0512 08:48:50.574723  2545 solver.cpp:337] Iteration 10000, Testing net (#0)
I0512 08:49:05.222337  2545 solver.cpp:404]     Test net output #0: accuracy = 0.9912
I0512 08:49:05.223168  2545 solver.cpp:404]     Test net output #1: loss = 0.0288119 (* 1 = 0.0288119 loss)
I0512 08:49:05.223314  2545 solver.cpp:322] Optimization Done.
I0512 08:49:05.223942  2545 caffe.cpp:222] Optimization Done.
learning@learning-virtual-machine:~/caffe$

3. Testing

./build/tools/caffe.bin test -model=examples/mnist/lenet_train_test.prototxt -weights=examples/mnist/lenet_iter_10000.caffemodel

For an analysis of lenet_train_test.prototxt, see:

http://blog.csdn.net/forest_world/article/details/51381522

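test: runs testing on the trained model rather than training it. The other commands are train, time, and device_query.
-model=XXX: specifies the model prototxt file, a text file that describes the network structure and the data sources in detail.
-weights=XXX: specifies the trained weights file (.caffemodel) to load.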
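
Note that caffe test runs a fixed number of batches set by its -iterations flag, which defaults to 50. With a test batch size of 100 that covers only 5,000 of the 10,000 test images, which is why the log for the command above stops at Batch 49. The sketch below computes the number of iterations needed for full coverage and prints the matching command. The function name test_iterations is hypothetical, and the batch size of 100 is taken from the solver comments rather than read from the prototxt.

```sh
#!/usr/bin/env sh
# test_iterations NUM_IMAGES BATCH_SIZE: batches needed to cover every image.
# Hypothetical helper, not part of Caffe.
test_iterations() {
    echo $(( ($1 + $2 - 1) / $2 ))   # ceiling division
}

# Print (not run) a test command that covers all 10,000 MNIST test images.
iters=$(test_iterations 10000 100)
echo "./build/tools/caffe test -model=examples/mnist/lenet_train_test.prototxt" \
     "-weights=examples/mnist/lenet_iter_10000.caffemodel -iterations=$iters"
```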

I0512 09:18:41.455747  3503 caffe.cpp:275] Batch 44, loss = 0.0137619
I0512 09:18:41.671058  3503 caffe.cpp:275] Batch 45, accuracy = 0.99
I0512 09:18:41.671362  3503 caffe.cpp:275] Batch 45, loss = 0.0446652
I0512 09:18:41.910468  3503 caffe.cpp:275] Batch 46, accuracy = 1
I0512 09:18:41.910781  3503 caffe.cpp:275] Batch 46, loss = 0.00462838
I0512 09:18:42.082020  3503 caffe.cpp:275] Batch 47, accuracy = 0.99
I0512 09:18:42.082260  3503 caffe.cpp:275] Batch 47, loss = 0.0215265
I0512 09:18:42.297307  3503 caffe.cpp:275] Batch 48, accuracy = 0.96
I0512 09:18:42.301200  3503 caffe.cpp:275] Batch 48, loss = 0.0964929
I0512 09:18:42.576354  3503 caffe.cpp:275] Batch 49, accuracy = 1
I0512 09:18:42.576627  3503 caffe.cpp:275] Batch 49, loss = 0.00345927
I0512 09:18:42.576732  3503 caffe.cpp:280] Loss: 0.0427004
I0512 09:18:42.576843  3503 caffe.cpp:292] accuracy = 0.9872
I0512 09:18:42.576954  3503 caffe.cpp:292] loss = 0.0427004 (* 1 = 0.0427004 loss)
learning@learning-virtual-machine:~/caffe$
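
The final "accuracy" line is the mean of the per-batch accuracies. As a rough cross-check, the sketch below averages the "Batch N, accuracy = X" lines of a saved test log with awk. The function name mean_batch_accuracy is hypothetical, and the pattern assumes the log format shown above.

```sh
#!/usr/bin/env sh
# mean_batch_accuracy: read a caffe test log on stdin, print the mean batch accuracy.
# Hypothetical helper, not part of Caffe.
mean_batch_accuracy() {
    awk '/Batch [0-9]+, accuracy = / { sum += $NF; n++ }
         END { if (n > 0) printf "%.4f\n", sum / n }'
}

# The five complete batches (45 to 49) visible in the log excerpt above.
printf '%s\n' \
    'I0512 09:18:41.671058  3503 caffe.cpp:275] Batch 45, accuracy = 0.99' \
    'I0512 09:18:41.910468  3503 caffe.cpp:275] Batch 46, accuracy = 1' \
    'I0512 09:18:42.082020  3503 caffe.cpp:275] Batch 47, accuracy = 0.99' \
    'I0512 09:18:42.297307  3503 caffe.cpp:275] Batch 48, accuracy = 0.96' \
    'I0512 09:18:42.576354  3503 caffe.cpp:275] Batch 49, accuracy = 1' \
    | mean_batch_accuracy   # prints 0.9880
```

To process a whole run, the Caffe log (normally written to stderr) can be redirected to a file and fed to the function on stdin.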