Caffe2 by Example: Create a CNN for MNIST

容花呀_AI

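This article explains in detail how to build a convolutional neural network (CNN) from scratch to recognize handwritten digits from the MNIST dataset. The tutorial covers the whole process of data preparation, model construction, training, and testing, implemented in Python with Caffe2.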

1 MNIST: Create a CNN from Scratch

This is a translation of the official MNIST CNN tutorial:

https://caffe2.ai/docs/tutorial-MNIST.html

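This tutorial builds a small convolutional neural network (CNN) that can recognize handwriting. To train and test the CNN, we use handwritten images from the MNIST dataset: a collection of 60,000 images of 500 different people's handwriting that is used to train the CNN. A separate set of 10,000 test images (different from the training images) is used to measure the accuracy of the resulting CNN.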

(1) First, let's import the modules.

%matplotlib inline
from matplotlib import pyplot
import numpy as np
import os
import shutil  # shutil: high-level operations on files, directories, and archives
import caffe2.python.predictor.predictor_exporter as pe

from caffe2.python import core, model_helper, net_drawer, workspace, visualize, brew

# If you would like to see some really detailed initializations,
# you can change --caffe2_log_level=0 to --caffe2_log_level=-1
core.GlobalInit(['caffe2', '--caffe2_log_level=0'])
print("Necessities imported!")

# This section preps your image and test set in a lmdb database
def DownloadResource(url, path):
    '''Downloads resources from s3 by url and unzips them to the provided path'''
    import requests, zipfile, io
    print("Downloading... {} to {}".format(url, path))
    r = requests.get(url, stream=True)  # fetch the resource at the URL
    # r.content holds the raw bytes of the zip file; io.BytesIO wraps them in an
    # in-memory file object that zipfile.ZipFile() can open (the Python 2 only
    # StringIO module is not available in Python 3).
    z = zipfile.ZipFile(io.BytesIO(r.content))
    z.extractall(path)  # extract every member into path, keeping the directory structure stored in the archive
    print("Completed download and extraction.")
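
The download helper relies on opening a zip archive straight from bytes held in memory. A minimal sketch of that pattern, using only the standard library and no network access, might look like the following; the archive contents and the member name `mnist-demo/readme.txt` are made up for illustration.

```python
import io
import os
import tempfile
import zipfile

# Build a small zip archive entirely in memory. These bytes stand in for
# the body of an HTTP response (r.content in DownloadResource).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("mnist-demo/readme.txt", "hello")  # hypothetical member
zip_bytes = buffer.getvalue()

# Same pattern as the helper: wrap the bytes, open them as a zip, extract.
target_dir = tempfile.mkdtemp()
with zipfile.ZipFile(io.BytesIO(zip_bytes)) as z:
    z.extractall(target_dir)

# The member is recreated under target_dir with its stored directory layout.
extracted_path = os.path.join(target_dir, "mnist-demo", "readme.txt")
with open(extracted_path) as f:
    extracted_text = f.read()
```

Because the archive never touches disk before extraction, the whole download is held in memory, which is acceptable for a dataset of this size.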


# os.path.expanduser('~') returns the user's home directory ($HOME); os.path.join() joins the path components
current_folder = os.path.join(os.path.expanduser('~'), 'caffe2_notebooks')
data_folder = os.path.join(current_folder, 'tutorial_data', 'mnist')
root_folder = os.path.join(current_folder, 'tutorial_files', 'tutorial_mnist')
db_missing = False

if not os.path.exists(data_folder):
    os.makedirs(data_folder)   # os.makedirs() creates the directory, including missing parent directories
    print("Your data folder was not found!! This was generated: {}".format(data_folder))

# Look for existing database: lmdb
if os.path.exists(os.path.join(data_folder,"mnist-train-nchw-lmdb")):
    print("lmdb train db found!")
else:
    db_missing = True

if os.path.exists(os.path.join(data_folder,"mnist-test-nchw-lmdb")):
    print("lmdb test db found!")
else:
    db_missing = True

# attempt the download of the db if either was missing
if db_missing:
    print("one or both of the MNIST lmdb dbs not found!!")
    db_url = "http://download.caffe2.ai/databases/mnist-lmdb.zip"
    try:
        DownloadResource(db_url, data_folder)
    except Exception as ex:
        print("Failed to download dataset. Please download it manually from {}".format(db_url))
        print("Unzip it and place the two database folders here: {}".format(data_folder))
        raise ex

if os.path.exists(root_folder):
    print("Looks like you ran this before, so we need to cleanup those old files...")
    shutil.rmtree(root_folder)  # shutil.rmtree() deletes the directory tree recursively

os.makedirs(root_folder)
workspace.ResetWorkspace(root_folder)  # the workspace holds all the data; ResetWorkspace() clears its current contents

print("training data folder:" + data_folder)
print("workspace root folder:" + root_folder)

Output:

lmdb train db found!
lmdb test db found!
Looks like you ran this before, so we need to cleanup those old files...
training data folder:/Users/aaronmarkham/caffe2_notebooks/tutorial_data/mnist
workspace root folder:/Users/aaronmarkham/caffe2_notebooks/tutorial_files/tutorial_mnist

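We use the ModelHelper class to represent our main model, and the brew module together with Operators to build it. The brew module provides a set of wrapper functions that automatically separate parameter initialization and the actual computation into two networks. Under the hood, a ModelHelper object holds two underlying nets, param_init_net and net, which keep a record of the initialization network and the main network respectively.
For modularity, we split the model into several parts: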

(1) The data input part (AddInput function)
(2) The main computation part (AddLeNetModel function)
(3) The training part - adding gradient operators, update, etc. (AddTrainingOperators function)
(4) The bookkeeping part, where we just print out statistics for inspection. (AddBookkeepingOperators function)
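
To make the split between param_init_net and net more concrete, here is a toy sketch in plain Python. It is not Caffe2 code: `ToyModelHelper` and `toy_fc` are hypothetical stand-ins that only record operator descriptions as strings, and the fill names are labels chosen for illustration.

```python
class ToyModelHelper:
    """Hypothetical stand-in for ModelHelper: it only records op descriptions."""

    def __init__(self, name):
        self.name = name
        self.param_init_net = []  # ops that create and fill the parameters
        self.net = []             # ops that perform the actual computation
        self.params = []          # names of the learnable parameters


def toy_fc(model, blob_in, blob_out):
    """Hypothetical brew-style wrapper: one call adds ops to both nets."""
    weight, bias = blob_out + "_w", blob_out + "_b"
    # Initialization ops go to param_init_net, which is run only once.
    model.param_init_net.append("Fill -> " + weight)
    model.param_init_net.append("Fill -> " + bias)
    model.params.extend([weight, bias])
    # The computation op goes to net, which is run on every iteration.
    model.net.append("FC({}, {}, {}) -> {}".format(blob_in, weight, bias, blob_out))
    return blob_out


toy_model = ToyModelHelper("toy")
toy_fc(toy_model, "data", "fc1")
```

A single wrapper call therefore leaves two initialization entries in one list and one computation entry in the other, which mirrors why the tutorial later runs the init net once and the main net repeatedly.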

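AddInput loads the data from a DB. The MNIST data is stored as pixel values, so after batching this gives us data with shape (batch_size, num_channels, width, height), in this case [batch_size, 1, 28, 28] with data type uint8, and a label with shape [batch_size] and data type int.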

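Since we are going to do floating-point computation, we cast the data to the float data type. For better numerical stability, instead of representing the data in the [0, 255] range, we scale it down to [0, 1]. Note that this operator computes in place, because we do not need the pre-scaled data. Also, when computing the backward pass, we do not need gradients for the input data. StopGradient does exactly that: in the forward pass it does nothing, and in the backward pass all it does is tell the gradient generator "the gradient does not need to pass through me".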

def AddInput(model, batch_size, db, db_type):
    # load the data from the db
    data_uint8, label = model.TensorProtosDBInput(
        [], ["data_uint8", "label"], batch_size=batch_size,
        db=db, db_type=db_type)
    # cast the data to float
    data = model.Cast(data_uint8, "data", to=core.DataType.FLOAT)
    # scale data from [0,255] down to [0,1]
    data = model.Scale(data, data, scale=float(1./256))
    # don't need the gradient for the backward pass
    data = model.StopGradient(data, data)
    return data, label
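
The cast-and-scale step can be checked with a few lines of plain Python. This is only a sketch of the arithmetic, not of the Caffe2 operators; `cast_and_scale` is a hypothetical helper name.

```python
SCALE = 1.0 / 256  # same factor as the Scale operator in AddInput


def cast_and_scale(pixels_uint8):
    """Cast uint8 pixel values to float and multiply them by 1/256."""
    return [float(p) * SCALE for p in pixels_uint8]


# The darkest, a middle, and the brightest pixel value.
scaled = cast_and_scale([0, 128, 255])
```

With a factor of 1/256 the brightest pixel maps to 255/256, so the scaled values stay just below 1.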

def AddLeNetModel(model, data):
    '''
    This part is the standard LeNet model: from data to the softmax prediction.

    For each convolutional layer we specify dim_in - number of input channels
    and dim_out - number of output channels. Also each Conv and MaxPool layer changes the
    image size. For example, a kernel of size 5 reduces each side of an image by 4.

    A MaxPool layer with kernel and stride sizes both equal to 2 divides
    each side in half.
    '''
    # Image size: 28 x 28 -> 24 x 24
    conv1 = brew.conv(model, data, 'conv1', dim_in=1, dim_out=20, kernel=5)
    # Image size: 24 x 24 -> 12 x 12
    pool1 = brew.max_pool(model, conv1, 'pool1', kernel=2, stride=2)
    # Image size: 12 x 12 -> 8 x 8
    conv2 = brew.conv(model, pool1, 'conv2', dim_in=20, dim_out=100, kernel=5)
    # Image size: 8 x 8 -> 4 x 4
    pool2 = brew.max_pool(model, conv2, 'pool2', kernel=2, stride=2)
    # 100 * 4 * 4 stands for dim_out from previous layer multiplied by the image size
    fc3 = brew.fc(model, pool2, 'fc3', dim_in=100 * 4 * 4, dim_out=500)

    #fc3 = brew.relu(model, fc3, fc3)
    relu = brew.relu(model, fc3, fc3)
    pred = brew.fc(model, relu, 'pred', 500, 10)
    softmax = brew.softmax(model, pred, 'softmax')
    return softmax
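
The image sizes noted in the comments of AddLeNetModel follow from simple arithmetic. The sketch below assumes no padding and a convolution stride of 1, which matches the sizes stated in those comments; `conv_out` and `pool_out` are hypothetical helper names.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Output side length of a convolution over a square image."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride):
    """Output side length of a pooling layer over a square image."""
    return (size - kernel) // stride + 1


sizes = [28]                                          # input image side
sizes.append(conv_out(sizes[-1], kernel=5))           # conv1
sizes.append(pool_out(sizes[-1], kernel=2, stride=2)) # pool1
sizes.append(conv_out(sizes[-1], kernel=5))           # conv2
sizes.append(pool_out(sizes[-1], kernel=2, stride=2)) # pool2

# fc3 sees conv2's output channels times the final image area.
fc3_dim_in = 100 * sizes[-1] * sizes[-1]
```

The resulting chain is 28, 24, 12, 8, 4, so the flattened input to fc3 has 100 * 4 * 4 = 1600 values.
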

def AddAccuracy(model, softmax, label):
    """Adds an accuracy op to the model"""
    accuracy = brew.accuracy(model, [softmax, label], "accuracy")
    return accuracy
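
As a rough picture of what an accuracy value means for one batch, the sketch below counts how often the highest-probability class equals the label. It is a plain-Python illustration under the assumption of top-1 accuracy, with made-up probabilities; `top1_accuracy` is a hypothetical helper, not the Caffe2 operator.

```python
def top1_accuracy(softmax_rows, labels):
    """Fraction of rows whose largest probability sits at the label index."""
    correct = 0
    for row, label in zip(softmax_rows, labels):
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return correct / float(len(labels))


# Hypothetical batch of 4 samples over 3 classes.
batch = [
    [0.1, 0.7, 0.2],  # predicts class 1
    [0.8, 0.1, 0.1],  # predicts class 0
    [0.3, 0.3, 0.4],  # predicts class 2
    [0.2, 0.5, 0.3],  # predicts class 1
]
labels = [1, 0, 1, 1]
batch_accuracy = top1_accuracy(batch, labels)
```

Three of the four predictions match their labels, so the batch accuracy is 0.75.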

def AddTrainingOperators(model, softmax, label):
    """Adds training operators to the model."""
    # LabelCrossEntropy computes the cross entropy between the predictions and the labels
    xent = model.LabelCrossEntropy([softmax, label], 'xent')
    # compute the expected loss: AveragedLoss takes the cross entropy and returns its average
    loss = model.AveragedLoss(xent, "loss")
    # track the accuracy of the model by calling AddAccuracy
    AddAccuracy(model, softmax, label)
    # use the average loss we just computed to add gradient operators to the model
    model.AddGradientOperators([loss])
    # do a simple stochastic gradient descent; ITER is the iteration counter
    ITER = brew.iter(model, "iter")
    # set the learning rate schedule
    LR = model.LearningRate(
        ITER, "LR", base_lr=-0.1, policy="step", stepsize=1, gamma=0.999 )
    # ONE is a constant value that is used in the gradient update. We only need
    # to create it once, so it is explicitly placed in param_init_net.
    ONE = model.param_init_net.ConstantFill([], "ONE", shape=[1], value=1.0)
    # Now, for each parameter, we do the gradient updates.
    for param in model.params:
        # Note how we get the gradient of each parameter - ModelHelper keeps
        # track of that.
        param_grad = model.param_to_grad[param]
        # The update is a simple weighted sum: param = param + param_grad * LR
        model.WeightedSum([param, ONE, param_grad, LR], param)
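
The arithmetic behind the training operators can be sketched in plain Python. This is an illustration, not the Caffe2 implementation: the helper names are hypothetical, the sample numbers are made up, and the "step" schedule formula base_lr * gamma^(iter // stepsize) is an assumption about how that policy behaves.

```python
import math


def label_cross_entropy(softmax_rows, labels):
    """Per-sample loss: negative log of the probability given to the true label."""
    return [-math.log(row[label]) for row, label in zip(softmax_rows, labels)]


def averaged_loss(values):
    """Mean of the per-sample losses."""
    return sum(values) / float(len(values))


def step_learning_rate(iteration, base_lr=-0.1, stepsize=1, gamma=0.999):
    """Assumed 'step' policy: decay by gamma every stepsize iterations."""
    return base_lr * gamma ** (iteration // stepsize)


def weighted_sum_update(param, grad, lr, one=1.0):
    """param = one * param + lr * grad, element by element."""
    return [one * p + lr * g for p, g in zip(param, grad)]


xent = label_cross_entropy([[0.5, 0.5], [0.25, 0.75]], [0, 1])
mean_loss = averaged_loss(xent)
lr0 = step_learning_rate(0)
lr2 = step_learning_rate(2)
updated = weighted_sum_update([1.0, -2.0], [0.5, -1.0], lr0)
```

Because the update adds `lr * grad` to the parameter, the learning rate has to be negative for the step to move against the gradient, which is why base_lr is -0.1 in AddTrainingOperators.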

def AddBookkeepingOperators(model):
    """This adds a few bookkeeping operators that we can inspect later.

    These operators do not affect the training procedure: they only collect
    statistics and print them to file or to logs.
    """
    # Print basically prints out the content of the blob. to_file=1 routes the
    # printed output to a file. The file is going to be stored under
    #     root_folder/[blob name]
    model.Print('accuracy', [], to_file=1)
    model.Print('loss', [], to_file=1)
    # Summarizes the parameters. Different from Print, Summarize gives some
    # statistics of the parameter, such as mean, std, min and max.
    # If we really wanted to be verbose, we could summarize every blob that
    # the model produces; that is probably not a good idea, because it takes
    # time and summarization does not come for free. For this demo, we only
    # show how to summarize the parameters and their gradients.
    for param in model.params:
        model.Summarize(param, [], to_file=1)
        model.Summarize(model.param_to_grad[param], [], to_file=1)
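
For a sense of what such a summary contains, the sketch below computes the four statistics named above for a short list of values. It uses the standard library rather than Caffe2, `summarize` is a hypothetical helper, and the choice of the population standard deviation is an assumption; the operator may normalize differently.

```python
import statistics


def summarize(values):
    """Return min, max, mean, and (population) standard deviation."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "std": statistics.pstdev(values),
    }


# Hypothetical parameter values.
stats = summarize([1.0, 2.0, 3.0, 4.0])
```

Four numbers per blob are far cheaper to log than the full tensor, which is what makes this practical for every parameter and gradient.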

arg_scope = {"order": "NCHW"}  # use the NCHW storage order for the MNIST data
train_model = model_helper.ModelHelper(name="mnist_train", arg_scope=arg_scope)
data, label = AddInput(
    train_model, batch_size=64,
    db=os.path.join(data_folder, 'mnist-train-nchw-lmdb'),
    db_type='lmdb')
softmax = AddLeNetModel(train_model, data)
AddTrainingOperators(train_model, softmax, label)
AddBookkeepingOperators(train_model)

# Testing model. We will set the batch size to 100, so that the testing
# pass is 100 iterations (10,000 images in total).
# For the testing model, we need the data input part, the main LeNetModel
# part, and an accuracy part. Note that init_params is set False because
# we will be using the parameters obtained from the train model.
test_model = model_helper.ModelHelper(
    name="mnist_test", arg_scope=arg_scope, init_params=False)
data, label = AddInput(
    test_model, batch_size=100,
    db=os.path.join(data_folder, 'mnist-test-nchw-lmdb'),
    db_type='lmdb')
softmax = AddLeNetModel(test_model, data)
AddAccuracy(test_model, softmax, label)

# Deployment model. We simply need the main LeNetModel part.
deploy_model = model_helper.ModelHelper(
    name="mnist_deploy", arg_scope=arg_scope, init_params=False)
AddLeNetModel(deploy_model, "data")
# You may wonder what happens with the param_init_net part of the deploy_model.
# No, we will not use them, since during deployment time we will not randomly
# initialize the parameters, but load the parameters from the db.

# Use the simple graph visualization tool that Caffe2 provides
from IPython import display
graph = net_drawer.GetPydotGraph(train_model.net.Proto().op, "mnist", rankdir="LR")
display.Image(graph.create_png(), width=800)
graph = net_drawer.GetPydotGraphMinimal(
    train_model.net.Proto().op, "mnist", rankdir="LR", minimal_dependency=True)
display.Image(graph.create_png(), width=800)
# show part of the serialized protocol buffer of the training model's param_init_net
print(str(train_model.param_init_net.Proto())[:400] + '\n...')

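# Dump all the protocol buffers to disk so that you can inspect them easily.
# As you may have noticed, these protocol buffers look a lot like old-style Caffe network definitions.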
with open(os.path.join(root_folder, "train_net.pbtxt"), 'w') as fid:
    fid.write(str(train_model.net.Proto()))
with open(os.path.join(root_folder, "train_init_net.pbtxt"), 'w') as fid:
    fid.write(str(train_model.param_init_net.Proto()))
with open(os.path.join(root_folder, "test_net.pbtxt"), 'w') as fid:
    fid.write(str(test_model.net.Proto()))
with open(os.path.join(root_folder, "test_init_net.pbtxt"), 'w') as fid:
    fid.write(str(test_model.param_init_net.Proto()))
with open(os.path.join(root_folder, "deploy_net.pbtxt"), 'w') as fid:
    fid.write(str(deploy_model.net.Proto()))
print("Protocol buffers files have been created in your root folder: " + root_folder)

# Drive all the computation from Python.
# The parameter initialization network only needs to be run once.
workspace.RunNetOnce(train_model.param_init_net)
# Create the network. Since we will run the main network many times, we first create it,
# which puts the actual network generated from the protobuf into the workspace.
workspace.CreateNet(train_model.net, overwrite=True)
# Set the number of iterations to 200 and create two numpy arrays that record
# the accuracy and loss of each iteration.
total_iters = 200
accuracy = np.zeros(total_iters)
loss = np.zeros(total_iters)
# Now, we will manually run the network for 200 iterations.
# Each iteration calls workspace.RunNet on the training net, then reads the accuracy
# and loss with workspace.FetchBlob('accuracy') and workspace.FetchBlob('loss').
for i in range(total_iters):
    workspace.RunNet(train_model.net)
    accuracy[i] = workspace.FetchBlob('accuracy')  # fetch the accuracy of this iteration
    loss[i] = workspace.FetchBlob('loss')  # fetch the loss of this iteration
# plot the results
pyplot.plot(loss, 'b')
pyplot.plot(accuracy, 'r')
pyplot.legend(('Loss', 'Accuracy'), loc='upper right')
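
It can help to see how much of the training set these 200 iterations touch. The sketch below is simple arithmetic on the numbers used in this tutorial, under the assumption that each iteration reads one fresh batch of 64 images.

```python
batch_size = 64         # batch size passed to AddInput for the training model
total_iters = 200       # number of training iterations run above
train_set_size = 60000  # size of the MNIST training set

# Images consumed by the training loop, and the share of one full pass (epoch).
images_seen = batch_size * total_iters
epochs_covered = images_seen / float(train_set_size)
```

That comes to 12,800 images, or roughly a fifth of one pass over the training data, so the curves plotted here show only the very start of training.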

# Sample some of the data and predictions
# Let's look at some of the data.
pyplot.figure()
data = workspace.FetchBlob('data')  # workspace.FetchBlob() reads the contents of a blob
_ = visualize.NCHW.ShowMultiple(data)
pyplot.figure()
softmax = workspace.FetchBlob('softmax')
_ = pyplot.plot(softmax[0], 'ro')
pyplot.title('Prediction for the first image')
# Convolutions for this mini-batch
pyplot.figure()
conv = workspace.FetchBlob('conv1')
shape = list(conv.shape)
shape[1] = 1
# We can look into any channel. Think of it as a feature model learned.
conv = conv[:,15,:,:].reshape(shape)

_ = visualize.NCHW.ShowMultiple(conv)

# Note that although test_model uses the parameters obtained from train_model,
# test_model.param_init_net must still be run to initialize the input data.
# run a test pass on the test net
workspace.RunNetOnce(test_model.param_init_net)  # run the parameter initialization net once
workspace.CreateNet(test_model.net, overwrite=True)  # create the test net
test_accuracy = np.zeros(100)
for i in range(100):  # run the test net over 100 batches
    workspace.RunNet(test_model.net.Proto().name)
    test_accuracy[i] = workspace.FetchBlob('accuracy')
# plot the results
pyplot.plot(test_accuracy, 'r')
pyplot.title('Accuracy over test batches.')
print('test_accuracy: %f' % test_accuracy.mean())
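
Averaging the per-batch accuracies gives the accuracy over the whole test set only because every batch has the same size. A small sketch with made-up per-batch values illustrates this; `overall_accuracy` is a hypothetical helper.

```python
def overall_accuracy(batch_accuracies, batch_size):
    """Recover the total number of correct predictions, then divide by all samples."""
    correct = sum(int(round(a * batch_size)) for a in batch_accuracies)
    return correct / float(batch_size * len(batch_accuracies))


# Hypothetical accuracies for three batches of 100 images each.
batch_accuracies = [0.98, 1.0, 0.97]
mean_of_batches = sum(batch_accuracies) / len(batch_accuracies)
overall = overall_accuracy(batch_accuracies, batch_size=100)
```

With unequal batch sizes the two numbers would differ, and the per-batch values would need to be weighted by batch size.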

# Save the deploy model to a file together with the trained weights and biases
# construct the model to be exported
# the inputs/outputs of the model are manually specified.
pe_meta = pe.PredictorExportMeta(
    predict_net=deploy_model.net.Proto(),
    parameters=[str(b) for b in deploy_model.params],
    inputs=["data"],
    outputs=["softmax"],
)

# save the model to a file. Use minidb as the file format
pe.save_to_db("minidb", os.path.join(root_folder, "mnist_model.minidb"), pe_meta)
print("The deploy model is saved to: " + root_folder + "/mnist_model.minidb")

# Now we can load the model back and run a prediction to verify that it works.
# we retrieve the last input data out and use it in our prediction test before we scratch the workspace
blob = workspace.FetchBlob("data")
pyplot.figure()
_ = visualize.NCHW.ShowMultiple(blob)

# reset the workspace, to make sure the model is actually loaded
workspace.ResetWorkspace(root_folder)

# verify that all blobs are destroyed.
print("The blobs in the workspace after reset: {}".format(workspace.Blobs()))

# load the predict net
predict_net = pe.prepare_prediction_net(os.path.join(root_folder, "mnist_model.minidb"), "minidb")

# verify that blobs are loaded back
print("The blobs in the workspace after loading the model: {}".format(workspace.Blobs()))

# feed the previously saved data to the loaded model
workspace.FeedBlob("data", blob)  # feed the input data

# predict
workspace.RunNetOnce(predict_net)
softmax = workspace.FetchBlob("softmax")

# the first digit should be predicted correctly
pyplot.figure()
_ = pyplot.plot(softmax[0], 'ro')  # plot the class probabilities of the first image
pyplot.title('Prediction for the first image')
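
The plot shows ten probabilities for the first image; the predicted digit is the index of the largest one. The sketch below illustrates this in plain Python with made-up scores; `softmax_probs` and `predicted_digit` are hypothetical helpers and do not use the Caffe2 workspace.

```python
import math


def softmax_probs(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract the max for stability
    total = sum(exps)
    return [e / total for e in exps]


def predicted_digit(probabilities):
    """Index of the largest probability, i.e. the predicted class."""
    return max(range(len(probabilities)), key=probabilities.__getitem__)


# Hypothetical raw scores for the ten digit classes 0..9.
scores = [0.1, 0.2, 3.0, 0.0, -1.0, 0.5, 0.3, 2.0, 0.1, 0.0]
probs = softmax_probs(scores)
digit = predicted_digit(probs)
```

In the tutorial the same lookup could be applied to `softmax[0]` to compare the prediction with the first image shown above.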