Code Walkthrough: Implementation Details of a Convolutional Neural Network (Full Code Included)

Yang Xiaowu's Algorithm Blog

Article link: https://blog.csdn.net/yangzixuan_0608/article/details/103496243

1. Import the dataset loader and the TensorFlow package

from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf

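2. A first look at the MNIST dataset

This walkthrough uses the MNIST dataset and the TensorFlow 1.x API. To use your own dataset, load it into a pandas DataFrame and convert it to NumPy arrays before passing it to the input functions defined later.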

mnist = input_data.read_data_sets("/tmp/data/", one_hot=False)
print(mnist.train.images.shape)
print(mnist.train.labels.shape)
print(mnist.validation.images.shape)
print(mnist.validation.labels.shape)
print(mnist.test.images.shape)
print(mnist.test.labels.shape)
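The output shows that MNIST is split into three parts: a training set, a validation set, and a test set. The three sets contain different numbers of samples (55,000, 5,000, and 10,000 with this loader), but every sample is an array of 784 elements.
To see what a sample looks like, reshape it into a 28*28 single-channel image of a handwritten digit, then draw it with a plotting tool or OpenCV.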
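
The reshape step can be sketched without any plotting library. The snippet below is a minimal illustration, not the post's own code: the helper names `to_grid` and `render` are hypothetical, and the sample is synthetic rather than a real row of `mnist.train.images`.

```python
# Sketch: turn one flattened 784-element sample into a 28x28 grid and
# render it as ASCII art. The sample here is synthetic (an assumption);
# in the post it would be one row of mnist.train.images.

SIDE = 28  # MNIST images are 28x28 with a single channel


def to_grid(flat, side=SIDE):
    """Split a flat list of side*side pixels into `side` rows."""
    if len(flat) != side * side:
        raise ValueError("expected %d pixels, got %d" % (side * side, len(flat)))
    return [flat[r * side:(r + 1) * side] for r in range(side)]


def render(grid, threshold=0.5):
    """Draw pixels above `threshold` as '#', everything else as '.'."""
    return "\n".join(
        "".join("#" if px > threshold else "." for px in row) for row in grid
    )


# Synthetic sample: a vertical bar in columns 13 and 14 (hypothetical data).
sample = [1.0 if (i % SIDE) in (13, 14) else 0.0 for i in range(SIDE * SIDE)]
grid = to_grid(sample)
print(render(grid))
```

With a real sample, the same row-major layout is what `reshape` to `(28, 28)` would produce.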

3. Set the network hyperparameters

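learning_rate = 0.001
The learning rate, i.e. the step size of each gradient-descent update.

num_steps = 1000
The number of training steps, i.e. how many mini-batch gradient-descent updates to run (the full listing at the end uses 2000).

batch_size = 128
The batch size, i.e. how many samples go into one batch; common choices are 32, 64, and 128.
Try larger values first and then smaller ones: with a larger batch, fewer steps are needed to pass over the dataset once, and with a smaller batch more are needed.

n_hidden_1 = 256
The number of neurons in the first hidden layer. For guidance on choosing it, see my other post on tuning deep learning parameters.

n_hidden_2 = 256
The number of neurons in the second hidden layer, which implies a network of at least three layers. Note that conv_net below uses neither n_hidden_1 nor n_hidden_2, and the full listing at the end omits both.

num_input = 784
The number of features per input sample, i.e. how many elements each sample has.

num_classes = 10
The number of distinct labels (the digits 0-9).

dropout = 0.25
The dropout rate, i.e. the probability of dropping a unit. model_fn in step 5 reads this value, so it must be defined before the model is built.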
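
To make the relationship between batch size, steps, and passes over the data concrete, here is a small arithmetic sketch. It assumes the 55,000-sample training split printed in step 2 and Python 3 division; the variable names `train_samples`, `steps_per_epoch`, and `epochs_covered` are introduced only for illustration.

```python
import math

train_samples = 55000  # size of mnist.train with the loader used above
batch_size = 128
num_steps = 1000

# Steps needed to see every training sample once (the last batch is partial).
steps_per_epoch = math.ceil(train_samples / batch_size)

# Total samples consumed by training, and how many epochs that amounts to.
samples_seen = num_steps * batch_size
epochs_covered = samples_seen / train_samples

print(steps_per_epoch)
print(round(epochs_covered, 2))
```

Halving the batch size roughly doubles the steps per epoch, which is the trade-off described above.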

4. Define the CNN layer structure

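def conv_net(x_dict, n_classes, dropout, reuse, is_training):
    # tf.variable_scope places variables in a named scope, so variables in different
    # scopes may share a name. Its first argument is the scope name; reuse=True makes
    # a second call share the variables created by the first call
    with tf.variable_scope('ConvNet', reuse=reuse):
        # x_dict is a dict of input features
        x = x_dict['images']

        # MNIST samples have shape (num_samples, 784). The conv layers expect input that
        # matches their data_format, (batch, height, width, channels),
        # so reshape each sample to 28*28 with one channel
        # -1 means "infer this dimension", which ends up being the number of samples
        x = tf.reshape(x, shape=[-1, 28, 28, 1])

        # Convolution layer: 2nd argument = 32 filters, 3rd = 5*5 kernel, activation = ReLU
        # Filter counts are usually powers of 2 (32, 64, ...). More filters give more
        # feature maps, but also more parameters to train and slower training
        conv1 = tf.layers.conv2d(x, 32, 5, activation=tf.nn.relu)

        # Max pooling: 2nd argument = 2*2 pooling window, 3rd = stride of 2
        conv1 = tf.layers.max_pooling2d(conv1, 2, 2)

        # Choosing the number of filters:
        # counts are commonly grown in multiples of 16, which tends to suit GPU hardware.
        # Each filter has its own weight matrix w and produces one feature map; more feature
        # maps can capture more features, which often (though not always) helps classification.
        # Because every filter starts from a different initial matrix, the resulting feature maps
        # differ too; each filter can be thought of as learning to extract one kind of feature

        # Convolution Layer with 64 filters and a kernel size of 3
        conv2 = tf.layers.conv2d(conv1, 64, 3, activation=tf.nn.relu)
        # Max Pooling (down-sampling) with strides of 2 and kernel size of 2
        conv2 = tf.layers.max_pooling2d(conv2, 2, 2)

        # Flatten the result to one vector per sample so it can feed the fully connected layers
        fc1 = tf.contrib.layers.flatten(conv2)

        # Fully connected layer: 1st argument is the input, 2nd is the output size,
        # which acts as a dimensionality reduction
        fc1 = tf.layers.dense(fc1, 1024)
        # Dropout; units are only dropped when the training argument is True
        fc1 = tf.layers.dropout(fc1, rate=dropout, training=is_training)

        # Output layer: another fully connected layer whose output size is the number of classes
        out = tf.layers.dense(fc1, n_classes)

    return out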
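
The layer sizes can be checked by hand. The sketch below traces the spatial size through the network and counts trainable parameters, assuming the `tf.layers` defaults that `conv_net` relies on (no padding, i.e. 'valid', and a convolution stride of 1). The helper `valid_out` is a hypothetical name, not part of TensorFlow.

```python
def valid_out(size, window, stride=1):
    """Output length along one axis for an unpadded ('valid') conv or pool."""
    return (size - window) // stride + 1


size = 28
size = valid_out(size, 5)       # conv1: 5x5 kernel, stride 1
size = valid_out(size, 2, 2)    # pool1: 2x2 window, stride 2
size = valid_out(size, 3)       # conv2: 3x3 kernel, stride 1
size = valid_out(size, 2, 2)    # pool2: 2x2 window, stride 2
flat = size * size * 64         # flattened length fed to the dense layer

# Weights plus biases for each layer.
params = {
    "conv1": 5 * 5 * 1 * 32 + 32,
    "conv2": 3 * 3 * 32 * 64 + 64,
    "fc1": flat * 1024 + 1024,
    "out": 1024 * 10 + 10,
}

print(size, flat)
print(params, sum(params.values()))
```

Under these assumptions the spatial size goes 28, 24, 12, 10, 5, so the dense layer maps 1600 values down to 1024, and almost all parameters sit in that one layer.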

5. Define model_fn

def model_fn(features, labels, mode):
    # Build the network twice with shared weights: once for training (dropout on)
    # and once for testing (dropout off)
    logits_train = conv_net(features, num_classes, dropout, reuse=False,
                            is_training=True)
    logits_test = conv_net(features, num_classes, dropout, reuse=True,
                           is_training=False)

    # Define the prediction operations
    # The predicted class is the argmax of the logits; softmax turns them into probabilities
    pred_classes = tf.argmax(logits_test, axis=1)
    pred_probas = tf.nn.softmax(logits_test)

    # If prediction mode, early return
    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode, predictions=pred_classes)

    # Define the loss and the optimizer
    # Loss: mean softmax cross-entropy with integer (sparse) labels
    # Optimizer: Adam
    # Training objective: minimize the loss via optimizer.minimize()
    loss_op = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=logits_train, labels=tf.cast(labels, dtype=tf.int32)))
    optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
    train_op = optimizer.minimize(loss_op,
                                  global_step=tf.train.get_global_step())

    # Compute the model's accuracy
    acc_op = tf.metrics.accuracy(labels=labels, predictions=pred_classes)

    # Return an EstimatorSpec object
    # eval_metric_ops registers the accuracy under the key 'accuracy', so the dict
    # returned by model.evaluate() can be read with that key
    estim_specs = tf.estimator.EstimatorSpec(
        mode=mode,
        predictions=pred_classes,
        loss=loss_op,
        train_op=train_op,
        eval_metric_ops={'accuracy': acc_op})

    return estim_specs
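
As a rough illustration of what the loss and prediction ops compute, here is a plain-Python sketch of softmax, cross-entropy with integer labels, and argmax. It is a simplified stand-in for the TensorFlow ops, not their implementation; the logits and labels are made-up values and the function names are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_softmax_cross_entropy(logits, label):
    """Negative log-probability of the true class, given an integer label."""
    return -math.log(softmax(logits)[label])


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


# Hypothetical batch of two samples with three classes each.
batch_logits = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]
batch_labels = [0, 2]

losses = [sparse_softmax_cross_entropy(l, y)
          for l, y in zip(batch_logits, batch_labels)]
mean_loss = sum(losses) / len(losses)          # counterpart of tf.reduce_mean

preds = [argmax(l) for l in batch_logits]      # counterpart of tf.argmax(axis=1)
accuracy = sum(p == y for p, y in zip(preds, batch_labels)) / len(preds)

print(mean_loss, preds, accuracy)
```

The second sample is misclassified, so it contributes the larger share of the mean loss.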

6. Create the Estimator instance

model = tf.estimator.Estimator(model_fn)

7. Define the input_fn for training

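input_fn = tf.estimator.inputs.numpy_input_fn(
    x={'images': mnist.train.images}, y=mnist.train.labels,
    batch_size=batch_size, num_epochs=None, shuffle=True)
This call builds input in the standard format an Estimator accepts; in effect it assembles the features and labels into batches. Because num_epochs=None repeats the data indefinitely, the length of training is controlled by the steps argument passed to train().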
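
Conceptually, the input function behaves like a generator of (features, labels) batches. The sketch below is a simplified pure-Python analogue under that assumption; `batch_generator` is a hypothetical name, and the real `numpy_input_fn` is queue-based and may differ in details such as how batches fall at epoch boundaries.

```python
import random
from itertools import islice


def batch_generator(features, labels, batch_size,
                    num_epochs=None, shuffle=True, seed=0):
    """Yield ({'images': [...]}, [...]) batches; num_epochs=None repeats forever."""
    rng = random.Random(seed)
    indices = list(range(len(features)))
    epoch = 0
    while num_epochs is None or epoch < num_epochs:
        if shuffle:
            rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            chosen = indices[start:start + batch_size]
            yield ({'images': [features[i] for i in chosen]},
                   [labels[i] for i in chosen])
        epoch += 1


# Toy data: 10 "images" of 3 pixels each, labelled 0..9 (made-up values).
toy_images = [[float(i)] * 3 for i in range(10)]
toy_labels = list(range(10))

# Training-style input: endless and shuffled, so we stop after 5 steps.
train_batches = list(islice(batch_generator(toy_images, toy_labels, 4), 5))
train_sizes = [len(y) for _, y in train_batches]

# Evaluation-style input: one ordered pass over the data.
eval_batches = list(batch_generator(toy_images, toy_labels, 4,
                                    num_epochs=1, shuffle=False))
print(train_sizes, [y for _, y in eval_batches])
```

The endless training stream is why a step count is needed to stop training, while the single ordered pass ends on its own.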

8. Start training the model

Everything defined so far only describes the computation; training is actually triggered at this step.
model.train(input_fn, steps=num_steps)

9. Define the input_fn for evaluation

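input_fn = tf.estimator.inputs.numpy_input_fn(
    x={'images': mnist.test.images}, y=mnist.test.labels,
    batch_size=batch_size, shuffle=False)
The logic is the same as for the training input above, except that it reads the test set and makes a single pass over it without shuffling.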

10. Run the evaluation

e = model.evaluate(input_fn)

11. Print the model's accuracy on the test set

print("Testing Accuracy:", e['accuracy'])

12. Full code

from __future__ import division, print_function, absolute_import
# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=False)

import tensorflow as tf

# Training Parameters
learning_rate = 0.001
num_steps = 2000 # number of mini-batch gradient-descent updates to run
batch_size = 128 # each batch holds 128 samples; one epoch consists of many batches

# Network Parameters
num_input = 784 # MNIST data input (img shape: 28*28)
num_classes = 10 # MNIST total classes (0-9 digits)
dropout = 0.25 # Dropout, probability to drop a unit

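# Define the CNN layer structure
def conv_net(x_dict, n_classes, dropout, reuse, is_training):
    # tf.variable_scope places variables in a named scope, so variables in different
    # scopes may share a name. Its first argument is the scope name
    with tf.variable_scope('ConvNet', reuse=reuse):
        # x_dict is a dict of input features
        x = x_dict['images']

        # MNIST samples have shape (num_samples, 784). The conv layers expect input that
        # matches their data_format, (batch, height, width, channels)
        # MNIST images are grayscale, so every sample has a single channel
        # Reshape each sample to 28*28 with one channel
        # -1 means "infer this dimension", which ends up being the number of samples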
        x = tf.reshape(x, shape=[-1, 28, 28, 1])

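        # Convolution layer: 2nd argument = 32 filters, 3rd = 5*5 kernel, activation = ReLU
        conv1 = tf.layers.conv2d(x, 32, 5, activation=tf.nn.relu)
        # Max pooling: 2nd argument = 2*2 pooling window, 3rd = stride of 2
        conv1 = tf.layers.max_pooling2d(conv1, 2, 2)

        # Choosing the number of filters:
        # counts are commonly grown in multiples of 16, which tends to suit GPU hardware.
        # Each filter has its own weight matrix w and produces one feature map; more feature
        # maps can capture more features, which often (though not always) helps classification.
        # Because every filter starts from a different initial matrix, the resulting feature maps
        # differ too; each filter can be thought of as learning to extract one kind of feature
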
        # Convolution Layer with 64 filters and a kernel size of 3
        conv2 = tf.layers.conv2d(conv1, 64, 3, activation=tf.nn.relu)
        # Max Pooling (down-sampling) with strides of 2 and kernel size of 2
        conv2 = tf.layers.max_pooling2d(conv2, 2, 2)

        # Flatten the result to one vector per sample so it can feed the fully connected layers
        fc1 = tf.contrib.layers.flatten(conv2)

        # Fully connected layer: 1st argument is the input, 2nd is the output size,
        # which acts as a dimensionality reduction
        fc1 = tf.layers.dense(fc1, 1024)
        # Dropout; units are only dropped when the training argument is True
        fc1 = tf.layers.dropout(fc1, rate=dropout, training=is_training)

        # Output layer: another fully connected layer whose output size is the number of classes
        out = tf.layers.dense(fc1, n_classes)

    return out

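# Define the model function following the TF Estimator template
def model_fn(features, labels, mode):
    # Build the network twice with shared weights: once for training (dropout on)
    # and once for testing (dropout off)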
    logits_train = conv_net(features, num_classes, dropout, reuse=False,
                            is_training=True)
    logits_test = conv_net(features, num_classes, dropout, reuse=True,
                           is_training=False)

    # Define the prediction operations
    # The predicted class is the argmax of the logits; softmax turns them into probabilities
    pred_classes = tf.argmax(logits_test, axis=1)
    pred_probas = tf.nn.softmax(logits_test)

    # If prediction mode, early return
    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode, predictions=pred_classes)

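    # Define the loss and the optimizer
    # Loss: mean softmax cross-entropy with integer (sparse) labels
    # Optimizer: Adam
    # Training objective: minimize the loss via optimizer.minimize()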
    loss_op = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=logits_train, labels=tf.cast(labels, dtype=tf.int32)))
    optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
    train_op = optimizer.minimize(loss_op,
                                  global_step=tf.train.get_global_step())

    # Compute the model's accuracy
    acc_op = tf.metrics.accuracy(labels=labels, predictions=pred_classes)

    # Return an EstimatorSpec object
    # eval_metric_ops registers the accuracy under the key 'accuracy', so the dict
    # returned by model.evaluate() can be read with that key
    estim_specs = tf.estimator.EstimatorSpec(
        mode=mode,
        predictions=pred_classes,
        loss=loss_op,
        train_op=train_op,
        eval_metric_ops={'accuracy': acc_op})

    return estim_specs

# Build the Estimator
model = tf.estimator.Estimator(model_fn)

# input_fn supplies training data in the format the Estimator expects
input_fn = tf.estimator.inputs.numpy_input_fn(
    x={'images': mnist.train.images}, y=mnist.train.labels,
    batch_size=batch_size, num_epochs=None, shuffle=True)

# Train the Model
model.train(input_fn, steps=num_steps)
# Evaluate the Model

# Define the input function for evaluating
input_fn = tf.estimator.inputs.numpy_input_fn(
    x={'images': mnist.test.images}, y=mnist.test.labels,
    batch_size=batch_size, shuffle=False)

# Evaluate/test the trained model
e = model.evaluate(input_fn)
print("Testing Accuracy:", e['accuracy'])
# This gives the accuracy. If it is high enough, how do we use the model to predict?
# Estimator has a dedicated method for that: predict(self, input_fn, predict_keys=None, hooks=None, checkpoint_path=None, yield_single_examples=True)

# Predict with the trained model
input_fn = tf.estimator.inputs.numpy_input_fn(
    x={'images': mnist.test.images}, shuffle=False)
predictions = model.predict(input_fn)  # returns a generator that yields one predicted class per sample
for pred_class in predictions:
    print(pred_class)
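
Because predict() yields results lazily, it can help to see how such a generator is consumed. The sketch below uses a hypothetical stand-in, `fake_predict`, with made-up class ids and labels instead of a trained model; it only illustrates the iteration pattern.

```python
from itertools import islice


def fake_predict(classes):
    """Hypothetical stand-in for model.predict: yields one class id at a time."""
    for c in classes:
        yield c


predictions = fake_predict([7, 2, 1, 0, 4, 1, 4])  # made-up predicted classes
labels = [7, 2, 1, 0, 4, 1, 9]                      # made-up ground truth

# Peek at the first five predictions without exhausting the generator.
first_five = list(islice(predictions, 5))
print(first_five)

# A generator is consumed as it is read, so only the rest remains here.
remaining = list(predictions)

# Pair every prediction with its label and list the mismatches.
pairs = list(zip(first_five + remaining, labels))
mismatches = [(i, p, y) for i, (p, y) in enumerate(pairs) if p != y]
print(mismatches)
```

Collecting the results into a list first is the simplest option when the predictions need to be read more than once.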