A Detailed Walkthrough of Predicting MNIST Digits with a Convolutional Neural Network

gldbys
Category: Machine Learning. Tags: deep learning, tensorflow, neural networks, python, convolution
Article link: https://blog.csdn.net/gldbys/article/details/109612170
from __future__ import division,print_function
import tensorflow as tf  # this script uses the TensorFlow 1.x API (placeholders, sessions)
import matplotlib.pyplot as plt
import numpy as np
# Load the MNIST dataset
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data/',one_hot=True)

'''
Set hyperparameters such as batch_size
'''
learning_rate = 0.001
training_iters = 500
batch_size = 128
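'''
display_step: controls the granularity of reporting.
For example, with display_step = 2 the loss is reported once every 2 training iterations (mini-batches).
Unlike a hyperparameter, changing display_step does not change what the model learns.

Put simply, with the value below the results are printed once every 10 training iterations.
'''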
display_step = 10
# Network parameters
n_input = 784  # 28*28 pixels per image
n_classes = 10
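'''
dropout: randomly discards hidden-layer nodes in each iteration, so that more
nodes end up with reasonable values instead of a few nodes dominating.
The value lies between 0 and 1. Note that in this script it is fed to keep_prob,
so it is the probability that a node is kept: with 0.85, each node is
dropped with probability 0.15.
'''
dropout = 0.85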

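# Declare the placeholders
'''
The None in [None,n_input] leaves the size of the first dimension unspecified,
so a batch with any number of samples can be fed in.
'''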
x = tf.placeholder(tf.float32,[None,n_input])
y = tf.placeholder(tf.float32,[None,n_classes])
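'''
keep_prob is a single number: the probability of keeping any given hidden unit.
For example, keep_prob = 0.8 means that each hidden unit is dropped
with probability 0.2.
In this script it is fed the value of dropout (0.85) during training
and 1.0 during evaluation.
'''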
keep_prob = tf.placeholder(tf.float32)

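'''
Define a convolution layer with input x, weights W, bias b and a given stride.
The activation function is ReLU and padding is set to SAME.

strides:
The strides argument gives the step of the sliding window along each dimension
of the input tensor. It is generally required that strides[0] = strides[3] = 1.
What does that mean?
An input tensor normally has four dimensions: [batch, height, width, channels],
i.e. the number of samples (batch_size), the number of rows and columns of each sample,
and the number of channels (an RGB image has 3 channels, a grayscale image has 1).
A 2-D convolution acts mainly on height and width.
strides[0] = 1: the step along the batch dimension is 1, so no sample is skipped
(otherwise there would be no point in passing it in as input).
strides[3] = 1: the step along the channels dimension is 1, so no color channel is skipped.

padding:
There are two padding modes, SAME and VALID.
SAME: when the window runs past the edge of the input (for example a 2*2 kernel
with fewer than 2*2 values left), the missing values are filled with zeros.
With a stride of 1 the feature map keeps its size; in general the output size
is ceil(input_size / stride).
VALID: positions where the window does not fit entirely inside the input are discarded,
so the feature map usually becomes smaller.
'''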
def conv2d(x,W,b,strides=1):
    x = tf.nn.conv2d(x,W,strides=[1,strides,strides,1],padding='SAME')
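    '''
    tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, name=None)
    Apart from name, which sets the name of the operation, five parameters matter here:
    1. input: the images to convolve. It must be a 4-D Tensor of shape
       [batch, in_height, in_width, in_channels], i.e.
       [number of images in a training batch, image height, image width, number of channels],
       with a floating-point dtype such as float32 or float64.
    2. filter: the convolution kernel of the CNN. It must be a Tensor of shape
       [filter_height, filter_width, in_channels, out_channels], i.e.
       [kernel height, kernel width, number of input channels, number of kernels],
       with the same dtype as input.
       Note that its third dimension, in_channels, must equal the fourth dimension of input.
    3. strides: the step along each dimension of the input, a 1-D vector of length 4.
    4. padding: a string, either "SAME" or "VALID"; it selects the padding scheme.
    5. use_cudnn_on_gpu: a bool, whether to use cuDNN acceleration; defaults to true.
    '''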
    x=tf.nn.bias_add(x,b)
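    '''
    tf.nn.bias_add():
    In plain terms, a vector called bias is added to a tensor called value.
    The length of bias must equal the last dimension of value (here, the number of
    output channels). For a matrix this means the vector is added to every row.
    The result has the same shape as value.
    '''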
    return tf.nn.relu(x)
'''
Define a max-pooling layer with input x, a k x k pooling window (ksize),
a stride of k and SAME padding:
'''
def maxpool2d(x,k=2):
    return tf.nn.max_pool(x,ksize=[1,k,k,1],strides=[1,k,k,1],padding='SAME')
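'''
Define the convnet. It consists of two convolution layers (each followed by max pooling),
then a fully connected layer, a dropout layer and finally the output layer.
'''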
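def conv_net(x,weights,biases,dropout):
    #note: `dropout` is passed to tf.nn.dropout as the keep probability
    #reshape the input picture to [batch, height, width, channels]
    x = tf.reshape(x,shape=[-1,28,28,1])
    #first convolution layer, then 2x2 max pooling: 28x28 -> 14x14
    conv1 = conv2d(x,weights['wc1'],biases['bc1'])
    conv1 = maxpool2d(conv1,k=2)
    #second convolution layer, then 2x2 max pooling: 14x14 -> 7x7
    conv2 = conv2d(conv1,weights['wc2'],biases['bc2'])
    conv2 = maxpool2d(conv2,k=2)
    #flatten to [batch, 7*7*64] to match the input size of wd1
    fc1 = tf.reshape(conv2,[-1,weights['wd1'].get_shape().as_list()[0]])
    #fully connected layer with ReLU, followed by dropout
    fc1 = tf.add(tf.matmul(fc1,weights['wd1']),biases['bd1'])
    fc1 = tf.nn.relu(fc1)
    fc1 = tf.nn.dropout(fc1,dropout)
    #output layer: returns logits (softmax is applied inside the loss)
    out = tf.add(tf.matmul(fc1,weights['out']),biases['out'])
    return out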
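'''
Define the weights and biases of the network layers.
The first conv layer has a 5×5 kernel,
1 input channel and 32 output channels.
The second conv layer has a 5×5 kernel,
32 input channels and 64 output channels.
The fully connected layer has 7×7×64 inputs and 1024 outputs,
and the output layer has 1024 inputs and 10 outputs, one per digit class.
All weights and biases are initialized from the random_normal distribution:
'''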

weights = {
    'wc1':tf.Variable(tf.random_normal([5,5,1,32])),
    'wc2':tf.Variable(tf.random_normal([5,5,32,64])),
    'wd1':tf.Variable(tf.random_normal([7*7*64,1024])),
    'out':tf.Variable(tf.random_normal([1024,n_classes]))
}

biases = {
    'bc1':tf.Variable(tf.random_normal([32])),
    'bc2':tf.Variable(tf.random_normal([64])),
    'bd1':tf.Variable(tf.random_normal([1024])),
    'out':tf.Variable(tf.random_normal([n_classes]))
}

'''
Build the convnet from the given weights and biases.
Define the loss function based on softmax_cross_entropy_with_logits,
and minimize it with the Adam optimizer. Then define the accuracy computation:
'''
pred = conv_net(x,weights,biases,keep_prob)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred,labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
correct_prediction = tf.equal(tf.argmax(pred,1),tf.argmax(y,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction,tf.float32))
init = tf.global_variables_initializer()

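'''
Launch the computation graph and iterate training_iters times,
feeding batch_size samples per iteration for optimization.
Note that training uses mnist.train, the training split of the mnist dataset.
Every display_step iterations, the current loss and accuracy on the
mini-batch are computed.
At the same time, accuracy is computed on the MNIST test set
(mnist.test, 10,000 images), this time without dropout.
'''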
train_loss = []
train_acc = []
test_acc = []

with tf.Session() as sess:
    sess.run(init)
    step = 1
    while step<=training_iters:
        batch_x,batch_y = mnist.train.next_batch(batch_size)
        sess.run(optimizer,feed_dict={x:batch_x,y:batch_y,keep_prob:dropout})
        if step % display_step==0:
            loss_train,acc_train = sess.run([cost,accuracy],feed_dict={x:batch_x,y:batch_y,keep_prob:1.})

            print("Iter "+str(step)+", Minibatch Loss = "+"{:.2f}".format(loss_train)+", Training Accuracy = "+"{:.2f}".format(acc_train))
            #calculate accuracy on the full MNIST test set
            #Note that in this case no dropout is applied (keep_prob = 1)
            acc_test = sess.run(accuracy,feed_dict={x:mnist.test.images,y:mnist.test.labels,keep_prob:1.})
            print("Testing Accuracy: "+"{:.2f}".format(acc_test))
            train_loss.append(loss_train)
            train_acc.append(acc_train)
            test_acc.append(acc_test)
        step+=1



    #metrics are recorded at steps display_step, 2*display_step, ..., training_iters
    eval_indices = range(display_step,training_iters+1,display_step)
    plt.plot(eval_indices,train_loss,'k-')
    plt.title('Softmax Loss Per iteration')
    plt.xlabel('Iteration')
    plt.ylabel('Softmax Loss')
    plt.show()
    plt.plot(eval_indices,train_acc,'k-',label='Train Set Accuracy')
    plt.plot(eval_indices,test_acc,'r--',label='Test Set Accuracy')
    plt.xlabel('Iteration')
    plt.ylabel('Accuracy')
    plt.legend(loc='lower right')
    plt.show()
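
The padding notes and the 7*7*64 input size of wd1 in the script above can be checked with a little shape arithmetic. The following is a minimal sketch in plain Python, independent of TensorFlow; the helper names conv_output_size and flattened_size are hypothetical and are not part of the original script. It assumes the usual output-size rules for the two padding modes: ceil(in / stride) for SAME and ceil((in - kernel + 1) / stride) for VALID.

```python
import math


def conv_output_size(in_size, kernel, stride, padding):
    """Output length along one spatial axis for a conv or pooling window.

    Hypothetical helper for illustration only.
    """
    if padding == "SAME":
        # Zeros are padded so that every stride position produces an output.
        return math.ceil(in_size / stride)
    if padding == "VALID":
        # No padding: only windows that fit entirely inside the input count.
        return math.ceil((in_size - kernel + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")


def flattened_size(in_size=28, conv_kernel=5, pool_k=2, n_layers=2,
                   channels=64, conv_padding="SAME"):
    """Trace the spatial size through n_layers of (conv stride 1, max-pool k x k)."""
    size = in_size
    for _ in range(n_layers):
        # Convolution with stride 1, as in conv2d(..., strides=1).
        size = conv_output_size(size, conv_kernel, 1, conv_padding)
        # Max pooling with window k and stride k, SAME padding, as in maxpool2d.
        size = conv_output_size(size, pool_k, pool_k, "SAME")
    # Return the final side length and the flattened feature count.
    return size, size * size * channels


if __name__ == "__main__":
    print(flattened_size())                      # SAME convolutions
    print(flattened_size(conv_padding="VALID"))  # VALID convolutions, for comparison
```

With SAME padding the side length goes 28 -> 14 -> 7, which is where the 7*7*64 rows of wd1 come from. If the convolutions used VALID padding instead, the flattened size would differ and wd1 would have to be resized to match.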