ML (3): Improving MNIST recognition accuracy with a convolutional neural network

Article link: https://blog.csdn.net/zy714816/article/details/82684681

1. Convolutional neural networks

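A convolutional neural network (CNN) is a feedforward neural network whose artificial neurons respond to the units within a limited surrounding region (their receptive field), and it performs very well on large-scale image processing. It includes convolutional layers and pooling layers.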

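I found a well-written article online that explains how convolutional neural networks work, so I'll just link it here for anyone who wants to read it.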

https://www.cnblogs.com/charlotte77/p/7759802.html

2. Implementing a convolutional neural network in code to improve MNIST recognition accuracy

Import the required modules, load the MNIST dataset, and add two placeholders.

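# Note: this code uses the TensorFlow 1.x API (placeholders and sessions).
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Load MNIST with one-hot encoded labels.
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

# x holds flattened 28*28 images; y_ holds the one-hot labels for 10 classes.
x = tf.placeholder(tf.float32, [None, 784])
y_ = tf.placeholder(tf.float32, [None, 10])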

An important step here is to reshape each image back into a 28*28 shape:

x_image=tf.reshape(x,[-1,28,28,1])

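The four values of the shape are the batch size, height, width, and number of channels. The -1 means the size of the first dimension is inferred automatically from x, and the final 1 means the images are grayscale; a color (RGB) image would use 3.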
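To make the role of -1 concrete, here is a minimal pure-Python sketch of how a single unknown dimension can be resolved from the total number of elements. The helper `infer_shape` is a hypothetical name used only for illustration; it is not a TensorFlow function, and TensorFlow performs this inference internally.

```python
def infer_shape(total_elements, shape):
    """Resolve a single -1 in `shape` so the product equals total_elements.

    Hypothetical helper for illustration only (not part of TensorFlow).
    """
    if shape.count(-1) > 1:
        raise ValueError("at most one dimension may be -1")
    known = 1
    for dim in shape:
        if dim != -1:
            known *= dim
    if -1 not in shape:
        if known != total_elements:
            raise ValueError("shape does not match element count")
        return list(shape)
    if known == 0 or total_elements % known != 0:
        raise ValueError("cannot infer the -1 dimension")
    return [total_elements // known if dim == -1 else dim for dim in shape]


# A batch of 50 flattened MNIST images holds 50 * 784 values.
batch_shape = infer_shape(50 * 784, [-1, 28, 28, 1])
print(batch_shape)  # [50, 28, 28, 1]
```

The same target shape works for any batch size, which is why the placeholder can leave its first dimension as None.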

Since I'm going to write two convolutional layers, I'll define some helper functions next.

def weight_variable(shape):
    initial=tf.truncated_normal(shape,stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial=tf.constant(0.1,shape=shape)
    return tf.Variable(initial)

def conv2d(x,W):
    return tf.nn.conv2d(x,W,strides=[1,1,1,1],padding='SAME')

def max_pool_2x2(x):
    return tf.nn.max_pool(x,ksize=[1,2,2,1],strides=[1,2,2,1],padding='SAME')

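The first function creates weights: it returns a variable of the given shape, initialized from a truncated normal distribution with a standard deviation of 0.1.

The second function creates biases: it returns a variable of the given shape, initialized to 0.1.

The third function is the convolution. x is the input image batch, a 4-D tensor. W is the convolution kernel, also called a filter; it is also a 4-D tensor, with dimensions [height, width, input channels, output channels]. strides sets the stride: the first and last entries are fixed at 1, the second is the stride along the height (vertical) and the third is the stride along the width (horizontal). The last argument is the padding mode; the common choices are 'SAME', used here, and 'VALID'.

The fourth function defines max pooling. ksize is the size of the pooling window: the first and last entries are fixed at 1, and the middle two give the number of rows and columns. With strides=[1,2,2,1] the window moves two pixels at a time, so each pooling step halves the height and width.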
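To see what these two operations do to a feature map, here is a simplified pure-Python sketch. It assumes a single image, a single channel, an odd-sized kernel and stride 1, so it is only an illustration of the idea and not how TensorFlow implements it. The names `conv2d_same_naive` and `max_pool_2x2_naive` are made up for this example.

```python
def conv2d_same_naive(image, kernel):
    """Stride-1 'SAME' convolution of one 2-D channel with one odd-sized kernel.

    Like tf.nn.conv2d, this is a cross-correlation: the kernel is not flipped.
    Positions outside the image are treated as zeros (zero padding).
    """
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    pad_h, pad_w = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total = 0.0
            for i in range(kh):
                for j in range(kw):
                    rr, cc = r + i - pad_h, c + j - pad_w
                    if 0 <= rr < h and 0 <= cc < w:
                        total += image[rr][cc] * kernel[i][j]
            out[r][c] = total
    return out


def max_pool_2x2_naive(feature_map):
    """2x2 max pooling with stride 2; assumes even height and width."""
    h, w = len(feature_map), len(feature_map[0])
    return [
        [max(feature_map[r][c], feature_map[r][c + 1],
             feature_map[r + 1][c], feature_map[r + 1][c + 1])
         for c in range(0, w, 2)]
        for r in range(0, h, 2)
    ]


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
identity = [[0, 0, 0],
            [0, 1, 0],
            [0, 0, 0]]

conv_out = conv2d_same_naive(image, identity)  # same 4x4 size as the input
pooled = max_pool_2x2_naive(conv_out)          # halved to 2x2
print(pooled)  # [[6.0, 8.0], [14.0, 16.0]]
```

With 'SAME' padding and stride 1 the output keeps the input's height and width, and the 2x2 pooling with stride 2 halves both.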

W_conv1=weight_variable([5,5,1,32])
b_conv1=bias_variable([32])
h_conv1=tf.nn.relu(conv2d(x_image,W_conv1)+b_conv1)
h_pool1=max_pool_2x2(h_conv1)

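This is the first convolutional layer. It defines 32 kernels, so there are 32 biases. The layer performs the convolution, applies ReLU as the activation function, and then pools.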
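As a rough illustration of the values that weight_variable and bias_variable produce for this layer, the sketch below draws from a normal distribution and re-draws any sample more than two standard deviations from the mean, which is the documented behaviour of tf.truncated_normal. The function `truncated_normal_samples` is a hypothetical stand-in written with the standard library, not the TensorFlow implementation.

```python
import random


def truncated_normal_samples(count, stddev=0.1, mean=0.0, seed=None):
    """Draw `count` normal samples, re-drawing any that fall more than
    two standard deviations from the mean (illustrative stand-in)."""
    rng = random.Random(seed)
    values = []
    while len(values) < count:
        sample = rng.gauss(mean, stddev)
        if abs(sample - mean) <= 2 * stddev:
            values.append(sample)
    return values


# First conv layer: 5*5*1*32 weights and 32 biases.
weights = truncated_normal_samples(5 * 5 * 1 * 32, stddev=0.1, seed=0)
biases = [0.1] * 32
print(len(weights), len(biases))  # 800 32
```

With stddev=0.1, every weight ends up in the range [-0.2, 0.2], so no unit starts out with an extreme weight.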

W_conv2=weight_variable([5,5,32,64])
b_conv2=bias_variable([64])
h_conv2=tf.nn.relu(conv2d(h_pool1,W_conv2)+b_conv2)
h_pool2=max_pool_2x2(h_conv2)

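This is the second convolutional layer. Its input is the output of the previous layer, so the number of input channels is 32; everything else works the same way.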

W_fc1=weight_variable([7*7*64,1024])
b_fc1=bias_variable([1024])
h_pool2_flat=tf.reshape(h_pool2,[-1,7*7*64])
h_fc1=tf.nn.relu(tf.matmul(h_pool2_flat,W_fc1)+b_fc1)

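This is the fully connected layer. A 28*28 image is still 28*28 after the 'SAME' convolution and becomes 14*14 after pooling; the second 'SAME' convolution keeps it at 14*14, and the second pooling reduces it to 7*7. The second layer has 64 kernels, so the input to this layer has 7*7*64 values per image. The -1 in the reshape stands for the batch dimension, which is not known in advance and is inferred at run time.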
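The size bookkeeping above can be checked with a few lines of Python. This sketch relies on the rule that a 'SAME' operation produces an output of size ceil(input / stride); the helper name `same_output_size` is chosen only for this example.

```python
import math


def same_output_size(size, stride):
    """Spatial size after a 'SAME' op: ceil(size / stride)."""
    return math.ceil(size / stride)


size = 28
trace = [size]
for _ in range(2):                      # two conv + pool stages
    size = same_output_size(size, 1)    # conv, stride 1: size unchanged
    trace.append(size)
    size = same_output_size(size, 2)    # 2x2 pool, stride 2: size halved
    trace.append(size)

flat_features = size * size * 64        # 64 channels after the second conv
print(trace, flat_features)             # [28, 28, 14, 14, 7] 3136
```

The 3136 here is the 7*7*64 used as the first dimension of W_fc1.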

keep_prob=tf.placeholder(tf.float32)
h_fc1_drop=tf.nn.dropout(h_fc1,keep_prob)

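Dropout means that, during the training of a deep network, units are temporarily removed from the network with a certain probability. Note that this is temporary! Here keep_prob is the probability that a unit is kept. Dropout is a powerful tool for preventing overfitting and improving results in CNNs.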
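The idea can be sketched in plain Python as follows. This is a simplified illustration and not TensorFlow's implementation: each value is kept with probability keep_prob and the survivors are scaled by 1 / keep_prob, which matches the scaling tf.nn.dropout applies. The function name `dropout_naive` and the sample activations are made up for the example.

```python
import random


def dropout_naive(values, keep_prob, rng):
    """Inverted dropout: keep each value with probability keep_prob and
    scale the survivors by 1 / keep_prob so the expected sum is unchanged."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)
activations = [1.0, 2.0, 3.0, 4.0]

train_out = dropout_naive(activations, 0.5, rng)  # each entry is 0.0 or doubled
eval_out = dropout_naive(activations, 1.0, rng)   # keep_prob=1.0 keeps everything
print(train_out, eval_out)
```

Because the kept values are scaled up during training, no extra rescaling is needed at evaluation time; feeding keep_prob=1.0 simply passes the activations through.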

W_fc2=weight_variable([1024,10])
b_fc2=bias_variable([10])
y_conv=tf.matmul(h_fc1_drop,W_fc2)+b_fc2

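Finally, define one more fully connected layer that maps the 1024-dimensional output of the previous layer to 10 dimensions, one for each of the 10 classes.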
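With all four layers defined, it can be useful to count the trainable parameters implied by the shapes passed to weight_variable and bias_variable. The short sketch below just does that arithmetic; the dictionary layout is an illustrative choice.

```python
# (number of weights, number of biases) for each layer, taken from the
# shapes used in the weight_variable / bias_variable calls.
layers = {
    "conv1": (5 * 5 * 1 * 32, 32),
    "conv2": (5 * 5 * 32 * 64, 64),
    "fc1": (7 * 7 * 64 * 1024, 1024),
    "fc2": (1024 * 10, 10),
}
counts = {name: w + b for name, (w, b) in layers.items()}
total = sum(counts.values())
print(counts)
print(total)  # 3274634
```

Almost all of the roughly 3.27 million parameters sit in the first fully connected layer, which is also the layer that dropout is applied to.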

cross_entropy=tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_,logits=y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

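The error is computed with the cross-entropy function (applied to the softmax of the logits), AdamOptimizer is used to reduce it, and the last two lines compute the accuracy.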
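To show what the loss and accuracy lines compute, here is a small pure-Python sketch on a made-up batch of three examples with three classes. The function names and the numbers are illustrative assumptions; the real code operates on tensors with 10 classes.

```python
import math


def softmax_cross_entropy(logits, one_hot):
    """Cross-entropy between softmax(logits) and a one-hot label."""
    shift = max(logits)  # subtract the max for numerical stability
    log_sum = shift + math.log(sum(math.exp(z - shift) for z in logits))
    return -sum(t * (z - log_sum) for z, t in zip(logits, one_hot))


def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


# Made-up logits and one-hot labels.
batch_logits = [[2.0, 0.5, 0.1], [0.2, 0.1, 3.0], [0.3, 0.2, 1.5]]
batch_labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

losses = [softmax_cross_entropy(z, t) for z, t in zip(batch_logits, batch_labels)]
mean_loss = sum(losses) / len(losses)        # like tf.reduce_mean over the batch
hits = [argmax(z) == argmax(t) for z, t in zip(batch_logits, batch_labels)]
accuracy_value = sum(hits) / len(hits)       # fraction of correct predictions
print(mean_loss, accuracy_value)
```

One useful sanity check: when all 10 logits are equal, the loss is log(10), about 2.30, which is roughly what an untrained 10-class classifier should report.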

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(2001):
        # Train the model on a mini-batch of 50 images
        batch_xs, batch_ys = mnist.train.next_batch(50)
        sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys, keep_prob: 0.5})

        # Every 100 iterations, evaluate on the full test set with dropout disabled
        if i % 100 == 0:
            test_acc = sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0})
            print('Iter ' + str(i) + ', Testing Accuary= ' + str(test_acc))

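Create a session, initialize all variables, and feed in 50 images per step. keep_prob is 0.5 during training, meaning each unit in the dropout layer is dropped with probability 50% at every step; keep_prob is 1.0 when evaluating. The test accuracy is printed every 100 iterations.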

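My CPU is pretty weak and I have no GPU available, so I only ran 2000 iterations, and even that was slow. The final accuracy got above 97%, so I think a few thousand more iterations should comfortably reach 98%; whether it can get to 99% is down to luck.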

Iter 0, Testing Accuary= 0.1715
Iter 100, Testing Accuary= 0.8452
Iter 200, Testing Accuary= 0.9048
Iter 300, Testing Accuary= 0.9318
Iter 400, Testing Accuary= 0.9411
Iter 500, Testing Accuary= 0.9484
Iter 600, Testing Accuary= 0.95
Iter 700, Testing Accuary= 0.9555
Iter 800, Testing Accuary= 0.9623
Iter 900, Testing Accuary= 0.9599
Iter 1000, Testing Accuary= 0.9639
Iter 1100, Testing Accuary= 0.9669
Iter 1200, Testing Accuary= 0.9692
Iter 1300, Testing Accuary= 0.9695
Iter 1400, Testing Accuary= 0.971
Iter 1500, Testing Accuary= 0.9729
Iter 1600, Testing Accuary= 0.9723
Iter 1700, Testing Accuary= 0.9751
Iter 1800, Testing Accuary= 0.9766
Iter 1900, Testing Accuary= 0.9779
Iter 2000, Testing Accuary= 0.9773