The Hello World of TensorFlow: MNIST Handwritten Digit Recognition

zjx1_1

Tags: tensorflow, MNIST, neural networks, artificial intelligence

Article link: https://blog.csdn.net/weixin_37813036/article/details/89425369

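MNIST is generally regarded as the "Hello World" project of machine learning and deep learning, and almost every tutorial starts with it. The reason is that this simple project covers the overall machine learning workflow, so working through it helps build an understanding of how deep learning proceeds in general. MNIST is a small library of handwritten digit images. It contains 70000 images in total: 60000 training images and 10000 test images (the TensorFlow loader used below holds out 5000 of the training images as a validation set by default). Each image is 28 * 28 pixels. Its official site is http://yann.lecun.com/exdb/mnist/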

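1. Environment setup
Tools: 64-bit Windows 10, Python 3.5 + PyCharm, TensorFlow and related libraries
Installing Python 3.5: download from the official site: https://www.python.org/
IDE PyCharm: download from the official site: http://www.jetbrains.com/pycharm/download/#section=windows
Search online for the detailed installation steps.
TensorFlow comes in two builds, CPU and GPU. The GPU build requires a machine with CUDA and cuDNN support. This article uses the CPU build. Just run pip install --upgrade https://storage.googleapis.com/tensorflow/windows/cpu/tensorflow-0.12.0-cp35-cp35m-win_amd64.whl in a terminal.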
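2. Checking that TensorFlow installed correctly
I made plenty of mistakes while installing TensorFlow and spent a long time on it. After installing, I ran a small script and got a pile of errors and warnings. In the end the causes were: the TensorFlow build did not match the Python version, and 64-bit Python was required but I had installed the 32-bit one. Enough talk. Run the following code to test whether TensorFlow installed successfully: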

import tensorflow as tf

def hello():
    # Build a constant op and evaluate it in a session
    hello = tf.constant('Hello,TensorFlow')
    sess = tf.Session()
    print(sess.run(hello))

if __name__=='__main__':
    hello()

Output:
[Figure: screenshot of the program output]
If your output matches the above, congratulations!
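The 32-bit versus 64-bit mismatch mentioned in this section can be checked before installing anything. The following is a minimal sketch using only the standard library; the helper names python_bitness and describe_interpreter are made up for illustration, and the comparison against (3, 5, 64) simply reflects the cp35 / win_amd64 tags in the wheel file name from section 1.

```python
import struct
import sys

def python_bitness():
    """Return 32 or 64 depending on the pointer size of the running interpreter."""
    return struct.calcsize("P") * 8

def describe_interpreter():
    """Return (major, minor, bitness) for the running interpreter."""
    return sys.version_info[0], sys.version_info[1], python_bitness()

if __name__ == "__main__":
    major, minor, bits = describe_interpreter()
    print("Python %d.%d, %d-bit" % (major, minor, bits))
    # The wheel in section 1 is tagged cp35 and win_amd64,
    # i.e. it targets 64-bit CPython 3.5 on Windows.
    if (major, minor, bits) != (3, 5, 64):
        print("This interpreter does not match the cp35 win_amd64 wheel.")
```

If the script reports a 32-bit interpreter, installing the 64-bit Python build first is likely to save some of the trouble described above.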
3. Loading the dataset

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data as i_d
mnist = i_d.read_data_sets('MNIST_data', one_hot=True)

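The code above downloads the MNIST dataset directly. If the download fails because of network problems or a firewall, download the files from the official site http://yann.lecun.com/exdb/mnist/ and place them in the MNIST_data folder. After downloading and decompressing each one, you will find that they are just 4 binary files.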
Extracting MNIST_data\train-images-idx3-ubyte.gz
Extracting MNIST_data\train-labels-idx1-ubyte.gz
Extracting MNIST_data\t10k-images-idx3-ubyte.gz
Extracting MNIST_data\t10k-labels-idx1-ubyte.gz
They are, respectively: 60000 training images, 60000 training labels, 10000 test images, and 10000 test labels. The following program displays one of the images:

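import matplotlib.pyplot as plt
from tensorflow.examples.tutorials.mnist import input_data as i_d

def show():
    mnist = i_d.read_data_sets('MNIST_data', one_hot=True)
    # Take the second image
    image = mnist.train.images[1, :]
    # Restore the flat 784-value vector to 28*28 resolution
    image = image.reshape(28, 28)
    # Print the corresponding label
    print(mnist.train.labels[1])
    plt.figure()
    plt.imshow(image)
    plt.show()

if __name__=='__main__':
    show()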

[Figure: the displayed digit image]
Label: [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
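Because the data was loaded with one_hot=True, the label is a 10-element vector with a single 1 at the index of the digit, so the label above stands for the digit 3. Here is a small sketch of converting in both directions in plain Python; the function names are only illustrative and are not part of TensorFlow.

```python
def one_hot_to_digit(label):
    """Return the index of the largest entry in a one-hot label vector."""
    return max(range(len(label)), key=lambda i: label[i])

def digit_to_one_hot(digit, num_classes=10):
    """Build a one-hot vector with a 1.0 at position `digit`."""
    return [1.0 if i == digit else 0.0 for i in range(num_classes)]

label = [0., 0., 0., 1., 0., 0., 0., 0., 0., 0.]
print(one_hot_to_digit(label))   # 3
print(digit_to_one_hot(3))       # same vector as `label`
```

This is the same idea as the tf.argmax(y_, 1) call used later when computing accuracy.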
4. Overview of the network architecture

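The model is a CNN, that is, a convolutional neural network. As the name suggests, convolution is the core of this network. Convolutional layers extract features from the image. The convolution operation slides a small matrix (the convolution kernel) across the input matrix; at each position it multiplies the kernel element by element with the patch it covers and sums the products. The result is one feature of the input matrix. This is hard to follow in words, so the process is illustrated with pictures below.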

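Suppose the input matrix looks like this:
[Figure: input matrix]
Choose a convolution kernel like the following:

Sliding the kernel across the input matrix and taking the element-wise multiply-and-sum at each position yields the feature that this kernel extracts from the input matrix.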

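In CNN terminology, the 3x3 matrix is called a "filter", a "kernel", or a "feature detector". The matrix obtained by sliding the filter over the image and computing the dot product at each position is called the "Convolved Feature", "Activation Map", or "Feature Map". Remember that the filter acts as a feature detector on the original input image.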
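The sliding multiply-and-sum described above can be sketched in a few lines of plain Python. The 5x5 input and 3x3 kernel below are illustrative values that I picked, not necessarily the ones from the original pictures. Like tf.nn.conv2d, this sketch does not flip the kernel, and it uses no padding, so a 5x5 input gives a 3x3 feature map.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Element-wise multiply the kernel with the patch it covers, then sum
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map

# Illustrative 5x5 input matrix and 3x3 kernel
image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]

for row in conv2d_valid(image, kernel):
    print(row)
# [4, 3, 4]
# [2, 4, 3]
# [2, 3, 4]
```

Each number in the output is the response of the kernel at one position, which is exactly one entry of the feature map.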
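An example of convolution applied to a real picture:
[Figure: a picture processed with several different convolution kernels]
As you can see, processing the original image with different kernels yields different feature images: some kernels extract edge information, some extract color information, some extract light and dark features, and so on. Different kernels are like different photo filters, each sensitive to different features.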
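To make the "different kernels respond to different features" point concrete, here is a toy sketch with two hand-written edge kernels. The tiny image and both kernels are hypothetical values chosen for illustration; in the real network the kernel values are learned during training rather than written by hand.

```python
def slide_kernel(image, kernel):
    """Multiply-and-sum the kernel over every position where it fits."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(len(image[0]) - kw + 1)
        ]
        for r in range(len(image) - kh + 1)
    ]

# Hypothetical 3x6 image: dark on the left, bright on the right
image = [[0, 0, 0, 1, 1, 1]] * 3

# Responds to left-to-right brightness changes
vertical_edge = [[-1, 0, 1]] * 3
# Responds to top-to-bottom brightness changes
horizontal_edge = [[-1, -1, -1], [0, 0, 0], [1, 1, 1]]

print(slide_kernel(image, vertical_edge))    # [[0, 3, 3, 0]]
print(slide_kernel(image, horizontal_edge))  # [[0, 0, 0, 0]]
```

The first kernel fires where the image changes from dark to bright, while the second stays at zero because nothing changes from top to bottom.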
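This program uses two convolution + pooling layers followed by two fully connected layers. The first convolutional layer uses 32 kernels of size 5x5x1 with stride 1, "SAME" padding (the output keeps the same spatial size as the input), and ReLU activation, followed by a 2x2 max pooling layer. The second convolutional layer uses 64 kernels of size 5x5x32 with stride 1, "SAME" padding, and ReLU activation, followed by a 2x2 max pooling layer. The first fully connected layer uses 1024 neurons, again with ReLU activation. The second fully connected layer uses 10 neurons with softmax activation and produces the output. These sizes are the ones used in the training code in the next section.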
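It helps to trace the tensor sizes through the layers by hand. The sketch below uses the weight shapes from the training code in the next section (5x5 kernels with 32 and 64 output channels) and assumes the usual rule that "SAME" padding gives an output size of ceil(input / stride). The helper names are my own, not TensorFlow functions.

```python
import math

def same_output_size(size, stride):
    """Spatial output size under 'SAME' padding: ceil(size / stride)."""
    return int(math.ceil(size / float(stride)))

def param_count(weight_shape):
    """Number of weights plus one bias per output unit (last dimension)."""
    weights = 1
    for dim in weight_shape:
        weights *= dim
    return weights + weight_shape[-1]

size = 28
size = same_output_size(size, 1)   # conv1, stride 1 -> 28
size = same_output_size(size, 2)   # pool1, stride 2 -> 14
size = same_output_size(size, 1)   # conv2, stride 1 -> 14
size = same_output_size(size, 2)   # pool2, stride 2 -> 7
flat = size * size * 64            # 7 * 7 * 64 = 3136

# Weight shapes of conv1, conv2, fc1, fc2
shapes = [[5, 5, 1, 32], [5, 5, 32, 64], [flat, 1024], [1024, 10]]
total = sum(param_count(s) for s in shapes)
print(size, flat, total)  # 7 3136 3274634
```

This is where the 7 * 7 * 64 in the first fully connected layer comes from, and it also shows that almost all of the roughly 3.27 million parameters sit in that one layer.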
6. Training function

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data as i_d
def mnist():
    # Read the data
    mnist = i_d.read_data_sets('MNIST_data', one_hot=True)
    # Placeholders sized to the sample inputs and outputs
    x = tf.placeholder(tf.float32, [None, 784])
    y_ = tf.placeholder(tf.float32, [None, 10])
    x_image = tf.reshape(x, [-1, 28, 28, 1])
    def weight_variable(shape):
        initial = tf.truncated_normal(shape, stddev=0.1)
        return tf.Variable(initial)

    def bias_variable(shape):
        initial = tf.constant(0.1, shape=shape)
        return tf.Variable(initial)

    # Convolution helper (saves typing in the layers below)
    def conv2d(x, W):
        return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

    # Pooling helper
    def max_pool_2x2(x):
        return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

    W_conv1 = weight_variable([5, 5, 1, 32])
    b_conv1 = bias_variable([32])
    h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
    h_pool1 = max_pool_2x2(h_conv1)

    W_conv2 = weight_variable([5, 5, 32, 64])
    b_conv2 = bias_variable([64])
    h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
    h_pool2 = max_pool_2x2(h_conv2)

    W_fc1 = weight_variable([7 * 7 * 64, 1024])
    b_fc1 = bias_variable([1024])
    h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

    keep_prob = tf.placeholder("float")
    h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

    W_fc2 = weight_variable([1024, 10])
    b_fc2 = bias_variable([10])
    y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
    cross_entropy = -tf.reduce_sum(y_ * tf.log(y_conv))
    train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
    correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

    saver = tf.train.Saver()  # Define the saver

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for i in range(20000):
            batch = mnist.train.next_batch(50)
            if i % 100 == 0:
                train_accuracy = accuracy.eval(feed_dict={
                    x: batch[0], y_: batch[1], keep_prob: 1.0})
                print('step %d, training accuracy %g' % (i, train_accuracy))
            train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
        saver.save(sess, 'E:/python ese/cat_dog/1/model.ckpt')  # Where the model is stored

        print('test accuracy %g' % accuracy.eval(feed_dict={
            x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))

if __name__=='__main__':
    mnist()

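Output:
step 0, training accuracy 0.04
step 100, training accuracy 0.86
step 200, training accuracy 0.88
...
step 19800, training accuracy 1
step 19900, training accuracy 1
test accuracy 0.9921
Program run time: 5223.802503932795
The test accuracy reached 99.21%, and the whole training run took about 87 minutes (5223.8 seconds).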
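The loss in the training code is the cross-entropy between the one-hot label and the softmax output, -sum(y_ * log(y_conv)). Here is a small plain-Python sketch of that computation. Note that the eps clipping and the max-subtraction inside softmax are my additions for numerical safety; the training code above does not do either, so log(0) is possible there if a predicted probability underflows to zero.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(one_hot, probs, eps=1e-10):
    """-sum(t * log(p)); eps is an added guard against log(0)."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))

label = [0., 0., 0., 1., 0., 0., 0., 0., 0., 0.]

# An untrained network that guesses uniformly over the 10 digits
uniform = softmax([0.0] * 10)
print(cross_entropy(label, uniform))    # log(10), about 2.3026

# A confident and correct prediction gives a much smaller loss
confident = softmax([0, 0, 0, 10, 0, 0, 0, 0, 0, 0])
print(cross_entropy(label, confident))
```

A loss near log(10) therefore means "no better than guessing", which is a handy reference point when watching the early training steps.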
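7. Test function
Write the ten digits 0~9 by hand, or draw them in Photoshop or the Paint tool that ships with Windows, and use them as input, as shown below:
[Figure: handwritten digit images]
My handwriting is not bad, haha.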

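import tensorflow as tf
import matplotlib.pyplot as plt
from PIL import Image

def test():
    def imageprepare():
        im= Image.open('E:/python ese/cat_dog/num/9_1.PNG')  # Path of the image to read
        plt.figure()
        plt.imshow(im)  # Show the image to be recognized
        #print(im.size)
        im=im.resize((28,28))
        im = im.convert('L')
        #print(im.size)
        plt.ion()
        tv = list(im.getdata())
        # Invert and scale to [0, 1] so that dark strokes on a white
        # background become bright strokes on black, as in MNIST
        tva = [(255 - x) * 1.0 / 255.0 for x in tv]
        return tva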

    result = imageprepare()
    x = tf.placeholder(tf.float32, [None, 784])

    y_ = tf.placeholder(tf.float32, [None, 10])

    def weight_variable(shape):
        initial = tf.truncated_normal(shape, stddev=0.1)
        return tf.Variable(initial)

    def bias_variable(shape):
        initial = tf.constant(0.1, shape=shape)
        return tf.Variable(initial)

    def conv2d(x, W):
        return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

    def max_pool_2x2(x):
        return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

    W_conv1 = weight_variable([5, 5, 1, 32])
    b_conv1 = bias_variable([32])

    x_image = tf.reshape(x, [-1, 28, 28, 1])

    h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
    h_pool1 = max_pool_2x2(h_conv1)

    W_conv2 = weight_variable([5, 5, 32, 64])
    b_conv2 = bias_variable([64])

    h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
    h_pool2 = max_pool_2x2(h_conv2)

    W_fc1 = weight_variable([7 * 7 * 64, 1024])
    b_fc1 = bias_variable([1024])

    h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

    keep_prob = tf.placeholder("float")
    h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

    W_fc2 = weight_variable([1024, 10])
    b_fc2 = bias_variable([10])

    y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

    cross_entropy = -tf.reduce_sum(y_ * tf.log(y_conv))
    train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
    correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

    saver = tf.train.Saver()

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        saver.restore(sess, "E:/python ese/cat_dog/1/model.ckpt")  # Load the model; parameters must stay consistent with the training code

        prediction = tf.argmax(y_conv, 1)
        predint = prediction.eval(feed_dict={x: [result], keep_prob: 1.0}, session=sess)

        print('Recognition result:')
        print(predint[0])
    plt.ioff()
    plt.show()

if __name__=='__main__':
    test()

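[Figure: recognition output]
Recognizing the digit 0

Recognizing the digit 1

Recognizing the digit 6
My "artificially unintelligent" kid can finally recognize numbers, hahaha.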
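One detail in imageprepare() is worth checking in isolation: MNIST stores digits as bright strokes on a black background, while a digit drawn in Paint is usually dark on white, which is why the pixels are inverted before scaling. The sketch below reproduces that formula and the final argmax step on made-up numbers; the function names and sample values are only for illustration.

```python
def invert_and_scale(pixels):
    """Map 8-bit grayscale values to [0, 1], turning white (255) into 0.0
    and black (0) into 1.0, the same formula as in imageprepare()."""
    return [(255 - p) / 255.0 for p in pixels]

def predicted_digit(probabilities):
    """Index of the largest probability, like tf.argmax(y_conv, 1) for one image."""
    return max(range(len(probabilities)), key=lambda i: probabilities[i])

# White background, black stroke, and a dark gray pixel
print(invert_and_scale([255, 0, 51]))

# Made-up softmax output for one image
fake_output = [0.01, 0.02, 0.01, 0.01, 0.01, 0.01, 0.90, 0.01, 0.01, 0.01]
print(predicted_digit(fake_output))  # 6
```

If recognition on your own images is poor, printing a few of the scaled pixel values is a quick way to confirm that the background really ends up near 0.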
