MNIST Handwritten Digit Recognition with TensorFlow

Latest recommended article published on 2024-07-22 11:48:26

小罗卜

Categories: TensorFlow programming, AI, MNIST handwritten digit recognition. Tags: deep learning

Article link: https://blog.csdn.net/luoxueqian/article/details/108230776

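The MNIST dataset consists of handwritten images of the ten digits 0-9. It contains 60,000 training images and 10,000 test images.

1. In TensorFlow, the dataset can be downloaded from Python by calling download.py. The code is as follows:

import tensorflow as tf  # TensorFlow 1.x API, used by all later steps
from tensorflow.examples.tutorials.mnist import input_data

# Load MNIST with one-hot labels
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

If the data is not yet present in MNIST_data, it is downloaded into that folder automatically.

one_hot selects one-hot encoding: each label becomes a vector that holds a single 1 at the position of its class and 0 everywhere else. For the 10 digits 0-9, each one-hot code therefore has 10 positions: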

0	1 0 0 0 0 0 0 0 0 0
1	0 1 0 0 0 0 0 0 0 0
2	0 0 1 0 0 0 0 0 0 0
3	0 0 0 1 0 0 0 0 0 0
4	0 0 0 0 1 0 0 0 0 0
5	0 0 0 0 0 1 0 0 0 0
6	0 0 0 0 0 0 1 0 0 0
7	0 0 0 0 0 0 0 1 0 0
8	0 0 0 0 0 0 0 0 1 0
9	0 0 0 0 0 0 0 0 0 1
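
The table above can be reproduced in a few lines of plain Python. This is only a minimal sketch: the helper names `one_hot` and `decode` are hypothetical and are not part of TensorFlow or of the original code.

```python
def one_hot(label, num_classes=10):
    """Return a list with a 1 at index `label` and 0 everywhere else."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    return [1 if i == label else 0 for i in range(num_classes)]


def decode(vector):
    """Recover the class index: the position of the largest entry."""
    return max(range(len(vector)), key=lambda i: vector[i])


# Print one row per digit, in the same layout as the table
for digit in range(10):
    print(digit, one_hot(digit))
```

Decoding with the position of the largest entry is the same idea that tf.argmax applies later when accuracy is computed.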

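2. Define helper functions for convolution

def weight_variable(shape):
    # Initialize weights from a truncated normal distribution (stddev 0.1)
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)


def bias_variable(shape):
    # Initialize biases to a small positive constant
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)


def conv2d(x, W):
    # Stride-1 convolution; 'SAME' padding keeps the spatial size unchanged
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')


def max_pool_2x2(x):
    # 2x2 max pooling with stride 2 halves the height and width
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')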

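3. Build the first convolutional layer

Before building the convolution, the input images must be reshaped into the format that the convolution expects:

x = tf.placeholder(tf.float32, [None, 784])  # input placeholder (assumed: flattened 28x28 images)
x_image = tf.reshape(x, [-1, 28, 28, 1])

The first convolutional layer:

W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

In W_conv1, the kernel size is 5*5, the number of input channels is 1, and there are 32 kernels. With 'SAME' padding and stride 1, the convolution output h_conv1 is therefore 28*28 with 32 channels. The 2*2 max pooling then halves the height and width, so h_pool1 is 14*14 with 32 channels.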
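
The size arithmetic can be checked without TensorFlow. The sketch below assumes the usual rule for 'SAME' padding, where the output size is the input size divided by the stride, rounded up. The helper name `same_output_size` is hypothetical.

```python
import math


def same_output_size(size, stride):
    """Spatial output size of a 'SAME'-padded convolution or pooling op."""
    return math.ceil(size / stride)


size = 28                               # MNIST images are 28x28
after_conv1 = same_output_size(size, 1)         # convolution, stride 1
after_pool1 = same_output_size(after_conv1, 2)  # 2x2 max pooling, stride 2
print(after_conv1, after_pool1)
```

Under this rule the convolution keeps the 28*28 size, and only the pooling step reduces it.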

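4. Build the second convolutional layer

W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

Here the previous layer outputs h_pool1 with shape [-1, 14, 14, 32], so the kernels must have 32 input channels. After this layer's pooling, h_pool2 has shape [-1, 7, 7, 64].

The number of kernels equals the number of output channels, and the number of input channels equals the number of channels in each kernel.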
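
The channel rule can be written as a small consistency check over the kernel shapes [height, width, in_channels, out_channels] used in this article. This is an illustrative sketch, and the helper names `trace_channels` and `conv_params` are hypothetical.

```python
def conv_params(kernel_shape):
    """Weights plus biases for a kernel of shape [h, w, in_ch, out_ch]."""
    h, w, in_ch, out_ch = kernel_shape
    return h * w * in_ch * out_ch + out_ch


def trace_channels(input_channels, kernel_shapes):
    """Check the channel rule layer by layer; return the final channel count."""
    channels = input_channels
    for shape in kernel_shapes:
        in_ch, out_ch = shape[2], shape[3]
        if in_ch != channels:
            raise ValueError("kernel expects %d channels, got %d" % (in_ch, channels))
        channels = out_ch  # the number of kernels becomes the next input depth
    return channels


kernels = [[5, 5, 1, 32], [5, 5, 32, 64]]
final_channels = trace_channels(1, kernels)
print(final_channels, [conv_params(k) for k in kernels])
```

A mismatched kernel, for example [5, 5, 16, 64] as the second layer, would raise an error here, just as the corresponding TensorFlow graph would fail to build.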

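5. Build the fully connected layer

W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
h_reshape = tf.reshape(h_pool2, [-1, 7 * 7 * 64])  # flatten each feature map for matmul
h_fc1 = tf.nn.relu(tf.matmul(h_reshape, W_fc1) + b_fc1)
keep_prob = tf.placeholder(tf.float32)  # keep probability, fed at run time
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

tf.placeholder creates a placeholder, loosely comparable to a variable declaration in C: it reserves a typed input whose value is supplied later through feed_dict.

This fully connected layer maps the flattened output of the previous convolutional layer (7 * 7 * 64 = 3136 values per image) to a 1024-dimensional vector.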
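
The code above also applies tf.nn.dropout with keep_prob. A rough pure-Python sketch of the idea: each value is kept with probability keep_prob and scaled by 1/keep_prob, otherwise it is set to 0. The function below is a hypothetical illustration, not TensorFlow's implementation.

```python
import random


def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob and scale it by 1/keep_prob."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)            # fixed seed so the run is repeatable
activations = [1.0] * 10000
dropped = dropout(activations, 0.5, rng)
mean_after = sum(dropped) / len(dropped)
print(mean_after)                 # stays near the original mean of 1.0
```

Scaling the kept values is what lets the same network be evaluated with keep_prob set to 1.0, without any further rescaling.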

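6. Build the second fully connected layer

Add one more fully connected layer that maps h_fc1_drop to a 10-dimensional vector, with one score for each of the 10 classes.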

W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2

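The resulting y_conv holds the logits that are fed into softmax.

7. softmax & cross_entropy

The class scores should be passed through softmax to obtain probabilities, after which the cross-entropy is computed. TensorFlow provides tf.nn.softmax_cross_entropy_with_logits, which performs both steps at once.

y_ = tf.placeholder(tf.float32, [None, 10])  # label placeholder (assumed: one-hot ground truth)
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv))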
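
To make the two steps concrete, here is a minimal pure-Python sketch of softmax followed by cross-entropy for a single sample. The function names and the example logits are illustrative assumptions. Subtracting the maximum logit is a common trick for numerical stability.

```python
import math


def softmax(logits):
    """Convert logits to probabilities, shifting by the max for stability."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(labels, logits):
    """Cross-entropy between a one-hot label vector and softmax(logits)."""
    probs = softmax(logits)
    return -sum(y * math.log(p) for y, p in zip(labels, probs) if y > 0)


logits = [2.0, 1.0, 0.1]   # hypothetical scores for 3 classes
labels = [1, 0, 0]         # one-hot: the true class is class 0
probs = softmax(logits)
loss = cross_entropy(labels, logits)
print(probs, loss)
```

With a one-hot label, the loss reduces to the negative log of the probability assigned to the true class, so it shrinks as that probability approaches 1.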

8. Define backpropagation

train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

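Backpropagation repeatedly adjusts the parameters w and b so that cross_entropy becomes as small as possible, that is, so that the loss is minimized. Here the updates are computed by the Adam optimizer with a learning rate of 1e-4.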
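
The idea of iteratively reducing a loss can be illustrated on a toy problem. Note that the sketch below uses plain gradient descent rather than Adam, to keep the update rule to one line. The loss function and the learning rate are assumptions chosen purely for illustration.

```python
def loss(w):
    """Toy loss with its minimum at w = 3."""
    return (w - 3.0) ** 2


def grad(w):
    """Derivative of the toy loss with respect to w."""
    return 2.0 * (w - 3.0)


w = 0.0
learning_rate = 0.1
for step in range(100):
    w -= learning_rate * grad(w)   # move against the gradient

print(w, loss(w))
```

In the real network, TensorFlow computes the gradients of cross_entropy with respect to every weight and bias automatically, and the optimizer applies the updates.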

9. Compute accuracy

predict_correct = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(predict_correct, tf.float32))

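tf.equal() tests element-wise whether its two arguments are equal, and tf.argmax() returns the index of the largest value along the given axis. For example, given 5 samples where the predicted and true labels match for samples 1, 2 and 3 but not for samples 4 and 5, tf.equal() yields [True, True, True, False, False].

tf.cast() then converts the True or False value of each of the N predicted samples to float32, giving [1.0, 1.0, 1.0, 0.0, 0.0].

tf.reduce_mean() computes the mean of all elements in the array, which is the prediction accuracy of the model.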
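
The same three steps (argmax and compare, cast to float, mean) can be sketched in plain Python for the 5-sample example. The prediction values are made up for illustration, and the helper names are hypothetical.

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(predictions, labels):
    """Fraction of samples whose predicted class matches the label."""
    correct = [argmax(p) == argmax(y) for p, y in zip(predictions, labels)]
    as_float = [1.0 if c else 0.0 for c in correct]   # the tf.cast step
    return sum(as_float) / len(as_float)              # the tf.reduce_mean step


# Samples 1-3 are predicted correctly, samples 4 and 5 are not
predictions = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4], [0.2, 0.8]]
labels = [[0, 1], [1, 0], [0, 1], [0, 1], [1, 0]]
result = accuracy(predictions, labels)
print(result)
```

Three correct predictions out of five give an accuracy of 3/5.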

10. Training

TensorFlow requires a session to be created in order to run training.

sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())  # initialize all variables

for train_step_num in range(30000):
    # Draw 100 images from the training set per step (batch size = 100)
    batch_x, batch_y = mnist.train.next_batch(100)
    train_step.run(feed_dict={x: batch_x, y_: batch_y, keep_prob: 0.5})
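
The loop above draws a batch of 100 training samples per step. The general idea of mini-batching can be sketched as follows. This is a simplified, hypothetical illustration of shuffling and slicing a dataset, not how the MNIST helper is actually implemented.

```python
import random


def batches(samples, batch_size, rng):
    """Yield shuffled mini-batches that together cover every sample once."""
    order = list(range(len(samples)))
    rng.shuffle(order)                       # visit samples in random order
    for start in range(0, len(order), batch_size):
        yield [samples[i] for i in order[start:start + batch_size]]


rng = random.Random(0)                       # fixed seed for repeatability
data = list(range(250))                      # stand-in for 250 training samples
sizes = [len(batch) for batch in batches(data, 100, rng)]
print(sizes)
```

Small batches keep each training step cheap, while still giving a reasonable estimate of the gradient over the full training set.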
