Handwritten Digit Classification with a Convolutional Neural Network


Mr Robot


Category: Deep Learning. Tags: cnn, classification, tensorflow

Article link: https://blog.csdn.net/leva345/article/details/126175409


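This post walks through building a convolutional neural network (CNN) in TensorFlow to recognize handwritten digits from the MNIST dataset. It first imports the required libraries and the MNIST data, then inspects the data and visualizes a few samples. Next it defines the network parameters, the convolution, pooling, and fully connected layers, and a dropout layer. The Adam optimizer minimizes a cross-entropy loss, and accuracy is computed. Finally, the model is trained and evaluated on the test set, and the loss and accuracy are plotted against the iteration count.

Summary generated automatically by CSDN.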


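The best reason to learn is to escape mediocrity: start a day earlier and life is that much richer; start a day later and mediocrity weighs on you one more day.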

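Dataset: MNIST consists of 70,000 images of handwritten digits, 60,000 for training and 10,000 for testing. The TensorFlow loader used below further splits the training portion into 55,000 training images and 5,000 validation images.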

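1. Import tensorflow, matplotlib, random, and numpy. Then load the MNIST dataset with one-hot encoded labels. Note that the code in this post targets the TensorFlow 1.x API, which ships a built-in helper for MNIST that is used here: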

import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
import random as ran
# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
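
To make the one_hot=True flag concrete, here is a minimal pure-Python sketch of one-hot encoding. The helper names one_hot and from_one_hot are hypothetical and are not part of TensorFlow; the loader performs the equivalent conversion internally.

```python
def one_hot(label, num_classes=10):
    """Return a list with 1.0 at index `label` and 0.0 everywhere else."""
    if not 0 <= label < num_classes:
        raise ValueError("label must be in [0, num_classes)")
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def from_one_hot(vector):
    """Recover the class index: the position of the largest entry."""
    return max(range(len(vector)), key=lambda i: vector[i])


encoded = one_hot(3)
print(encoded)                # 1.0 at index 3, 0.0 elsewhere
print(from_one_hot(encoded))  # 3
```

The argmax step in from_one_hot is the same operation the article later applies with tf.argmax to compare predictions against labels.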

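2. Taking a close look at the data helps in understanding the MNIST dataset. Check how many images are in the training set and how many are in the test set, and visualize a few digits to see how they are represented. The output gives a visual sense of how hard handwritten digit recognition can be, even for a human.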

def train_size(num):
    print ('Total Training Images in Dataset = ' + str(mnist.train.images.shape))
    print ('--------------------------------------------------')
    x_train = mnist.train.images[:num,:]
    print ('x_train Examples Loaded = ' + str(x_train.shape))
    y_train = mnist.train.labels[:num,:]
    print ('y_train Examples Loaded = ' + str(y_train.shape))
    print('')
    return x_train, y_train

def test_size(num):
    print ('Total Test Examples in Dataset = ' + str(mnist.test.images.shape))
    print ('--------------------------------------------------')
    x_test = mnist.test.images[:num,:]
    print ('x_test Examples Loaded = ' + str(x_test.shape))
    y_test = mnist.test.labels[:num,:]
    print ('y_test Examples Loaded = ' + str(y_test.shape))
    return x_test, y_test

def display_digit(num):
    print(y_train[num])
    label = y_train[num].argmax(axis=0)
    image = x_train[num].reshape([28,28])
    plt.title('Example: %d  Label: %d' % (num, label))
    plt.imshow(image, cmap=plt.get_cmap('gray_r'))
    plt.show()

def display_mult_flat(start, stop):
    images = x_train[start].reshape([1,784])
    for i in range(start+1,stop):
        images = np.concatenate((images, x_train[i].reshape([1,784])))
    plt.imshow(images, cmap=plt.get_cmap('gray_r'))
    plt.show()

x_train, y_train = train_size(55000)
display_digit(ran.randint(0, x_train.shape[0] - 1))  # randint includes both endpoints
display_mult_flat(0,400)

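3. Set the learning parameters, including batch_size and display_step. Each MNIST image is 28×28 pixels, so n_input=784; n_classes=10 covers the output digits [0-9]; and dropout=0.85 is the probability of keeping a unit: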

# Parameters
learning_rate = 0.001
training_iters = 500
batch_size = 128
display_step = 10
# Network Parameters
n_input = 784 # MNIST data input (img shape: 28*28)
n_classes = 10 # MNIST total classes (0-9 digits)
dropout = 0.85 # Dropout, probability to keep units
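
As a rough sanity check on these settings, the following sketch works out how much data the training loop will see. It assumes the 55,000-image training split that the loader provides (the same number passed to train_size above); the variable names examples_seen, epochs, and num_evaluations are introduced here for illustration only.

```python
training_iters = 500
batch_size = 128
display_step = 10
train_examples = 55000  # assumed size of mnist.train

# Each iteration consumes one minibatch.
examples_seen = training_iters * batch_size
# Fraction of the training set covered, in passes (epochs).
epochs = examples_seen / train_examples
# Metrics are recorded once every display_step iterations.
num_evaluations = training_iters // display_step

print(examples_seen)     # 64000
print(round(epochs, 2))  # 1.16
print(num_evaluations)   # 50
```

So 500 iterations amount to only a little more than one pass over the training data, which is worth keeping in mind when reading the accuracy curves.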

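4. Set up the inputs to the TensorFlow computation graph. Define placeholders for the input images and the true labels, plus one for the dropout keep probability: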

x = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_classes])
keep_prob = tf.placeholder(tf.float32)

5. Define a convolution layer with input x, weights W, bias b, and a given stride. The activation function is ReLU and the padding mode is SAME:

def conv2d(x, W, b, strides=1):
    x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='SAME')
    x = tf.nn.bias_add(x, b)
    return tf.nn.relu(x)
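
The effect of the padding mode on the output size can be sketched with a little arithmetic. With SAME padding the spatial size is ceil(size / stride), while VALID padding gives ceil((size - kernel + 1) / stride). The helper names same_out and valid_out below are hypothetical and only illustrate these two formulas.

```python
import math


def same_out(size, stride=1):
    """Output length along one dimension with SAME padding."""
    return math.ceil(size / stride)


def valid_out(size, kernel, stride=1):
    """Output length along one dimension with VALID (no) padding."""
    return math.ceil((size - kernel + 1) / stride)


print(same_out(28))      # 28: a stride-1 SAME convolution keeps 28x28
print(valid_out(28, 5))  # 24: a 5x5 kernel without padding would shrink it
print(same_out(28, 2))   # 14: stride 2 halves the size
```

This is why the 5×5 convolutions in this network leave the image size unchanged and only the pooling layers reduce it.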

6. Define a max-pooling layer with input x, a k×k pooling window (ksize), a stride of k, and SAME padding:

def maxpool2d(x, k=2):
    return tf.nn.max_pool(x, ksize=[1, k, k, 1], strides=[1, k, k, 1],
                          padding='SAME')
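
To show what max pooling does to the values, here is a minimal pure-Python sketch on a single-channel 4×4 grid. The function maxpool2d_list is a hypothetical illustration, not the TensorFlow op, and it assumes the grid dimensions are divisible by k (true for the 28→14→7 sizes in this network), so no padding is needed.

```python
def maxpool2d_list(grid, k=2):
    """Max-pool a 2-D list with a k x k window and stride k.

    Assumes both dimensions of `grid` are divisible by k.
    """
    rows, cols = len(grid), len(grid[0])
    if rows % k or cols % k:
        raise ValueError("grid dimensions must be divisible by k")
    pooled = []
    for r in range(0, rows, k):
        pooled_row = []
        for c in range(0, cols, k):
            # Collect the k x k window and keep only its largest value.
            window = [grid[i][j]
                      for i in range(r, r + k)
                      for j in range(c, c + k)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


grid = [[1, 3, 2, 0],
        [4, 2, 1, 1],
        [0, 0, 5, 6],
        [1, 2, 7, 8]]
print(maxpool2d_list(grid))  # [[4, 2], [2, 8]]
```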

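7. Define the convnet: two convolution layers (each followed by max pooling), then a fully connected layer, a dropout layer, and finally the output layer: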

def conv_net(x, weights, biases, dropout):
    # reshape the input picture
    x = tf.reshape(x, shape=[-1, 28, 28, 1])

    # First convolution layer
    conv1 = conv2d(x, weights['wc1'], biases['bc1'])
    # Max Pooling used for downsampling
    conv1 = maxpool2d(conv1, k=2)

    # Second convolution layer
    conv2 = conv2d(conv1, weights['wc2'], biases['bc2'])
    # Max Pooling used for downsampling
    conv2 = maxpool2d(conv2, k=2)

    # Reshape conv2 output to match the input of the fully connected layer
    fc1 = tf.reshape(conv2, [-1, weights['wd1'].get_shape().as_list()[0]])

    # Fully connected layer
    fc1 = tf.add(tf.matmul(fc1, weights['wd1']), biases['bd1'])
    fc1 = tf.nn.relu(fc1)
    
    # Dropout
    fc1 = tf.nn.dropout(fc1, dropout)

    # Output the class prediction
    out = tf.add(tf.matmul(fc1, weights['out']), biases['out'])
    return out
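
The dropout step can be illustrated without TensorFlow. In the 1.x API, tf.nn.dropout takes the keep probability, zeroes the units that are not kept, and scales the kept ones by 1/keep_prob so the expected activation stays the same. The sketch below mimics that behavior on a plain list; the function name dropout_list and the seeded random.Random instance are assumptions made for this example.

```python
import random


def dropout_list(values, keep_prob, rng):
    """Keep each value with probability keep_prob and scale it by 1/keep_prob."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    return [v / keep_prob if rng.random() < keep_prob else 0.0
            for v in values]


rng = random.Random(0)  # seeded only to make the sketch repeatable
activations = [1.0] * 10000
dropped = dropout_list(activations, 0.85, rng)

kept_fraction = sum(1 for v in dropped if v != 0.0) / len(dropped)
mean_value = sum(dropped) / len(dropped)
print(round(kept_fraction, 2))  # close to the keep probability
print(round(mean_value, 2))     # close to the original mean of 1.0
```

Feeding keep_prob=1 at evaluation time, as the training loop does, turns this step into the identity.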

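8. Define the weights and biases of the network layers. The first conv layer has a 5×5 kernel, 1 input channel, and 32 output channels. The second conv layer has a 5×5 kernel, 32 input channels, and 64 output channels. The fully connected layer has 7×7×64 inputs and 1024 outputs, and the output layer has 1024 inputs and 10 outputs, one per digit class. All weights and biases are initialized with tf.random_normal: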

weights = {
    # 5x5 conv, 1 input, and 32 outputs
    'wc1': tf.Variable(tf.random_normal([5, 5, 1, 32])),
    # 5x5 conv, 32 inputs, and 64 outputs
    'wc2': tf.Variable(tf.random_normal([5, 5, 32, 64])),
    # fully connected, 7*7*64 inputs, and 1024 outputs
    'wd1': tf.Variable(tf.random_normal([7*7*64, 1024])),
    # 1024 inputs, 10 outputs for class digits
    'out': tf.Variable(tf.random_normal([1024, n_classes]))
}

biases = {
    'bc1': tf.Variable(tf.random_normal([32])),
    'bc2': tf.Variable(tf.random_normal([64])),
    'bd1': tf.Variable(tf.random_normal([1024])),
    'out': tf.Variable(tf.random_normal([n_classes]))
}
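
The 7×7×64 figure and the overall model size can be checked with a short calculation. The sketch below copies the shapes from the dictionaries above into plain lists; num_elements, flat_inputs, and total_params are names introduced here for illustration.

```python
import math


def num_elements(shape):
    """Product of all dimensions in a shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count


weight_shapes = {
    'wc1': [5, 5, 1, 32],
    'wc2': [5, 5, 32, 64],
    'wd1': [7 * 7 * 64, 1024],
    'out': [1024, 10],
}
bias_shapes = {'bc1': [32], 'bc2': [64], 'bd1': [1024], 'out': [10]}

# Two 2x2 max-pooling layers each halve the 28x28 image: 28 -> 14 -> 7.
side = 28
for _ in range(2):
    side = math.ceil(side / 2)
flat_inputs = side * side * 64  # 64 channels after the second conv layer

total_params = (sum(num_elements(s) for s in weight_shapes.values())
                + sum(num_elements(s) for s in bias_shapes.values()))

print(side)          # 7
print(flat_inputs)   # 3136
print(total_params)  # 3274634
```

Almost all of the roughly 3.3 million parameters sit in the first fully connected layer (wd1), not in the convolution kernels.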

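9. Build the convnet from the given weights and biases. Define the loss with softmax_cross_entropy_with_logits and minimize it with the Adam optimizer. Then define the accuracy computation: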

pred = conv_net(x, weights, biases, keep_prob)

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

init = tf.global_variables_initializer()
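
For readers who want to see what the loss and accuracy expressions compute, here is a small pure-Python sketch on made-up logits. It is an illustration of the math only, not the TensorFlow implementation; the function names and the three-class example data are assumptions.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, one_hot_label):
    """Negative log-probability assigned to the true class."""
    probs = softmax(logits)
    return -sum(t * math.log(p)
                for t, p in zip(one_hot_label, probs) if t > 0)


def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(batch_logits, batch_labels):
    """Fraction of rows where the predicted class matches the label."""
    hits = [argmax(l) == argmax(y)
            for l, y in zip(batch_logits, batch_labels)]
    return sum(hits) / len(hits)


batch_logits = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3], [1.2, 0.2, 0.1]]
batch_labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Mean loss over the batch, mirroring tf.reduce_mean over per-example losses.
cost_value = sum(cross_entropy(l, y)
                 for l, y in zip(batch_logits, batch_labels)) / len(batch_logits)
print(cost_value)
print(accuracy(batch_logits, batch_labels))  # two of three rows are correct
```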

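10. Launch the computation graph and iterate training_iters times, feeding batch_size examples to the optimizer at each step. Note that training uses mnist.train, the training split of the MNIST dataset. Every display_step iterations, the loss and accuracy on the current minibatch are computed, and accuracy is also measured on the mnist.test images, this time without dropout (keep_prob set to 1):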

train_loss = []
train_acc = []
test_acc = []

with tf.Session() as sess:
    sess.run(init)
    step = 1
    while step <= training_iters:
        batch_x, batch_y = mnist.train.next_batch(batch_size)
        sess.run(optimizer, feed_dict={x: batch_x, y: batch_y,
                                       keep_prob: dropout})
        if step % display_step == 0:
            loss_train, acc_train = sess.run([cost, accuracy], 
                                             feed_dict={x: batch_x,
                                                        y: batch_y,
                                                        keep_prob: 1.})
            print("Iter " + str(step) + ", Minibatch Loss= " +
                  "{:.2f}".format(loss_train) + ", Training Accuracy= " +
                  "{:.2f}".format(acc_train))
    
            # Calculate accuracy for mnist test images. 
            # Note that in this case no dropout
            acc_test = sess.run(accuracy,
                                feed_dict={x: mnist.test.images,
                                           y: mnist.test.labels,
                                           keep_prob: 1.})

            # Report the test accuracy (the original printed acc_train here)
            print("Testing Accuracy:" +
                  "{:.2f}".format(acc_test))
    
            train_loss.append(loss_train)
            train_acc.append(acc_train)
            test_acc.append(acc_test)
            
        step += 1

Plot the softmax loss at each recorded iteration, along with the training and test accuracy:

# Metrics were recorded at steps display_step, 2*display_step, ..., training_iters
eval_indices = range(display_step, training_iters + 1, display_step)
# Plot loss over time
plt.plot(eval_indices, train_loss, 'k-')
plt.title('Softmax Loss per iteration')
plt.xlabel('Iteration')
plt.ylabel('Softmax Loss')
plt.show()
# Plot train and test accuracy
plt.plot(eval_indices, train_acc, 'k-', label='Train Set Accuracy')
plt.plot(eval_indices, test_acc, 'r--', label='Test Set Accuracy')
plt.title('Train and Test Accuracy')
plt.xlabel('Iteration')
plt.ylabel('Accuracy')
plt.legend(loc='lower right')
plt.show()
