Getting Started with TensorFlow: Handwritten Digit Recognition with a CNN (model construction, evaluation, testing, and details of the tensor APIs)

cooooove

Category: TensorFlow study notes

Article link: https://blog.csdn.net/qq_31652463/article/details/102683447

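This post builds a deep convolutional neural network that recognizes handwritten digits from the MNIST dataset. The docstrings and comments in the code describe the usage and parameters of each TensorFlow API involved.

A few points deserve attention: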

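(1) The difference between Session and InteractiveSession

https://blog.csdn.net/ligaofei521/article/details/78646268 explains this question clearly.

The point to remember is that an InteractiveSession installs itself as the default session when it is created, so operation.run() and tensor.eval() can be called directly without passing a session. With a plain Session you call sess.run(...) instead, or make the session the default by using it in a with block.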
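A minimal sketch of this difference, assuming a TensorFlow install that exposes the tensorflow.compat.v1 API (1.14 or later, including 2.x). The variable w and the tensor y are made up for this illustration and are not part of the model below.

```python
import tensorflow.compat.v1 as tf  # assumption: TF >= 1.14 or 2.x with compat.v1

tf.disable_eager_execution()  # graph mode, as in the TF 1.x script in this post

w = tf.Variable(3.0)  # hypothetical variable, for illustration only
y = w * 2.0           # hypothetical tensor that depends on w

# A plain Session is not installed as the default session on construction,
# so everything is evaluated through sess.run().
sess = tf.Session()
sess.run(tf.global_variables_initializer())
plain_result = sess.run(y)
sess.close()

# An InteractiveSession becomes the default session as soon as it is created,
# so Operation.run() and Tensor.eval() work without naming the session.
isess = tf.InteractiveSession()
tf.global_variables_initializer().run()  # Operation.run()
interactive_result = y.eval()            # Tensor.eval()
isess.close()

print(plain_result, interactive_result)
```

Calling y.eval() after only a plain tf.Session() has been constructed, and outside any with block, raises an error because no default session is registered.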

(2) The functions used for model evaluation, and their parameters, are documented in the comments in the code.
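To make the evaluation step concrete, here is a small sketch of the argmax / equal / mean pattern on a hand-made batch. It uses NumPy as a stand-in for the corresponding tf.argmax, tf.equal, tf.cast and tf.reduce_mean calls, and the probabilities and labels are invented for the example.

```python
import numpy as np

# Made-up softmax outputs for 4 samples and 3 classes (one row per sample).
y_conv = np.array([[0.1, 0.7, 0.2],
                   [0.8, 0.1, 0.1],
                   [0.2, 0.3, 0.5],
                   [0.6, 0.3, 0.1]])

# Made-up one-hot labels for the same 4 samples.
y_true = np.array([[0, 1, 0],
                   [1, 0, 0],
                   [0, 1, 0],
                   [1, 0, 0]])

# axis=1 picks the index of the largest value in each row, i.e. per sample.
predicted = np.argmax(y_conv, axis=1)
actual = np.argmax(y_true, axis=1)

# Element-wise comparison gives one boolean per sample.
correct = predicted == actual

# Casting to float and averaging gives the fraction of correct predictions.
accuracy = correct.astype(np.float32).mean()

print(predicted.tolist(), actual.tolist(), float(accuracy))
```

Three of the four rows match here, so the accuracy is 0.75. Using axis=0 instead would compare columns, which is not what a per-sample prediction needs.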

# -*- coding: utf-8 -*-
import tensorflow as tf  # written against the TensorFlow 1.x graph API

from tensorflow.examples.tutorials.mnist import input_data  # this module ships with TensorFlow 1.x only

def load_dataSet():
    return input_data.read_data_sets("MNIST_data/", one_hot=True)

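# Define a weight variable; shape gives the dimensions of the tensor
def weight_var(shape):
    '''
        tf.truncated_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)
        Parameters:
        shape: a 1-D integer tensor or Python list giving the shape of the output tensor.
        mean: mean of the normal distribution.
        stddev: standard deviation of the normal distribution.
        dtype: type of the output.
        seed: an integer; when set, the same random values are generated each time.
        name: name of the operation.
    '''
    return tf.Variable(tf.truncated_normal(shape, stddev = 0.1))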

# Define a bias variable
def bias_var(shape):
    return tf.Variable(tf.constant(0.1, shape = shape))


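def conv2d(x, w):
    '''
        Convolution slides a kernel across the image to extract features.
        Parameters of tf.nn.conv2d:
        input: the images to convolve, a tensor of shape [batch, in_height, in_width, in_channels], where batch is the number of images, in_height and in_width are the image height and width, and in_channels is the number of channels: 1 for grayscale, 3 for color. For deeper layers it equals the number of feature maps produced by the previous layer.
        filter: the convolution kernels, also a tensor, of shape [filter_height, filter_width, in_channels, out_channels], where filter_height and filter_width are the kernel height and width, in_channels must match the in_channels of input, and out_channels is the number of kernels.
        strides: the step along each dimension of the input, a 1-D vector [1, stride, stride, 1]; the first and last entries must be 1.
        padding: a string, either "SAME" or "VALID", that controls how the borders are handled. "SAME" pads the borders with zeros where needed; "VALID" applies no padding.
        use_cudnn_on_gpu: bool, whether to use cuDNN acceleration; defaults to True.
    '''
    return tf.nn.conv2d(x, w, strides=[1,1,1,1], padding = 'SAME')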


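def max_pool_2x2(x):
    '''
        Pooling downsamples the feature maps, which reduces the number of parameters to learn in later layers and the complexity of the network.
        Parameters of tf.nn.max_pool:
        value: a 4-D Tensor in the layout given by data_format.
        ksize: a 1-D integer tensor with 4 elements; the window size for each dimension of the input.
        strides: a 1-D integer tensor with 4 elements; the stride of the sliding window for each dimension of the input.
        padding: a string, either 'VALID' or 'SAME'; the padding algorithm.
        data_format: a string; 'NHWC', 'NCHW' and 'NCHW_VECT_C' are supported.
        name: optional name for the operation.
    '''
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')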

def builtCNN():
    # Placeholders for the input images and the one-hot labels
    x = tf.placeholder("float", shape=[None, 784])
    y_ = tf.placeholder("float", shape=[None, 10])
    # First convolutional layer
    w_conv1 = weight_var([5,5,1,32])
    b_conv1 = bias_var([32])
    x_image = tf.reshape(x, [-1,28,28,1])
    h_conv1 = tf.nn.relu(conv2d(x_image, w_conv1) + b_conv1)
    h_pool1 = max_pool_2x2(h_conv1)
    
    # Second convolutional layer
    w_conv2 = weight_var([5,5,32,64])
    b_conv2 = bias_var([64])
    h_conv2 = tf.nn.relu(conv2d(h_pool1,w_conv2) + b_conv2)
    h_pool2 = max_pool_2x2(h_conv2)
    
    # Densely connected (fully connected) layer
    w_fc1 = weight_var([7*7*64,1024])
    b_fc1 = bias_var([1024])
    h_pool2_flat = tf.reshape(h_pool2, [-1,7*7*64])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, w_fc1) + b_fc1)
    
    # Dropout; keep_prob is the probability of keeping each unit
    keep_prob = tf.placeholder("float")
    h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
    
    # Output layer
    W_fc2 = weight_var([1024, 10])
    b_fc2 = bias_var([10])
    y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
    
    sess = tf.InteractiveSession()
#    sess = tf.Session()
    # train and eval 
    mnist = load_dataSet()
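    cross_entropy = -tf.reduce_sum(y_*tf.log(y_conv)) # sum over all elements -> cross-entropy over the batch
    train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy) # Adam optimizer that minimizes the cross-entropy
    # tf.argmax(input, axis) returns the index of the largest value along axis: axis = 0 gives one index per column, axis = 1 gives one index per row
    correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_, 1)) # element-wise equality of predicted and true class indices
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float")) # fraction of correct predictions, i.e. the accuracy
    
    sess.run(tf.global_variables_initializer()) # replaces the deprecated tf.initialize_all_variables()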
    for i in range(20000):
        batch = mnist.train.next_batch(50)
        if i%100 == 0:
            train_accuracy = accuracy.eval(feed_dict={x:batch[0], y_: batch[1], keep_prob: 1.0})
            print ("step %d, training accuracy %g"%(i, train_accuracy))
        sess.run(train_step, feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
    
    # Evaluate on the MNIST test set; the final accuracy obtained was 99.17%
    print ("test accuracy %g"%accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
    sess.close()
    
    
if __name__ == "__main__":
    builtCNN()
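The 7*7*64 used for the dense layer follows from the layer settings above: with 'SAME' padding the spatial output size is ceil(input / stride), so each stride-1 convolution keeps the size and each stride-2 pooling halves it. The sketch below reproduces that arithmetic in plain Python and also counts the trainable parameters. The helper same_padding_output is a name introduced here for illustration and is not a TensorFlow function.

```python
import math


def same_padding_output(size, stride):
    """Spatial output size of a 'SAME'-padded conv or pool: ceil(size / stride)."""
    return math.ceil(size / stride)


size = 28                      # MNIST images are 28x28 with 1 channel
shapes = [(size, size, 1)]
for out_channels in (32, 64):  # the two conv + pool stages in builtCNN()
    size = same_padding_output(size, 1)  # 5x5 conv, stride 1: size unchanged
    size = same_padding_output(size, 2)  # 2x2 max pool, stride 2: size halved
    shapes.append((size, size, out_channels))

flat_size = size * size * 64   # input width of the dense layer

# Weights plus biases for each layer, using the shapes passed to weight_var/bias_var.
param_counts = {
    "conv1": 5 * 5 * 1 * 32 + 32,
    "conv2": 5 * 5 * 32 * 64 + 64,
    "fc1": flat_size * 1024 + 1024,
    "fc2": 1024 * 10 + 10,
}
total_params = sum(param_counts.values())

print(shapes)        # [(28, 28, 1), (14, 14, 32), (7, 7, 64)]
print(flat_size)     # 3136
print(total_params)  # 3274634
```

Almost all of the parameters sit in the first dense layer, which is why dropout is applied to its output.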

The above is for study and reference only; corrections are welcome.
