TensorFlow MNIST Dataset Explained

愤怒的Tudou

Tags: TensorFlow, deep learning

Article link: https://blog.csdn.net/aiiyouwei/article/details/80588590

Handwritten digit recognition with a three-layer fully connected network

The model is a three-layer network with sigmoid activations, and its output is passed through a final softmax layer.
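
To make the output stage concrete, here is a minimal plain-Python sketch of a sigmoid followed by a softmax. It is not the TensorFlow code from this post: the helper names and the `pre_activations` values are hypothetical, chosen only for illustration. It also shows a side effect of this design: because the sigmoid confines the softmax inputs to (0, 1), no class probability over 10 classes can exceed e / (e + 9), about 0.232.

```python
import math

def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(values):
    """Normalize a list of scores into probabilities that sum to 1."""
    shift = max(values)  # subtracting the max avoids overflow in exp
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical pre-activations of the 10 output units: one very confident unit.
pre_activations = [20.0] + [-20.0] * 9
probs = softmax([sigmoid(z) for z in pre_activations])

# Upper bound on any probability when the softmax inputs are confined to (0, 1).
bound = math.e / (math.e + 9)
print(round(max(probs), 4), round(bound, 4))
```

The predicted class (the argmax) is unaffected by this bound, but the probabilities stay far from 0 and 1, which keeps the cross-entropy loss from getting close to zero.
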
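I ran into quite a few pitfalls along the way. To summarize:
1. Low accuracy may be caused by a batch_size that is set too small, which makes training slow. As a rule of thumb, a larger mini-batch size works better, since each gradient estimate is less noisy.
2. Choosing the learning rate for gradient descent: when the learning rate is too large, the updates tend to overshoot the minimum of the loss function, which lowers accuracy.
3. Accuracy stuck at 0.098 throughout training
  - At first I assumed overfitting was the cause, but changing the batch_size, the number of epochs, and so on did not fix it.
  - I then followed another blogger's answer, which says that for a manually downloaded dataset the pixel values must be divided by 255, otherwise accuracy stays at 0.098. That did not solve the problem either. Link:
    http://www.cnblogs.com/medsci/p/8073377.html
  - Later I found a more reliable answer on Stack Overflow that cleared things up. Link:
    https://stackoverflow.com/questions/33712178/tensorflow-nan-bug?newreg=c7e31a867765444280ba3ca50b657a07
    The cause:
    The softmax output can contain exact zeros, so the loss evaluates 0*log(0), which is NaN, and gradient descent stops working. Wrapping the prediction in tf.clip_by_value(V, min, max) (which clips V to the range [min, max]) solved the problem: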
  cross_entropy = -tf.reduce_sum(y_*tf.log(tf.clip_by_value(y,1e-10,1.0)))
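
The failure mode and the fix can be reproduced without TensorFlow. The sketch below is a plain-Python stand-in, not the code from this post: the `cross_entropy` helper and its `eps` parameter are hypothetical names, and `log(0)` is mapped to negative infinity by hand to mimic what `tf.log` returns for a zero input.

```python
import math

def cross_entropy(y_true, y_pred, eps=None):
    """Sum of -t * log(p); if eps is given, clip p into [eps, 1.0] first."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        if eps is not None:
            p = min(max(p, eps), 1.0)  # same idea as tf.clip_by_value(y, eps, 1.0)
        # math.log(0) raises in plain Python, so mimic tf.log(0) = -inf by hand.
        log_p = math.log(p) if p > 0.0 else float("-inf")
        total += -t * log_p
    return total

# One-hot label and a saturated prediction that contains exact zeros.
label = [0.0, 1.0, 0.0]
prediction = [0.0, 1.0, 0.0]

unclipped = cross_entropy(label, prediction)           # 0 * -inf gives NaN
clipped = cross_entropy(label, prediction, eps=1e-10)  # finite after clipping

print(unclipped, clipped)
```

Note that the prediction here is actually correct, yet the unclipped loss is still NaN, because the zero entries of the label are multiplied by an infinite logarithm. Once the loss is NaN, every gradient computed from it is NaN as well.
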

# coding=utf-8

#### libraries

# third-party libraries
import tensorflow as tf

# Load the data (this tutorial loader is not available in TensorFlow 2.x)
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

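# Helpers that create the parameters of each layer
def set_w(shape):
    # Weights: truncated normal with a small standard deviation
    init = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(init)

def set_b(shape):
    init = tf.zeros(shape) + 0.1  # a small positive bias of 0.1 trained better
    return tf.Variable(init)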

#### Graph construction phase
## Model
# Inputs: flattened 28x28 images and one-hot labels
xs = tf.placeholder(tf.float32, [None, 784])
ys = tf.placeholder(tf.float32, [None, 10])

# Hidden layer: 30 neurons with sigmoid activation
w_1 = set_w([784, 30])
b_1 = set_b([30])

# output=tf.nn.softmax(tf.matmul(xs,w_1)+b_1)

h_1 = tf.nn.sigmoid(tf.matmul(xs, w_1) + b_1)

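# Output layer: 10 units, one per digit class
w_2 = set_w([30, 10])
b_2 = set_b([10])
# Note: this sigmoid bounds the softmax inputs to (0, 1); feeding the raw
# affine output to softmax is the more common choice.
h_2 = tf.nn.sigmoid(tf.matmul(h_1, w_2) + b_2)

# Turn the outputs into class probabilities with softmax
y = tf.nn.softmax(h_2)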

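## Training setup
# Loss: cross-entropy, summed over classes and averaged over the batch.
# The prediction is clipped so that log(0) can never occur.
loss = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(tf.clip_by_value(y, 1e-10, 1.0)), axis=1))
# Training step: SGD with a learning rate of 0.5
train = tf.train.GradientDescentOptimizer(0.5).minimize(loss)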

## Initialize all variables
init = tf.global_variables_initializer()

## Evaluation ops, built once so the graph does not grow inside the loop
correct = tf.equal(tf.argmax(y, 1), tf.argmax(ys, 1))
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

#### Graph execution phase
with tf.Session() as sess:
    sess.run(init)
    # Run 1000 training steps
    for i in range(1000):
        batch_x, batch_y = mnist.train.next_batch(10)
        sess.run(train, feed_dict={xs: batch_x, ys: batch_y})
        if i % 50 == 0:
            # Evaluate on the full test set every 50 steps
            test_x, test_y = mnist.test.images, mnist.test.labels
            print(i, ":", sess.run(accuracy, feed_dict={xs: test_x, ys: test_y}))
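
The evaluation step compares the argmax of each prediction with the argmax of its one-hot label. The plain-Python sketch below mirrors that computation and offers one plausible reading of the stuck value 0.098: it is exactly what a model scores if it predicts digit 0 for every test image, since the standard MNIST test set holds 980 zeros among its 10,000 images. That the broken model collapsed to class 0 is an assumption, not something the original run confirms, and the helper names are hypothetical.

```python
def argmax(row):
    """Index of the largest entry in a list."""
    return max(range(len(row)), key=lambda i: row[i])

def accuracy(predictions, one_hot_labels):
    """Fraction of rows whose predicted class matches the labelled class."""
    hits = sum(argmax(p) == argmax(t) for p, t in zip(predictions, one_hot_labels))
    return hits / len(one_hot_labels)

def one_hot(digit, num_classes=10):
    """One-hot encode a digit label as a list of floats."""
    return [1.0 if i == digit else 0.0 for i in range(num_classes)]

# Number of test images per digit 0..9 in the standard 10,000-image MNIST test set.
test_counts = [980, 1135, 1032, 1010, 982, 892, 958, 1028, 974, 1009]
labels = [one_hot(d) for d, n in enumerate(test_counts) for _ in range(n)]

# A degenerate model that outputs the same scores, favouring class 0, for every image.
constant_prediction = [1.0] + [0.0] * 9
predictions = [constant_prediction] * len(labels)

print(accuracy(predictions, labels))
```

An accuracy that sits at a single class frequency and never moves is therefore a hint that the outputs no longer depend on the input, which is consistent with a loss that has turned into NaN.
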
