6. Fully Connected Neural Network: Handwritten Digit (MNIST) Recognition (2)

不要瞎胡闹

Category: Deep Learning & Tensorflow. Tags: tensorflow, neural networks, deep learning

Article link: https://blog.csdn.net/SDFJXVC/article/details/103440153

1. Fully Connected Neural Network

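A fully connected neural network is simple at heart: it places one or more so-called hidden layers between the input layer and the output layer, as shown in the figure:
[Figure: network with a hidden layer between the input and output layers]
How many hidden layers you add is up to you. You can even add none, but accuracy will generally be lower.
The first MNIST post used no hidden layer, just a simple single-layer network. This post adds one hidden layer, so the model can be called a fully connected neural network.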
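
To get a feel for what one hidden layer adds, here is a rough parameter count. It is only a sketch: the helper `dense_params` is a made-up name for illustration, and the layer sizes (784 inputs, 300 hidden units, 10 classes) are the ones used in the code later in this post.

```python
def dense_params(n_in, n_out):
    """Number of weights plus biases in one fully connected layer."""
    return n_in * n_out + n_out

# Sizes taken from the code later in this post: 784 pixels, 300 hidden units, 10 classes.
no_hidden = dense_params(784, 10)                             # input -> output directly
one_hidden = dense_params(784, 300) + dense_params(300, 10)   # input -> hidden -> output

print(no_hidden)   # 7850
print(one_hidden)  # 238510
```

The hidden layer makes the model roughly 30 times larger, which is one reason regularization becomes useful here.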

2. Code Implementation

2.1

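The code below takes the code from the first post and adds one hidden layer, along with L2 regularization and an exponentially decaying learning rate. Dropout is left for next time.
This time the test accuracy is higher than in the first post. Feel free to change the learning-rate decay, the regularization, and the hidden layer to your liking.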

# This script relies on TensorFlow 1.x APIs
# (tf.placeholder, tf.contrib, tensorflow.examples.tutorials).
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data/', one_hot=True)

# Placeholders for the images and the labels.
# None means the first dimension (the batch size) can have any length.
x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])


# Weight and bias variables: 784 inputs -> 300 hidden units -> 10 classes
W1 = tf.Variable(tf.random_normal([784, 300]))
b1 = tf.Variable(tf.constant(0.1, shape=[300]))
W2 = tf.Variable(tf.random_normal([300, 10]))
b2 = tf.Variable(tf.constant(0.1, shape=[10]))


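# Hidden layer
def hidden_layer(inputs):
    # The hidden layer needs an activation function (ReLU here)
    return tf.nn.relu(tf.matmul(inputs, W1) + b1)

# Output of the hidden layer
h0 = hidden_layer(x)
# Prediction (logits). No activation here, because TensorFlow's built-in
# softmax cross-entropy function below applies softmax itself.
pred = tf.matmul(h0, W2) + b2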

# Cross-entropy (one value per example)
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=pred)

# L2 regularization on both weight matrices
regularizer = tf.contrib.layers.l2_regularizer(0.0001)
regularization = regularizer(W1) + regularizer(W2)

# Total loss: mean cross-entropy plus the L2 penalty
loss = tf.reduce_mean(cross_entropy) + regularization

# Initial learning rate
learning_rate = 0.9
# Decay rate of the learning rate
learning_rate_decay = 0.7
# Current training epoch, fed in as the global step of the decay schedule
train_step = tf.placeholder(tf.int32)
# Decay period. With staircase=True the rate drops once every decay_step epochs;
# with staircase=False it decays a little at every epoch.
decay_step = 3

learning_rate1 = tf.train.exponential_decay(
    learning_rate, global_step=train_step, decay_steps=decay_step,
    decay_rate=learning_rate_decay, staircase=True)

# Optimizer
opt = tf.train.GradientDescentOptimizer(learning_rate1).minimize(loss)

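# Training parameters
# Total number of training epochs
train_epochs = 25
# Number of images per training step
batch_size = 100
# Number of training steps needed to go through all images once
total_batch = int(mnist.train.num_examples / batch_size)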

# Start the session
with tf.Session() as sess:

    # Initialize the variables (required)
    sess.run(tf.global_variables_initializer())

    print("Training loss per epoch:")
    # Training loop
    for epoch in range(train_epochs):
        # Average loss over this epoch
        avg_loss = 0.
        # Loop over all the data
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            _, c, l_rate = sess.run([opt, loss, learning_rate1],
                                    feed_dict={x: batch_xs, y: batch_ys, train_step: epoch})

            # Accumulate the average loss
            avg_loss += c / total_batch

        # Show the result of each epoch
        print('Epoch:', epoch + 1, ',     loss:', '{:.9f}'.format(avg_loss), ',     learning_rate:', '{:.3f}'.format(l_rate))

    print("\nTraining finished. Test accuracy:")

    # Evaluate the model on the test set
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
    # Accuracy: fraction of correctly classified test images
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    print('Accuracy:', accuracy.eval({x: mnist.test.images, y: mnist.test.labels}))
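
The code above leaves `pred` as raw logits because `tf.nn.softmax_cross_entropy_with_logits` applies softmax internally. As a rough illustration of what that combined step computes for a single example, here is a plain-Python sketch. It is not TensorFlow's implementation, and the function name `softmax_cross_entropy` is made up for this example.

```python
import math

def softmax_cross_entropy(logits, one_hot_label):
    """Cross-entropy between softmax(logits) and a one-hot label, for one example."""
    # Subtract the largest logit before exponentiating to avoid overflow.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # log(softmax_i) = z_i - logsumexp(z), so the loss is -sum(t_i * log(softmax_i)).
    return -sum(t * (z - log_sum_exp) for z, t in zip(logits, one_hot_label))

# Equal logits: the model is undecided between 3 classes, so the loss is log(3).
loss_uniform = softmax_cross_entropy([0.0, 0.0, 0.0], [0, 1, 0])
# A large logit on the correct class gives a loss close to zero.
loss_confident = softmax_cross_entropy([0.0, 10.0, 0.0], [0, 1, 0])

print(loss_uniform)
print(loss_confident)
```

Applying softmax to `pred` yourself and then passing the result to the TensorFlow function would apply softmax twice, which is why the output layer has no activation.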


2.2 Related Code

(1) Hidden layer:

W1 = tf.Variable(tf.random_normal([784, 300]))
b1 = tf.Variable(tf.constant(0.1, shape=[300]))
# Hidden layer
def hidden_layer(inputs):
    # The hidden layer needs an activation function (ReLU here)
    return tf.nn.relu(tf.matmul(inputs, W1) + b1)
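
To see what `relu(x·W1 + b1)` does to the numbers, here is a small plain-Python sketch with toy sizes. The helpers `relu` and `dense` and all the values are made up for illustration; 2 inputs and 3 hidden units stand in for the 784 and 300 used above.

```python
def relu(values):
    """Replace every negative entry with 0."""
    return [max(0.0, v) for v in values]

def dense(x, W, b):
    """Compute x @ W + b for one input row x; W has shape [len(x)][len(b)]."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
            for j in range(len(b))]

# Toy sizes: 2 inputs and 3 hidden units stand in for 784 and 300.
x = [1.0, 2.0]
W1 = [[0.5, -1.0, 0.0],
      [0.25, 0.5, -1.0]]
b1 = [0.1, 0.1, 0.1]

pre_activation = dense(x, W1, b1)   # approximately [1.1, 0.1, -1.9]
h0 = relu(pre_activation)           # the negative third entry becomes 0.0

print(pre_activation)
print(h0)
```

Without the ReLU, two stacked layers would collapse into a single linear map, so the hidden layer would add nothing.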

(2) Regularization:

# L2 regularization on both weight matrices
regularizer = tf.contrib.layers.l2_regularizer(0.0001)
regularization = regularizer(W1) + regularizer(W2)
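
For reference, `tf.contrib.layers.l2_regularizer(scale)` returns `scale * tf.nn.l2_loss(weights)`, and `tf.nn.l2_loss` is half the sum of squared entries. The plain-Python sketch below reproduces that arithmetic on a tiny made-up matrix; the name `l2_penalty` is only for illustration.

```python
def l2_penalty(weights, scale):
    """scale * sum(w**2) / 2 over all entries of a 2-D list of weights."""
    return scale * sum(w * w for row in weights for w in row) / 2.0

# Tiny made-up weight matrix; the post uses 784x300 and 300x10 matrices.
W = [[1.0, -2.0],
     [3.0, 0.0]]

penalty = l2_penalty(W, scale=0.0001)
print(penalty)  # 0.0001 * (1 + 4 + 9) / 2, about 0.0007
```

Because the penalty grows with the squared size of the weights, adding it to the loss nudges the optimizer toward smaller weights.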

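(3) Decaying learning rate:

# Initial learning rate
learning_rate = 0.9
# Decay rate of the learning rate
learning_rate_decay = 0.7
# Current training epoch, fed in as the global step of the decay schedule
train_step = tf.placeholder(tf.int32)
# Decay period. With staircase=True the rate drops once every decay_step epochs;
# with staircase=False it decays a little at every epoch.
decay_step = 3

learning_rate1 = tf.train.exponential_decay(
    learning_rate, global_step=train_step, decay_steps=decay_step,
    decay_rate=learning_rate_decay, staircase=True)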
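
The schedule follows the formula `learning_rate * decay_rate ** (global_step / decay_steps)`, where the division is an integer division when `staircase=True`. Here is a plain-Python sketch of that formula with the values from this post. It mirrors the documented formula rather than TensorFlow's actual code.

```python
def exponential_decay(initial_rate, global_step, decay_steps, decay_rate, staircase=True):
    """Learning rate after global_step steps of exponential decay."""
    if staircase:
        exponent = global_step // decay_steps   # changes only every decay_steps steps
    else:
        exponent = global_step / decay_steps    # changes a little at every step
    return initial_rate * decay_rate ** exponent

# Values from this post: start at 0.9, multiply by 0.7 every 3 epochs.
for epoch in (0, 2, 3, 6, 24):
    print(epoch, round(exponential_decay(0.9, epoch, 3, 0.7), 4))
```

Since the post feeds the epoch index (0 to 24) as the global step, the rate is 0.9 for epochs 0 to 2, 0.63 for epochs 3 to 5, and so on, ending near 0.052 in the last epoch.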