[Data Mining] A First Application of BPNN: the MNIST Dataset

o0o_-_

Category: Data Mining

Article link: https://blog.csdn.net/qq_33446100/article/details/102534544

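Preface

Operating system: win10
Python version: 3.6.3
TensorFlow version: 1.8.0 (GPU build)
Dataset: MNIST
Gripe: I don't even have PyCharm installed

Data preparation

The MNIST dataset
- Address
- Note: TensorFlow ships an API for handling this dataset, apparently meant for learning purposes; the path looks roughly like this (Python36_64\lib\site-packages\tensorflow\contrib\learn\python\learn\datasets)

Processing the data

Using the API

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('./MNIST', one_hot=True)

Processing it yourself
Dataset format: label file format
There are 60000 labels; read them according to the format below and store them in an array.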

[offset] [type]          [value]          [description] 
0000     32 bit integer  0x00000801(2049) magic number (MSB first) 
0004     32 bit integer  60000            number of items 
0008     unsigned byte   ??               label 
0009     unsigned byte   ??               label 
........ 
xxxx     unsigned byte   ??               label
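
The label layout above can be read with a few lines of standard-library Python. This is only a minimal sketch: `parse_idx_labels` is a name I made up for illustration, and it assumes the file has already been decompressed and read into memory as bytes. The demo uses a tiny synthetic file instead of the real one.

```python
import struct


def parse_idx_labels(data: bytes) -> list:
    """Parse the raw bytes of an IDX label file into a list of ints."""
    # Header: two big-endian ("MSB first") unsigned 32-bit integers.
    magic, count = struct.unpack(">II", data[:8])
    if magic != 2049:
        raise ValueError("unexpected magic number: %d" % magic)
    # Every following byte is one label (0-9).
    labels = list(data[8:8 + count])
    if len(labels) != count:
        raise ValueError("file is truncated")
    return labels


# Synthetic example: a header announcing 3 labels, then the labels 5, 0, 4.
sample = struct.pack(">II", 2049, 3) + bytes([5, 0, 4])
print(parse_idx_labels(sample))
```

For the real training label file, the same function should return a list of 60000 integers.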

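Dataset format: image file format
There are 60000 images, each with 28x28=784 pixels, and each pixel value lies in [0, 255]. Read them according to the format below into an array of 60000 elements, where each element holds the 784 pixel values of one image.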

[offset] [type]          [value]          [description] 
0000     32 bit integer  0x00000803(2051) magic number 
0004     32 bit integer  60000            number of images 
0008     32 bit integer  28               number of rows 
0012     32 bit integer  28               number of columns 
0016     unsigned byte   ??               pixel 
0017     unsigned byte   ??               pixel 
........ 
xxxx     unsigned byte   ??               pixel
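
The image file can be read the same way; the only difference is the 16-byte header carrying the row and column counts. Again a sketch under assumptions: `parse_idx_images` is a hypothetical helper, the input is the already-decompressed file content, and the demo uses two synthetic 2x2 "images" rather than real data.

```python
import struct


def parse_idx_images(data: bytes) -> list:
    """Parse the raw bytes of an IDX image file into a list of flat pixel lists."""
    # Header: magic number, image count, rows, columns (big-endian uint32 each).
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != 2051:
        raise ValueError("unexpected magic number: %d" % magic)
    size = rows * cols  # 28 * 28 = 784 for MNIST
    pixels = data[16:16 + count * size]
    if len(pixels) != count * size:
        raise ValueError("file is truncated")
    # Cut the byte stream into one flat list of pixel values per image.
    return [list(pixels[i * size:(i + 1) * size]) for i in range(count)]


# Synthetic example: two 2x2 images whose pixels are 0..3 and 4..7.
sample_images = struct.pack(">IIII", 2051, 2, 2, 2) + bytes(range(8))
images = parse_idx_images(sample_images)
print(images)
```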

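Building the network model

Determining the input and output sizes
- Each image is the input and the digit it shows is the output. Each image has 28x28=784 pixels, so the input size is 784; the output is a digit 0-9, so with one_hot encoding the output size is 10.
  one_hot: for example, 1 corresponds to [0, 1, 0, 0, 0, 0, 0, 0, 0, 0], i.e. the position belonging to 1 is set to 1 and every other position is 0
- This gives the input/output model
```
import tensorflow as tf

num_classes = 10  # output size
input_size = 784  # input size

X = tf.placeholder(tf.float32, shape=[None, input_size])
Y = tf.placeholder(tf.float32, shape=[None, num_classes])
```
- About placeholder
  - See here
  - Taking X as an example: X works like a formal parameter. Each sample in X has input_size (784) values, and None means the number of samples fed in is not fixed.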
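
The one_hot encoding described above is easy to reproduce by hand. A small sketch (both function names are mine, not part of TensorFlow):

```python
def one_hot(label, num_classes=10):
    """Return a list with a 1 at index `label` and 0 everywhere else."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range: %r" % label)
    vec = [0] * num_classes
    vec[label] = 1
    return vec


def from_one_hot(vec):
    """Recover the label: the index of the largest entry."""
    return vec.index(max(vec))


print(one_hot(1))
print(from_one_hot(one_hot(7)))
```

Going back from the vector to the digit is the same "index of the largest entry" idea that `tf.argmax` is used for later when computing accuracy.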
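Building the weights
- Assume there is only one hidden layer, with 90 hidden nodes.
  
  How many hidden nodes to use depends on the number of training samples, the amount of noise in them, and the complexity of the pattern they contain. Generally speaking, a complex nonlinear function that fluctuates often and with large amplitude needs more hidden nodes to give the network enough mapping capacity.
  A common way to find the best number of hidden nodes is trial and error: start with a small number of hidden nodes, train the network, then gradually increase the number while training on the same sample set, and pick the number that gives the smallest network error. Some empirical formulas can help here. The values they produce are only rough estimates, but they can serve as the starting point for trial and error.
  A few such formulas:
  (1) m=sqrt(n+l)+a
  (2) m=log2(n)
  (3) m=sqrt(nl)
  where m is the number of hidden nodes, n the number of input nodes, l the number of output nodes, and a a constant between 1 and 10.
- Initializing the weights
  The initial values can be random; they are adjusted later when the model is trained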
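
To get a feeling for the three formulas, here is a quick calculation for this network (n=784 inputs, l=10 outputs). The choice a=5 is just an arbitrary value in the allowed 1-10 range, picked by me for illustration.

```python
import math


def hidden_node_estimates(n, l, a):
    """Evaluate the three empirical formulas for the hidden node count m."""
    return {
        "sqrt(n+l)+a": math.sqrt(n + l) + a,
        "log2(n)": math.log2(n),
        "sqrt(n*l)": math.sqrt(n * l),
    }


estimates = hidden_node_estimates(n=784, l=10, a=5)
for name, value in estimates.items():
    print("%-12s -> %.1f" % (name, value))
```

The three estimates are far apart (roughly 33, 10 and 89), which shows why they are only starting points for trial and error. Formula (3) lands close to the 90 nodes assumed above.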
```
hidden_units_size = 90  # number of hidden nodes assumed above

# Weights start as small random values; shape= gives each node its own bias
W1 = tf.Variable(tf.random_normal([input_size, hidden_units_size], stddev=0.1))
B1 = tf.Variable(tf.constant(0.1, shape=[hidden_units_size]))
W2 = tf.Variable(tf.random_normal([hidden_units_size, num_classes], stddev=0.1))
B2 = tf.Variable(tf.constant(0.1, shape=[num_classes]))

hidden_opt = tf.matmul(X, W1) + B1
hidden_opt = tf.nn.relu(hidden_opt)
final_opt = tf.matmul(hidden_opt, W2) + B2
```
  The first four statements define the weights and biases.
  The variable W1 is an input_size x hidden_units_size (784x90) matrix, and B1 plays the role of θ in the neuron model;
  the variable W2 is a hidden_units_size x num_classes (90x10) matrix, and B2 plays the role of θ in the neuron model.
```
hidden_opt = tf.matmul(X, W1) + B1
# matmul: matrix multiplication
hidden_opt = tf.nn.relu(hidden_opt)
# applies the activation (transfer) function
final_opt = tf.matmul(hidden_opt, W2) + B2
# matrix multiplication again
```
  Written with matrices, the first step (before relu) is:
  $hidden\_opt= \overbrace{ \left[ \begin{matrix} x_1 & x_2 & \cdots & x_{784} \end{matrix} \right] }^\text{X} \overbrace{ \left[ \begin{matrix} w_{1,1} & w_{1,2} & \cdots & w_{1,90} \\ w_{2,1} & w_{2,2} & \cdots & w_{2,90} \\ w_{3,1} & w_{3,2} & \cdots & w_{3,90} \\ \vdots & \vdots & \ddots & \vdots \\ w_{784,1} & w_{784,2} & \cdots & w_{784,90} \end{matrix} \right] }^\text{W1}+ \overbrace{ \left[ \begin{matrix} b_1 & b_2 & \cdots & b_{90} \end{matrix} \right] }^\text{B1}$
  final_opt is computed the same way, using the relu output, W2 and B2
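
To make the matrix form concrete, here is the same forward pass in plain Python on a toy network with 3 inputs, 2 hidden nodes and 2 outputs. All numbers are made up for illustration; only the order of operations (multiply, add bias, relu, multiply, add bias) mirrors the TensorFlow code.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def add_bias(m, bias):
    """Add the bias vector to every row (what broadcasting does in TF)."""
    return [[v + b for v, b in zip(row, bias)] for row in m]


def relu(m):
    """Replace negative entries by 0."""
    return [[max(0.0, v) for v in row] for row in m]


X = [[1.0, 2.0, 3.0]]                          # 1 sample, 3 inputs
W1 = [[1.0, -1.0], [0.0, 1.0], [1.0, -1.0]]    # 3x2
B1 = [0.5, 0.5]
W2 = [[1.0, 2.0], [3.0, 4.0]]                  # 2x2
B2 = [0.0, 1.0]

hidden = relu(add_bias(matmul(X, W1), B1))     # [[4.5, 0.0]]
final = add_bias(matmul(hidden, W2), B2)       # [[4.5, 10.0]]
print(hidden, final)
```

The second hidden node comes out as -1.5 before relu and is clipped to 0, so it contributes nothing to the output layer.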
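- Choosing the loss function
```
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=Y, logits=final_opt))
```
  Cross-entropy loss
  
  As in the figure above: suppose the digit in the image is 1, the model outputs [1, 0, 0, 0, 0, 0, 0, 0, 0, 0], and the true label is [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]; the loss then measures how far the output is from the label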
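
What the loss does to a single sample can be sketched in plain Python: softmax turns the raw outputs (logits) into probabilities, and cross-entropy is the negative log of the probability assigned to the true class. The logit values below are invented for illustration.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(one_hot_label, logits):
    """-log of the probability given to the true class."""
    probs = softmax(logits)
    return -sum(y * math.log(p) for y, p in zip(one_hot_label, probs) if y > 0)


label = [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]                               # the digit is 1
wrong_logits = [5.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]    # model favours 0
right_logits = [0.0, 5.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]    # model favours 1

loss_wrong = cross_entropy(label, wrong_logits)
loss_right = cross_entropy(label, right_logits)
print("wrong: %.4f  right: %.4f" % (loss_wrong, loss_right))
```

The confident wrong answer gets a much larger loss than the confident right one. `tf.reduce_mean` then averages this per-sample loss over the whole batch.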
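- Choosing the optimizer
```
opt = tf.train.GradientDescentOptimizer(0.05).minimize(loss)
```
  Gradient descent is used to minimize the loss;
  as in the figure below, it roughly adjusts the weights so that the loss goes down
  (But how exactly does this optimizer change the weights (does it adjust every Variable?) I don't really understand. Any experts around?)
  Update: for how the adjustment works, see the example in here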
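
On the question of which values get adjusted: as far as I can tell from the TF 1.x documentation, `minimize(loss)` without a `var_list` argument updates every variable in the graph's trainable-variables collection, which here means W1, B1, W2 and B2. Each step moves every such variable a little against its gradient. The sketch below shows that update rule on a made-up two-parameter loss; the loss function and parameter names are only for illustration.

```python
def loss_and_grads(params):
    """Toy loss (w-3)^2 + (b+1)^2 and its gradient for each parameter."""
    w, b = params["w"], params["b"]
    loss = (w - 3.0) ** 2 + (b + 1.0) ** 2
    grads = {"w": 2.0 * (w - 3.0), "b": 2.0 * (b + 1.0)}
    return loss, grads


def gradient_descent_step(params, grads, learning_rate):
    """Update ALL parameters: new = old - learning_rate * gradient."""
    return {name: value - learning_rate * grads[name]
            for name, value in params.items()}


params = {"w": 0.0, "b": 0.0}
for step in range(200):
    loss, grads = loss_and_grads(params)
    params = gradient_descent_step(params, grads, learning_rate=0.05)

print(params, loss)
```

After 200 steps the parameters sit very close to the minimum at w=3, b=-1. In the real network the gradients come from backpropagation instead of a hand-written formula, but the update itself is the same subtraction.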

Training the model

Code

# Initialize all variables
init = tf.global_variables_initializer()

# Compare the predicted class with the true class for every sample
correct_prediction = tf.equal(tf.argmax(Y, 1), tf.argmax(final_opt, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# Create the session
sess = tf.Session()
# Run the initializer
sess.run(init)
# Start iterating (batch_size and training_iterations are set in the full code at the end)
for i in range(training_iterations):
    # Fetch one batch
    batch = mnist.train.next_batch(batch_size)
    # Split it into inputs and labels
    batch_input = batch[0]
    batch_labels = batch[1]
    # Run one step, feeding batch_input to X and batch_labels to Y
    _, training_loss = sess.run([opt, loss],
                                feed_dict={X: batch_input, Y: batch_labels})
    if i % 1000 == 0:
        # Evaluate on the test set (the model in this snippet has no dropout, so K is not fed)
        test_accuracy = sess.run(accuracy,
                                 feed_dict={X: mnist.test.images, Y: mnist.test.labels})
        print("step : %d, test accuracy = %g" % (i, test_accuracy))

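About batch
For example, with batch_size = 100 each training step feeds 100 images at once: the loss is averaged over those 100 images and the weights are updated once per batch.
About tf.equal()
See here
About tf.reduce_mean()
See here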
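
How `tf.argmax`, `tf.equal`, `tf.cast` and `tf.reduce_mean` combine into the accuracy value can be imitated in plain Python. The labels and outputs below are invented (3 classes, 4 samples) just to keep the example small.

```python
def argmax(values):
    """Index of the largest entry, like tf.argmax along one row."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(labels, outputs):
    """Fraction of samples whose predicted class matches the label."""
    # equal -> True/False per sample, cast -> 1.0/0.0, reduce_mean -> average
    correct = [1.0 if argmax(y) == argmax(o) else 0.0
               for y, o in zip(labels, outputs)]
    return sum(correct) / len(correct)


labels = [[0, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 0]]
outputs = [[0.1, 2.0, -1.0],   # predicts class 1, correct
           [3.0, 0.5, 0.2],    # predicts class 0, correct
           [0.9, 0.3, 0.1],    # predicts class 0, wrong (label is 2)
           [0.2, 0.7, 0.1]]    # predicts class 1, correct

acc = accuracy(labels, outputs)
print(acc)
```

Three of the four samples are classified correctly, so the result is 0.75.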

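Tuning

Tuning is a painful process; here I mainly adjust the learning rate and the number of hidden nodes
~~I'm not very good at tuning~~ (20000 iterations)
~~The number of hidden nodes turned out a bit different from what I expected~~
The dropout value here is the keep probability passed to tf.nn.dropout: the lower it is, the slower the model fits and the lower the risk of overfitting

Iterations	Hidden nodes	Learning rate	dropout	accuracy	time/s
20000	90	0.05	none (not set)	0.9736	94.23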
20000	100	0.05	none	0.9729	94.87
20000	70	0.05	none	0.9713	93.61
20000	82	0.05	none	0.9745	97.77
20000	500	0.04	1.0	0.9744	129.99
20000	400	0.04	1.0	0.9750	127.83
20000	300	0.04	1.0	0.9757	114.97
20000	200	0.04	1.0	0.9741	109.36
20000	300	0.05	1.0	0.9771	118.52
30000	300	0.05	0.8	0.9803	174.38
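
What the dropout value does can be sketched without TensorFlow. `tf.nn.dropout(x, keep_prob)` keeps each element with probability keep_prob, scales the kept ones by 1/keep_prob and sets the rest to 0. The helper below is my own imitation of that behaviour, not the library code.

```python
import random


def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob and rescale it, else output 0."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)            # fixed seed so the run is repeatable
activations = [1.0] * 10000

dropped = dropout(activations, keep_prob=0.8, rng=rng)
kept_fraction = sum(1 for v in dropped if v != 0.0) / len(dropped)
mean_after = sum(dropped) / len(dropped)
unchanged = dropout([1.0, 2.0, 3.0], keep_prob=1.0, rng=rng)

print("kept: %.3f  mean: %.3f  keep_prob=1.0: %s" % (kept_fraction, mean_after, unchanged))
```

About 80% of the activations survive, and because the survivors are scaled up the average stays near 1. With keep_prob = 1.0 nothing is dropped, which is the usual setting when measuring accuracy.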

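Results

Screenshot of the run

Visualization: TensorBoard

pip install tensorboard

Generate the log with a writer

writer = tf.summary.FileWriter("./my_graph", sess.graph)
writer.close()

Run in a cmd window

tensorboard --logdir="H:\...\my_graph"

View in the browser

~~Emmmm, I couldn't really make sense of this graph~~

Code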

import time

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Download (if needed) and load MNIST with one_hot labels
mnist = input_data.read_data_sets('./MNIST', one_hot=True)

num_classes = 10          # output size
input_size = 784          # input size (28x28 pixels)
hidden_units_size = 300   # number of hidden nodes
batch_size = 1000
training_iterations = 20000

X = tf.placeholder(tf.float32, shape=[None, input_size])
Y = tf.placeholder(tf.float32, shape=[None, num_classes])
K = tf.placeholder(tf.float32)  # dropout keep probability

# Random initial weights; shape= gives each node its own bias
W1 = tf.Variable(tf.random_normal([input_size, hidden_units_size], stddev=0.1))
B1 = tf.Variable(tf.constant(0.1, shape=[hidden_units_size]))
W2 = tf.Variable(tf.random_normal([hidden_units_size, num_classes], stddev=0.1))
B2 = tf.Variable(tf.constant(0.1, shape=[num_classes]))

# Forward pass: hidden layer, relu, dropout, output layer
hidden_opt = tf.matmul(X, W1) + B1
hidden_opt = tf.nn.relu(hidden_opt)
hidden_drop = tf.nn.dropout(hidden_opt, K)
final_opt = tf.matmul(hidden_drop, W2) + B2

loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=Y, logits=final_opt))

opt = tf.train.GradientDescentOptimizer(0.05).minimize(loss)

init = tf.global_variables_initializer()

correct_prediction = tf.equal(tf.argmax(Y, 1), tf.argmax(final_opt, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# time.clock() is deprecated; perf_counter() measures elapsed time
start = time.perf_counter()

sess = tf.Session()
sess.run(init)
for i in range(training_iterations + 1):
    batch = mnist.train.next_batch(batch_size)
    batch_input = batch[0]
    batch_labels = batch[1]

    # Train with dropout enabled (keep probability 0.8)
    _, training_loss = sess.run([opt, loss], feed_dict={X: batch_input, Y: batch_labels, K: 0.8})
    if i % 1000 == 0:
        # train_accuracy = accuracy.eval(session=sess, feed_dict={X: batch_input, Y: batch_labels, K: 1.0})
        # Evaluate on the test set with dropout disabled (keep probability 1.0)
        test_accuracy = sess.run(accuracy, feed_dict={X: mnist.test.images, Y: mnist.test.labels, K: 1.0})
        print("step : %d, test accuracy = %g" % (i, test_accuracy))

elapsed = time.perf_counter() - start
print("Time used:", elapsed)

# Write the graph so TensorBoard can display it
writer = tf.summary.FileWriter("./my_graph", sess.graph)

writer.close()
sess.close()
