TensorFlow Study Notes (2): Classifying MNIST with Softmax Regression

PhDat101

Article link: https://blog.csdn.net/PhDat101/article/details/52397284

1. A brief introduction to Softmax Regression

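Intuitively, each pixel is multiplied by a weight and a bias is added, which gives a score for each class. The score expresses "how strong the evidence is that the image shows digit i".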

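In the weight visualization (figure not reproduced here), red marks negative contributions and blue marks positive ones. The formula is:

$$\text{evidence}_i = \sum_j W_{i,j}\, x_j + b_i$$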

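Here x is the image, W holds the weights and b is the bias. Softmax computes the evidence for every class and then normalizes it with exponentials:

$$\text{softmax}(x)_i = \frac{\exp(x_i)}{\sum_j \exp(x_j)}$$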

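In this formula x stands for the evidence values obtained above, not the image. Put briefly:

$$y = \text{softmax}(\text{evidence})$$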
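
To make the exponential normalization concrete, here is a minimal sketch in plain Python (no TensorFlow needed). The function name `softmax` and the three evidence values are illustrative assumptions, not taken from the tutorial. Subtracting the maximum before calling `exp` is a common numerical-stability trick and leaves the result unchanged.

```python
import math

def softmax(evidence):
    """Exponentiate each evidence value, then normalize so the results sum to 1."""
    # Subtracting the maximum does not change the result but avoids overflow in exp().
    m = max(evidence)
    exps = [math.exp(e - m) for e in evidence]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical evidence values for three classes.
probs = softmax([1.0, 2.0, 3.0])
print(probs)
```

The outputs are all positive and sum to 1, and the class with the largest evidence gets the largest share.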

Where softmax sits in this classifier, and how it is represented (figure not reproduced here):

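Written in matrix form, for a small example with three inputs and three classes:

$$\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \text{softmax}\left( \begin{bmatrix} W_{1,1} & W_{1,2} & W_{1,3} \\ W_{2,1} & W_{2,2} & W_{2,3} \\ W_{3,1} & W_{3,2} & W_{3,3} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} \right)$$

More abstractly:

$$y = \text{softmax}(Wx + b)$$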

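In practice, for each image this formula outputs a vector whose entries give the "strength/probability" that the image is the corresponding digit.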
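
As a rough illustration of that output vector, the sketch below evaluates softmax(Wx + b) for a single flattened image in plain Python. The helper names `softmax` and `predict` and the gray test image are assumptions made for this example. Here `W[i][j]` is the weight of pixel j for class i; the TensorFlow code later in this post stores the transpose (shape [784, 10]) and computes `tf.matmul(x, W)` instead.

```python
import math

NUM_PIXELS, NUM_CLASSES = 784, 10

def softmax(evidence):
    m = max(evidence)
    exps = [math.exp(e - m) for e in evidence]
    total = sum(exps)
    return [e / total for e in exps]

def predict(x, W, b):
    """Return softmax(W x + b) for one flattened image x."""
    evidence = [sum(W[i][j] * x[j] for j in range(len(x))) + b[i]
                for i in range(len(b))]
    return softmax(evidence)

# Hypothetical input: a uniformly gray image; parameters start at zero,
# just like tf.zeros(...) in the TensorFlow listing.
x = [0.5] * NUM_PIXELS
W = [[0.0] * NUM_PIXELS for _ in range(NUM_CLASSES)]
b = [0.0] * NUM_CLASSES

y = predict(x, W, b)
print(len(y), y[0])
```

With all-zero parameters every class has the same evidence, so the untrained model assigns probability 0.1 to each of the ten digits.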

2. How to train

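We use the common cross-entropy as the cost function (a cost function is usually nonlinear and there are many to choose from; cross-entropy is widely used and has some nice properties):

$$H_{y'}(y) = -\sum_i y'_i \log(y_i)$$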

y is the value the network computes and y' is the true value. Training minimizes this quantity.
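
A small worked example may help show why minimizing the cross-entropy pushes the prediction toward the true label. This is a sketch in plain Python; the function name `cross_entropy` and the two prediction vectors are made up for illustration.

```python
import math

def cross_entropy(y_true, y_pred):
    """Cross-entropy -sum_i y'_i * log(y_i) for a single sample."""
    # Terms with y'_i == 0 contribute nothing, so they are skipped.
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)

# Hypothetical 3-class example: the true class is index 1.
y_true = [0.0, 1.0, 0.0]
confident = [0.1, 0.8, 0.1]   # most probability on the correct class
unsure = [0.4, 0.3, 0.3]      # probability spread over all classes

loss_confident = cross_entropy(y_true, confident)
loss_unsure = cross_entropy(y_true, unsure)
print(loss_confident, loss_unsure)
```

With a one-hot label the loss reduces to the negative log of the probability given to the correct class, so the confident prediction has the smaller loss.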

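The classifier uses the MNIST dataset: the data part is a [60000, 784] array and the label part is a [60000, 10] array, where 784 is the 28*28 pixels and 10 is the one-hot label (exactly one value is 1 and the others are 0). Note that `input_data.read_data_sets` holds out 5,000 of the 60,000 training images for validation by default, so `mnist.train` itself contains 55,000.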
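
The two array layouts can be sketched in plain Python as follows. The helper names `one_hot` and `flatten` are hypothetical and only mimic the shapes; the real loader returns NumPy arrays.

```python
def one_hot(label, num_classes=10):
    """Return a vector with 1.0 at position `label` and 0.0 everywhere else."""
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec

def flatten(image):
    """Flatten a 28x28 nested list into a 784-element list, row by row."""
    return [pixel for row in image for pixel in row]

label_vec = one_hot(3)
blank_image = [[0.0] * 28 for _ in range(28)]  # hypothetical all-black image
flat = flatten(blank_image)
print(label_vec, len(flat))
```

Each row of the data array is one flattened image, and each row of the label array is one such one-hot vector.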

Code:

#download the MNIST data into the folder "MNIST_data", located next to this *.py file
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

import tensorflow as tf

#placeholder for the input images
x = tf.placeholder(tf.float32, [None, 784])

#weights and bias
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

#softmax layer
y = tf.nn.softmax(tf.matmul(x, W) + b)

#placeholder for the true labels used in training
y_ = tf.placeholder(tf.float32, [None, 10])

#cross-entropy: -tf.reduce_sum(y_ * tf.log(y), ...) is the loss of one sample; the outer tf.reduce_mean averages over the batch
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))

#define the training method. Note: GradientDescentOptimizer (learning rate 0.5) works well with the cross-entropy loss above
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

#initialize the variables
init = tf.initialize_all_variables()

sess = tf.Session()
sess.run(init)

#training
for i in range(5000):
  batch_xs, batch_ys = mnist.train.next_batch(100)
  #print(batch_xs.shape)
  sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

#evaluation: tf.argmax(y, 1) returns, for each row of y, the index of its largest value (axis 1)
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run([accuracy,tf.shape(y)], feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

Running time: 5.2 s; the final accuracy is around 92%.
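
What `train_step` does on each batch can be sketched without TensorFlow. The code below is only an illustration under stated assumptions: the helper names (`predict`, `mean_cross_entropy`, `gradient_step`) and the two-pixel toy data are hypothetical, and the gradient is written out by hand using the standard result that, for softmax combined with cross-entropy, the derivative of the loss with respect to the evidence of class i is y_i - y'_i. TensorFlow derives the same gradients automatically.

```python
import math

def softmax(evidence):
    m = max(evidence)
    exps = [math.exp(e - m) for e in evidence]
    total = sum(exps)
    return [e / total for e in exps]

def predict(x, W, b):
    return softmax([sum(w * v for w, v in zip(W[i], x)) + b[i]
                    for i in range(len(b))])

def mean_cross_entropy(W, b, xs, ts):
    losses = [-sum(t * math.log(p)
                   for t, p in zip(target, predict(x, W, b)) if t > 0)
              for x, target in zip(xs, ts)]
    return sum(losses) / len(losses)

def gradient_step(W, b, xs, ts, learning_rate):
    """Update W and b in place with one gradient descent step on the batch."""
    n = float(len(xs))
    grad_W = [[0.0] * len(row) for row in W]
    grad_b = [0.0] * len(b)
    for x, target in zip(xs, ts):
        y = predict(x, W, b)
        for i in range(len(b)):
            # d(loss)/d(evidence_i) = y_i - y'_i, averaged over the batch.
            delta = (y[i] - target[i]) / n
            grad_b[i] += delta
            for j in range(len(x)):
                grad_W[i][j] += delta * x[j]
    for i in range(len(b)):
        b[i] -= learning_rate * grad_b[i]
        for j in range(len(W[i])):
            W[i][j] -= learning_rate * grad_W[i][j]

# Hypothetical toy data: two "images" of two pixels each, two classes.
xs = [[1.0, 0.0], [0.0, 1.0]]
ts = [[1.0, 0.0], [0.0, 1.0]]
W = [[0.0, 0.0], [0.0, 0.0]]
b = [0.0, 0.0]

loss_before = mean_cross_entropy(W, b, xs, ts)
for _ in range(100):
    gradient_step(W, b, xs, ts, learning_rate=0.5)
loss_after = mean_cross_entropy(W, b, xs, ts)
print(loss_before, loss_after)
```

Starting from zero parameters the loss equals log(2) for two classes, and it shrinks as the steps are repeated, which mirrors the training loop in the listing on a much smaller scale.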

An alternative, interactive way to write the program:

#download the MNIST data into the folder "MNIST_data", located next to this *.py file
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

import tensorflow as tf

#the difference is here: with an InteractiveSession, ops can later be run via .run()/.eval() without an explicit sess.run(...)
sess = tf.InteractiveSession()

x = tf.placeholder(tf.float32, [None, 784])

W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

y_ = tf.placeholder(tf.float32, [None, 10])

sess.run(tf.initialize_all_variables())


y = tf.nn.softmax(tf.matmul(x, W) + b)
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))

train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
 

for i in range(1000):
  batch = mnist.train.next_batch(100)
  train_step.run(feed_dict={x: batch[0], y_: batch[1]})

correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
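
The accuracy computation used in both listings (`tf.argmax`, `tf.equal`, `tf.cast`, `tf.reduce_mean`) can be mirrored in plain Python. This is a sketch with hypothetical helper names and four made-up predictions, not the TensorFlow implementation.

```python
def argmax(values):
    """Index of the largest entry, like tf.argmax along one row."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(predictions, labels):
    # Compare the predicted class with the true class for each sample,
    # then average the matches (True counts as 1, False as 0).
    correct = [argmax(p) == argmax(t) for p, t in zip(predictions, labels)]
    return sum(correct) / float(len(correct))

# Hypothetical softmax outputs and one-hot labels for four samples.
predictions = [[0.7, 0.2, 0.1],
               [0.1, 0.3, 0.6],
               [0.5, 0.4, 0.1],
               [0.2, 0.2, 0.6]]
labels = [[1, 0, 0],
          [0, 0, 1],
          [0, 1, 0],
          [0, 0, 1]]

acc = accuracy(predictions, labels)
print(acc)  # 0.75
```

Three of the four predicted classes match the labels, so the accuracy is 0.75.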

References:

Official documentation: https://www.tensorflow.org/versions/r0.10/get_started/index.html

Chinese community: http://www.tensorfly.cn/