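Building the BasicModel Models
Preface
While writing my TensorFlow hands-on series, I found that many TensorFlow functions, and even the overall workflow, felt awkward to write. On reflection, my fundamentals were not solid enough. Jumping straight into neural network models sometimes means overlooking the mathematical logic behind the models themselves, and the resulting code ends up ugly and slow. This chapter is therefore a back-to-basics pass that explains the structure of each piece of code in detail.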
BasicModel
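This chapter covers three models: the linear model, logistic regression, and the nearest neighbor model.
Nearest Neighbor
In the nearest neighbor model, we use the MNIST dataset as training data and measure distance with the L1 (Manhattan) distance. The code is structured as follows:
- Load the samples: 5000 for training, 200 for testing
- Create placeholders; the second placeholder holds a single test sample (because we compute the distance between that test sample and the training samples)
- Build the L1 distance tensor and the prediction pred
- Set up the session
- Loop over the whole test set: for each test sample, find the training sample at the smallest distance, take its label as the prediction, and compare it with the true label
- If they match, add the sample's share to the accuracy
Code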
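# encoding: utf-8
import numpy as np
import tensorflow as tf
# Import the MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
# Take 5000 samples as the training set and 200 as the test set
Xtrain, ytrain = mnist.train.next_batch(5000)
Xtest, ytest = mnist.test.next_batch(200)  # draw from the test split, not the training split
# Create placeholders: X holds all training samples, Xte a single test sample
X = tf.placeholder(tf.float32, [None, 784])
Xte = tf.placeholder(tf.float32, [784])
# L1 (Manhattan) distance: sum of absolute differences.
# How tf.reduce_sum treats its axis argument:
# 'x' is [[1, 1, 1]
#         [1, 1, 1]]
# tf.reduce_sum(x) ==> 6
# tf.reduce_sum(x, 0) ==> [2, 2, 2]
# tf.reduce_sum(x, 1) ==> [3, 3]
# tf.reduce_sum(x, 1, keep_dims=True) ==> [[3], [3]]
# tf.reduce_sum(x, [0, 1]) ==> 6
distance = tf.reduce_sum(tf.abs(tf.add(X, tf.negative(Xte))), axis=1)
# Index of the training sample with the smallest distance
# (tf.argmin replaces the deprecated tf.arg_min)
pred = tf.argmin(distance, 0)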
accuracy = 0.
init = tf.global_variables_initializer()
# Launch the graph in a session
with tf.Session() as sess:
    sess.run(init)
    for i in range(len(Xtest)):
        # Compute the L1 distance from this test sample to every training
        # sample and take the index of the nearest one
        nn_index = sess.run(pred, feed_dict={X: Xtrain, Xte: Xtest[i, :]})
        preclass = np.argmax(ytrain[nn_index])
        trueclass = np.argmax(ytest[i])
        print("Test %d Prediction %d True Class %d" % (i, preclass, trueclass))
        # Accumulate the accuracy
        if preclass == trueclass:
            accuracy += 1. / len(Xtest)
    print("Done!")
    print("Accuracy: %s" % accuracy)
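The nearest-neighbor rule used above is small enough to check by hand. The following is a minimal sketch in plain Python (no TensorFlow), run on made-up 2-D points rather than MNIST. The names `l1_distance` and `nearest_neighbor_label` and the toy data are illustrative assumptions, not part of the original code.

```python
def l1_distance(a, b):
    """Manhattan (L1) distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_neighbor_label(train_points, train_labels, query):
    """Return the label of the training point closest to `query` in L1 distance."""
    distances = [l1_distance(p, query) for p in train_points]
    # Plays the same role as tf.argmin(distance, 0) in the TensorFlow version
    nn_index = distances.index(min(distances))
    return train_labels[nn_index]


# Toy data (hypothetical): two clusters labelled 0 and 1
train_points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]
train_labels = [0, 0, 1, 1]
test_points = [(0.5, 0.2), (5.5, 4.0)]
test_labels = [0, 1]

# Count how many test points receive the correct label
correct = 0
for query, true_label in zip(test_points, test_labels):
    if nearest_neighbor_label(train_points, train_labels, query) == true_label:
        correct += 1
accuracy = correct / float(len(test_points))
print("Accuracy: %s" % accuracy)
```

The TensorFlow version does the same thing, except that the distance to all 5000 training images is computed in one vectorized `tf.reduce_sum` over `axis=1`.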
Linear Regression
The linear regression model is simple:
$$y = w x + b$$
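Here we use the MSE (mean squared error) loss. I will write a separate, detailed post on choosing a loss function when I have time.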
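To spell out what gradient descent minimizes here: the cost in the code divides by `2*n_samples`, so it is half the mean squared error, and the factor of 1/2 only cancels the 2 that appears on differentiation. A short derivation under that reading, with $n$ samples and learning rate $\alpha$ (the symbols $J$, $n$ and $\alpha$ are notation introduced for this sketch):

```latex
\begin{aligned}
J(w, b) &= \frac{1}{2n} \sum_{i=1}^{n} (w x_i + b - y_i)^2 \\
\frac{\partial J}{\partial w} &= \frac{1}{n} \sum_{i=1}^{n} (w x_i + b - y_i)\, x_i \\
\frac{\partial J}{\partial b} &= \frac{1}{n} \sum_{i=1}^{n} (w x_i + b - y_i) \\
w &\leftarrow w - \alpha \frac{\partial J}{\partial w}, \qquad
b \leftarrow b - \alpha \frac{\partial J}{\partial b}
\end{aligned}
```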
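As before, here is the code structure.
- The training data is 17 points (x, y)
- Create placeholders
- Initialize W and b randomly
- Optimize with gradient descent
- During training, print the training cost every 50 epochs; it should keep decreasing
- When training ends, print the final training cost, W and b
- Plot the fitted function with matplotlib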
# encoding: utf-8
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
# NumPy's random module
rng = np.random
learning_rate = 0.01
train_epochs = 1000
display_step = 50
# Training data
train_X = np.asarray([3.3, 4.4, 5.5, 6.71, 6.93, 4.168, 9.779, 6.182, 7.59, 2.167,
                      7.042, 10.791, 5.313, 7.997, 5.654, 9.27, 3.1])
train_Y = np.asarray([1.7, 2.76, 2.09, 3.19, 1.694, 1.573, 3.366, 2.596, 2.53, 1.221,
                      2.827, 3.465, 1.65, 2.904, 2.42, 2.94, 1.3])
n_samples = train_X.shape[0]
X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
# Initialize the weight and bias; randn() draws from a standard normal distribution
W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")
pred = tf.add(tf.multiply(X, W), b)
# Mean squared error (halved: note the 2 in the denominator)
cost = tf.reduce_sum(tf.pow(pred - Y, 2)) / (2 * n_samples)
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(train_epochs):
        # One gradient step per (x, y) pair
        for (x, y) in zip(train_X, train_Y):
            sess.run(optimizer, feed_dict={X: x, Y: y})
        # Every display_step (50) epochs, print the training cost and the current W and b
        if (epoch + 1) % display_step == 0:
            c = sess.run(cost, feed_dict={X: train_X, Y: train_Y})
            print("Epoch: %04d cost= %.9f W= %s b= %s"
                  % (epoch + 1, c, sess.run(W), sess.run(b)))
    print("Optimisation Finished!")
    train_cost = sess.run(cost, feed_dict={X: train_X, Y: train_Y})
    print("train cost= %s W= %s b= %s" % (train_cost, sess.run(W), sess.run(b)))
    # Graphic display (inside the session, because W and b are read with sess.run)
    plt.plot(train_X, train_Y, 'ro', label='Original data')
    plt.plot(train_X, sess.run(W) * train_X + sess.run(b), label='Fitted line')
    plt.legend()
    plt.show()
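The training loop above can be reproduced without TensorFlow to see what each `sess.run(optimizer, ...)` step does. This is a minimal sketch in plain Python, assuming the per-sample update implied by feeding one (x, y) pair at a time into a cost that still divides by `2*n_samples`. It starts from `w = b = 0` instead of a random draw so the run is repeatable, and the helper name `half_mse` is illustrative.

```python
# The same 17 training points as in the TensorFlow example
train_X = [3.3, 4.4, 5.5, 6.71, 6.93, 4.168, 9.779, 6.182, 7.59, 2.167,
           7.042, 10.791, 5.313, 7.997, 5.654, 9.27, 3.1]
train_Y = [1.7, 2.76, 2.09, 3.19, 1.694, 1.573, 3.366, 2.596, 2.53, 1.221,
           2.827, 3.465, 1.65, 2.904, 2.42, 2.94, 1.3]
n_samples = len(train_X)
learning_rate = 0.01
train_epochs = 1000


def half_mse(w, b):
    """Cost over the full training set: sum of squared errors / (2 * n_samples)."""
    total = sum((w * x + b - y) ** 2 for x, y in zip(train_X, train_Y))
    return total / (2.0 * n_samples)


# Assumption: deterministic start instead of rng.randn(), for repeatability
w, b = 0.0, 0.0
initial_cost = half_mse(w, b)

for epoch in range(train_epochs):
    for x, y in zip(train_X, train_Y):
        error = w * x + b - y
        # Gradient of (w*x + b - y)^2 / (2 * n_samples) for this single sample
        grad_w = error * x / n_samples
        grad_b = error / n_samples
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b

final_cost = half_mse(w, b)
print("initial cost= %.6f final cost= %.6f W= %.4f b= %.4f"
      % (initial_cost, final_cost, w, b))
```

Because each step sees a single sample but is scaled by `1 / n_samples`, one epoch of 17 small steps moves the parameters roughly as far as one full-batch step with the same learning rate.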
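Logistic Regression
In logistic regression we use the MNIST dataset. Each image has 784 pixels, and we regress the label from those pixels, so W has shape [784, 10] and b has shape [10].
$$y = \mathrm{softmax}(xW + b)$$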
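Cross-entropy, where $y^t$ is the true (one-hot) label and $y$ is the predicted distribution:
$$H(y, y^t) = H_{y^t}(y) = -\sum_i y^t_i \log(y_i)$$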
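A worked example may help connect the two formulas. The sketch below evaluates softmax and cross-entropy for a single sample in plain Python. The three-class logits and the one-hot target are made-up values, and the function names are illustrative.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution that sums to 1."""
    exps = [math.exp(z) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(pred, target):
    """H = -sum_i target_i * log(pred_i); terms with target_i == 0 contribute nothing."""
    return -sum(t * math.log(p) for p, t in zip(pred, target) if t > 0)


# Hypothetical scores for 3 classes; the true class is index 0
logits = [2.0, 1.0, 0.1]
target = [1.0, 0.0, 0.0]

pred = softmax(logits)
loss = cross_entropy(pred, target)
print("pred= %s loss= %.6f" % (pred, loss))
```

With a one-hot target the sum collapses to a single term, so the loss is just the negative log of the probability assigned to the true class.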
Code
# encoding: utf-8
import tensorflow as tf
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
learning_rate = 0.01
train_epochs = 100
batch_size = 100
display_step = 1
X = tf.placeholder(tf.float32, [None, 784])  # flattened images, 784 pixels each
y = tf.placeholder(tf.float32, [None, 10])  # one-hot labels for digits 0-9
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
pred = tf.nn.softmax(tf.matmul(X, W) + b)
# Cross-entropy per sample, averaged over the batch
cost = tf.reduce_mean(-tf.reduce_sum(y * tf.log(pred), axis=1))
# Alternative: softmax_cross_entropy_with_logits expects the raw logits
# (tf.matmul(X, W) + b), not the softmax output, and returns per-sample losses:
# cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
#     labels=y, logits=tf.matmul(X, W) + b))
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    # Each epoch iterates over all training batches
    for epoch in range(train_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples / batch_size)
        # One pass over the full dataset, one batch at a time
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            _, c = sess.run([optimizer, cost],
                            feed_dict={X: batch_xs, y: batch_ys})
            # Accumulate the mean of the per-batch costs
            avg_cost += c / total_batch
        # Print the average cost for this epoch; it should generally decrease
        if (epoch + 1) % display_step == 0:
            print("Epoch: %04d cost= %.9f" % (epoch + 1, avg_cost))
    print("Optimization Finished!")
    # Evaluate accuracy on the first 3000 test images
    # (eval needs the default session, so this stays inside the with block)
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    print("Accuracy: %s" % accuracy.eval({X: mnist.test.images[:3000], y: mnist.test.labels[:3000]}))
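One caveat about computing the loss as `tf.log(pred)`: if a softmax output underflows to 0, its log is negative infinity. A common remedy is to compute the loss directly from the logits with the log-sum-exp trick, which is the reason fused ops that take raw logits exist. The sketch below shows the idea in plain Python. The extreme logits are made up to force the problem, and the function names are illustrative.

```python
import math


def log_softmax(logits):
    """log(softmax(z)) computed stably: subtract the max before exponentiating."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum_exp for z in logits]


def cross_entropy_from_logits(logits, target):
    """Cross-entropy taken straight from raw logits, never forming softmax explicitly."""
    return -sum(t * lp for lp, t in zip(log_softmax(logits), target))


# Hypothetical extreme scores: a naive math.exp(1000.0) would raise OverflowError
logits = [1000.0, 0.0, -1000.0]
target = [0.0, 1.0, 0.0]

loss = cross_entropy_from_logits(logits, target)
print("loss= %s" % loss)
```

The loss stays finite here because every exponent passed to `math.exp` is at most 0 after the maximum is subtracted.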