Fitting a linear model to data is one of the simpler tasks in machine learning, so how do we do it in TensorFlow? It turns out to be quite straightforward as well.
Here I first generate 1000 data points scattered around a line given a priori by the model y = wx + b + c, then hand the data to TensorFlow,
which minimizes a loss function by gradient descent to recover that line. The term c is Gaussian noise added to the points, so that they do not lie exactly on the line and the fit is non-trivial. Let's walk through the code.
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
# Randomly generate 1000 points scattered around the line y = 0.1x + 0.3
num_points = 1000
vectors_set = []
for i in range(num_points):
    # x1 is drawn from a Gaussian distribution
    x1 = np.random.normal(0.0, 0.55)
    y1 = x1 * 0.1 + 0.3 + np.random.normal(0.0, 0.03)
    vectors_set.append([x1, y1])
# Split the samples into x and y lists
x_data = [v[0] for v in vectors_set]
y_data = [v[1] for v in vectors_set]
plt.scatter(x_data,y_data,c='r')
plt.show()
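As an aside (this variant is my addition, not part of the original walkthrough), the generation loop above can be collapsed into vectorized NumPy calls, which is noticeably faster for large num_points:

```python
import numpy as np

num_points = 1000
# Draw all 1000 x values from the Gaussian at once,
# then add per-point Gaussian noise to y = 0.1 * x + 0.3
x_data = np.random.normal(0.0, 0.55, num_points)
y_data = x_data * 0.1 + 0.3 + np.random.normal(0.0, 0.03, num_points)
```

The resulting arrays can be fed to TensorFlow exactly like the lists built by the loop.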
# Create a 1-D W variable, initialized uniformly at random in [-1, 1]
W = tf.Variable(tf.random_uniform([1], -1.0, 1.0), name='W')
# Create a 1-D b variable, initialized to 0
b = tf.Variable(tf.zeros([1]), name='b')
# Compute the predicted value y
y = W * x_data + b
# Use the mean squared error between the prediction y and the actual y_data as the loss.
# TensorFlow offers other loss functions too, but minimizing MSE with gradient descent
# is convenient and reliable, and it's what I recommend here.
loss = tf.reduce_mean(tf.square(y - y_data), name='loss')
# Optimize the parameters with gradient descent (learning rate 0.5)
optimizer = tf.train.GradientDescentOptimizer(0.5)
# Training simply means minimizing this loss
train = optimizer.minimize(loss, name='train')
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
# Print the initial W and b
print("W =", sess.run(W), "b =", sess.run(b), "loss =", sess.run(loss))
# Run 20 training steps
for step in range(20):
    sess.run(train)
    # Print W and b after each step
    print("W =", sess.run(W), "b =", sess.run(b), "loss =", sess.run(loss))
# Write the graph for TensorBoard (tf.train.SummaryWriter was renamed to tf.summary.FileWriter)
writer = tf.summary.FileWriter("./tmp", sess.graph)
The printed output looks like this:
W = [ 0.96539688] b = [ 0.] loss = 0.297884
W = [ 0.71998411] b = [ 0.28193575] loss = 0.112606
W = [ 0.54009342] b = [ 0.28695393] loss = 0.0572231
W = [ 0.41235447] b = [ 0.29063231] loss = 0.0292957
W = [ 0.32164571] b = [ 0.2932443] loss = 0.0152131
W = [ 0.25723246] b = [ 0.29509908] loss = 0.00811188
W = [ 0.21149193] b = [ 0.29641619] loss = 0.00453103
W = [ 0.17901111] b = [ 0.29735151] loss = 0.00272536
W = [ 0.15594614] b = [ 0.29801565] loss = 0.00181483
W = [ 0.13956745] b = [ 0.29848731] loss = 0.0013557
W = [ 0.12793678] b = [ 0.29882219] loss = 0.00112418
W = [ 0.11967772] b = [ 0.29906002] loss = 0.00100743
W = [ 0.11381286] b = [ 0.29922891] loss = 0.000948558
W = [ 0.10964818] b = [ 0.29934883] loss = 0.000918872
W = [ 0.10669079] b = [ 0.29943398] loss = 0.000903903
W = [ 0.10459071] b = [ 0.29949448] loss = 0.000896354
W = [ 0.10309943] b = [ 0.29953739] loss = 0.000892548
W = [ 0.10204045] b = [ 0.29956791] loss = 0.000890629
W = [ 0.10128847] b = [ 0.29958954] loss = 0.000889661
W = [ 0.10075447] b = [ 0.29960492] loss = 0.000889173
W = [ 0.10037527] b = [ 0.29961586] loss = 0.000888927
From the results we can see the loss steadily decreasing while W and b converge toward 0.1 and 0.3, fitting the line we assumed a priori very well.
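As a quick sanity check (my addition, independent of TensorFlow), the same line can be recovered in closed form with NumPy's least-squares polynomial fit:

```python
import numpy as np

# Regenerate sample data around y = 0.1x + 0.3, same scheme as above
x = np.random.normal(0.0, 0.55, 1000)
y = x * 0.1 + 0.3 + np.random.normal(0.0, 0.03, 1000)

# A degree-1 fit returns [slope, intercept]
w_fit, b_fit = np.polyfit(x, y, 1)
print("w =", w_fit, "b =", b_fit)
```

Both values should land very close to 0.1 and 0.3, matching what gradient descent converged to above.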
plt.scatter(x_data,y_data,c='r')
plt.plot(x_data,sess.run(W)*x_data+sess.run(b))
plt.show()
And here is the fitted line, visualized over the scatter of sample points.
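Finally, to make explicit what GradientDescentOptimizer is doing for us under the hood, here is a sketch of the same update rule in plain NumPy (my reconstruction for illustration, not TensorFlow's actual implementation):

```python
import numpy as np

np.random.seed(0)
# Same data scheme as above: points around y = 0.1x + 0.3
x = np.random.normal(0.0, 0.55, 1000)
y = x * 0.1 + 0.3 + np.random.normal(0.0, 0.03, 1000)

W = np.random.uniform(-1.0, 1.0)  # random init, like tf.random_uniform
b = 0.0                           # zero init, like tf.zeros
lr = 0.5                          # same learning rate as the TF example

for step in range(20):
    err = W * x + b - y
    # Gradients of the mean squared error with respect to W and b
    grad_W = 2.0 * np.mean(err * x)
    grad_b = 2.0 * np.mean(err)
    # One gradient descent step, exactly what optimizer.minimize(loss) applies
    W -= lr * grad_W
    b -= lr * grad_b

print("W =", W, "b =", b)  # converges toward 0.1 and 0.3
```

Twenty steps of this hand-rolled loop produce essentially the same W and b as the TensorFlow session, which is a good way to convince yourself the graph is doing nothing magical.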