Multiple linear regression (the data is the classic Boston house-price prediction set):
# coding=utf-8
# Written against the TensorFlow 1.x API (tf.placeholder, tf.Session).
# load_boston was removed in scikit-learn 1.2, so an older version is required.
import numpy as np
import tensorflow as tf
from sklearn.datasets import load_boston

# NumPy data
X, y = load_boston(return_X_y=True)
X = X.astype(np.float32)
y = y.astype(np.float32)
num_features = X.shape[1]
num_samples = X.shape[0]

# Hyper-parameters
num_epochs = 4000
lr = 0.000001

# Build graph
X_t = tf.placeholder(dtype=tf.float32, shape=[num_samples, num_features], name='X')
y_t = tf.placeholder(dtype=tf.float32, shape=[num_samples], name='y')
with tf.variable_scope('linear-model'):
    W = tf.get_variable(name='weights', shape=[1, num_features])
    b = tf.get_variable(name='bias', shape=[1])
# W is [1, num_features], so W * X^T gives one prediction per sample: [1, num_samples]
hypothesis = tf.matmul(W, X_t, transpose_b=True) + b
cost = tf.reduce_mean(tf.square(hypothesis - y_t), name='cost')
train_op = tf.train.GradientDescentOptimizer(lr).minimize(cost)

# Run graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(num_epochs):
        _, cost_val = sess.run([train_op, cost], feed_dict={X_t: X, y_t: y})
        if epoch % 100 == 0:
            print("Epoch:{} - Loss:{} - W:{} - b:{}".format(epoch, cost_val, sess.run(W), sess.run(b)))
Out:
...
Epoch:3800 - Loss:63.76749801635742 - W:[[-0.03220429 0.09916656 -0.50074565 0.09535795 -0.12186021 -0.17773175
0.17914923 0.62152874 0.24505684 0.00715045 -0.12978904 0.05122737
-0.58430535]] - b:[-0.73187399]
Epoch:3900 - Loss:63.66790771484375 - W:[[-0.03347948 0.09845625 -0.49918973 0.09542575 -0.12180284 -0.17654711
0.17848469 0.62154067 0.24411921 0.0072316 -0.12848529 0.05121902
-0.58528072]] - b:[-0.73174882]
The usual workflow with TensorFlow: prepare the data -> build the computation graph -> feed in the data -> get the results
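To make it clearer what the graph above computes, here is a minimal NumPy sketch of the same idea: batch gradient descent on the mean squared error of a linear model. It is not the TensorFlow implementation. The function name `fit_linear`, the synthetic data and the hyper-parameters are all assumptions chosen for illustration, and the Boston data is not used.

```python
import numpy as np


def fit_linear(X, y, lr=0.1, num_epochs=500):
    """Batch gradient descent on the mean squared error of X @ w + b."""
    num_samples, num_features = X.shape
    w = np.zeros(num_features)
    b = 0.0
    losses = []
    for _ in range(num_epochs):
        residual = X @ w + b - y                     # shape: (num_samples,)
        losses.append(float(np.mean(residual ** 2)))
        grad_w = 2.0 * (X.T @ residual) / num_samples
        grad_b = 2.0 * residual.mean()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, losses


# Synthetic, noise-free data (an assumption, not the Boston dataset).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
true_w = np.array([1.5, -2.0, 0.5])
y_demo = X_demo @ true_w + 4.0

w_fit, b_fit, losses = fit_linear(X_demo, y_demo)
print("first loss:", losses[0], "last loss:", losses[-1])
```

The synthetic features are already on a unit scale, which is why a learning rate of 0.1 works here. The Boston features are not rescaled in the original code, which is probably why it needs a learning rate as small as 0.000001.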
Visualizing the computation graph
Only one extra line of code is needed:
with tf.Session() as sess:
    writer = tf.summary.FileWriter(r'D:\TB_DIR', sess.graph)  # this is the added line
    sess.run(tf.global_variables_initializer())
    for epoch in range(num_epochs):
        ...
Run the code again and a file whose name starts with events.out.tfevents. will appear in the D:\TB_DIR directory.
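If you would rather confirm this from Python than from a file browser, a small helper along these lines can list the event files. It is only a sketch: `find_event_files` is a hypothetical helper, not part of TensorFlow, and the demo file name is made up to match the `events.out.tfevents.*` pattern.

```python
import glob
import os
import tempfile


def find_event_files(log_dir):
    """Return sorted paths of files in log_dir named events.out.tfevents.*"""
    pattern = os.path.join(log_dir, 'events.out.tfevents.*')
    return sorted(glob.glob(pattern))


# Demo in a temporary directory, with a made-up event file name.
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, 'events.out.tfevents.0000000000.demo-host'), 'w').close()
    open(os.path.join(demo_dir, 'notes.txt'), 'w').close()
    found = [os.path.basename(p) for p in find_event_files(demo_dir)]

print(found)
```

For the real log directory, the call would be `find_event_files(r'D:\TB_DIR')`. An empty list means the `FileWriter` has not written anything there yet.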
Then cd to the D:\TB_DIR directory on the command line and enter:
tensorboard.exe --logdir='D:/TB_DIR'
Then open http://localhost:6006 in Chrome and you will see:
Visualizing variables
Scalars
Values such as the loss and the accuracy are single numbers, and they are collected with the tf.summary.scalar() function.
...
tf.summary.scalar("loss", cost)  # 1: this line
merged_summary = tf.summary.merge_all()  # 2: this line

# Run graph
with tf.Session() as sess:
    writer = tf.summary.FileWriter(r'D:\TB_DIR', sess.graph)
    sess.run(tf.global_variables_initializer())
    for epoch in range(num_epochs):
        _, cost_val, summary = sess.run([train_op, cost, merged_summary], feed_dict={X_t: X, y_t: y})  # 3: this line
        if epoch % 100 == 0:
            print("Epoch:{} - Loss:{} - W:{} - b:{}".format(epoch, cost_val, sess.run(W), sess.run(b)))
        writer.add_summary(summary, epoch)  # 4: this line
The chart shows how the loss changes with the epoch.
Tensors
Values such as network weights are usually tensors, and they are collected with the tf.summary.histogram() function.
...
tf.summary.histogram('weights', W)
tf.summary.scalar("loss", cost)
...
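TensorBoard shows tensors in two views: Histograms and Distributions.
In the histogram view, the horizontal axis (X) is the tensor value; the depth axis (Y) is the epoch, with the most recent epoch closest to the viewer; the vertical axis (Z) is the density, that is, how many elements of the tensor take the corresponding value.
The distribution view can be thought of as a top-down view of the histogram above.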
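To get a feel for what the histogram view summarizes, the sketch below bins a few weight snapshots with NumPy, one row of counts per step. It only illustrates the idea: TensorBoard uses its own bucketing and rendering, and `histogram_per_step` and the random snapshots are assumptions, not values from the model above.

```python
import numpy as np


def histogram_per_step(weight_snapshots, num_bins=10):
    """Bin each snapshot with shared bin edges; returns (edges, counts).

    counts[i, j] is the number of values of snapshot i that fall in bin j.
    """
    all_values = np.concatenate([w.ravel() for w in weight_snapshots])
    edges = np.linspace(all_values.min(), all_values.max(), num_bins + 1)
    counts = np.stack(
        [np.histogram(w.ravel(), bins=edges)[0] for w in weight_snapshots]
    )
    return edges, counts


# Made-up snapshots of a [1, 13] weight matrix at three steps.
rng = np.random.default_rng(0)
snapshots = [rng.normal(scale=s, size=(1, 13)) for s in (1.0, 0.5, 0.25)]

edges, counts = histogram_per_step(snapshots)
for step, row in enumerate(counts):
    print("step", step, row)
```

Each printed row corresponds to one slice along the epoch axis of the histogram view: the bin positions play the role of the X axis and the counts the role of the Z axis.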
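Some pitfalls on Windows:
1. When running the tensorboard command, be sure to cd into the directory set in the code, and also pass an absolute path to the --logdir argument.
2. Use Google Chrome.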
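A related thing to watch for when writing Windows paths in Python code is backslash escapes inside ordinary string literals. The sketch below uses a hypothetical lowercase path `D:\tb_dir` (not the directory used above) to show how `\t` silently becomes a tab, and two ways around it.

```python
# Hypothetical paths, used only to illustrate the escaping behaviour.
plain = 'D:\tb_dir'     # '\t' is parsed as a TAB character
raw = r'D:\tb_dir'      # a raw string keeps the backslash as written
forward = 'D:/tb_dir'   # forward slashes avoid the problem altogether


def has_tab(path):
    """Return True if the path accidentally contains a tab character."""
    return '\t' in path


for name, path in [('plain', plain), ('raw', raw), ('forward', forward)]:
    print(name, repr(path), 'contains tab:', has_tab(path))
```

`'D:\TB_DIR'` happens to survive because `\T` is not a recognized escape sequence, but a raw string or forward slashes is the safer habit.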