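TensorFlow Basics
1. What Is TensorFlow
TensorFlow is a symbolic programming framework. It was developed by Google Brain and open-sourced in 2015, and it is one of the most widely used deep learning frameworks in industry. It runs on a wide range of targets, including servers, mobile devices, and embedded devices.
A TensorFlow program usually consists of two parts:
- Building the computation graph
- Executing the computation graph
Here is one of the simplest possible TensorFlow programs (all examples in this tutorial use the TensorFlow 1.x graph-and-session API):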
import warnings
warnings.filterwarnings('ignore')
# Import the tensorflow package
import tensorflow as tf
# Build the computation graph
g1 = tf.get_default_graph()
w = tf.constant(2.)
y = w + 2
# Open a session and execute the graph
with tf.Session(graph=g1) as sess:
    print(sess.run([y]))
# Clear the default graph
tf.reset_default_graph()
[4.0]
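The build-then-run split can be imitated without TensorFlow. The sketch below is only a toy model under my own assumptions (the Node class and the constant, add, and run helpers are hypothetical names, not TensorFlow APIs); it shows that building the graph stores a description of the work, and that a value appears only when the graph is run.

```python
# A toy stand-in for the build-then-run pattern; not how TensorFlow is implemented.
class Node:
    """A graph node that stores an operation and its inputs, but no result."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op          # "const" or "add"
        self.inputs = inputs  # upstream nodes
        self.value = value    # only used by "const" nodes


def constant(value):
    return Node("const", value=value)


def add(a, b):
    return Node("add", inputs=(a, b))


def run(node):
    """Evaluate a node by recursively evaluating its inputs first."""
    if node.op == "const":
        return node.value
    if node.op == "add":
        return sum(run(i) for i in node.inputs)
    raise ValueError("unknown op: " + node.op)


# Phase 1: build the graph. Nothing is computed here.
w = constant(2.0)
y = add(w, constant(2.0))

# Phase 2: execute the graph.
result = run(y)
print(result)  # 4.0
```

Note that y holds no value after phase 1; the number 4.0 exists only after run(y) is called.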
2. Basic Concepts in TensorFlow
The following basic concepts must be understood clearly:
- Graph
- Session
- Operation (op)
- Tensor
- Variable
- Placeholder
- Computation path
- tf.assign
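2.1 What Is a Graph
A graph consists of nodes and edges. Note that this notion of a graph differs from the textbook computation graph. In TensorFlow, nodes represent operations and edges represent the tensors that flow between them; both concepts are explained in later sections. (In the computation graph taught in class, edges represent operations and nodes represent variables.)
Note: a computation graph contains only operations, not results; no actual computation happens while it is being built.
When you start using TensorFlow, it automatically allocates a default graph for you. Every graph-building operation you perform is added to this default graph.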
# Work on the default graph
g0 = tf.get_default_graph()
# This is one component of the graph
x0 = tf.Variable(1)
# Check whether this component belongs to the graph
x0.graph is g0
WARNING:tensorflow:From D:\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
True
# Build different operations on different graphs
g1 = tf.Graph()
g2 = tf.Graph()
# Build on graph g1
with g1.as_default():
    x1 = tf.Variable(1)
# Build on graph g2
with g2.as_default():
    x2 = tf.Variable(1)
print(x1.graph is g2)
print(x2.graph is g2)
False
True
Sometimes you need to see which operation nodes a graph contains:
# List the operation nodes of a graph
g1.get_operations()
[<tf.Operation 'Variable/initial_value' type=Const>,
<tf.Operation 'Variable' type=VariableV2>,
<tf.Operation 'Variable/Assign' type=Assign>,
<tf.Operation 'Variable/read' type=Identity>]
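Building the same operations repeatedly (for example, by re-running a notebook cell) keeps adding nodes to the default graph and can cause errors, so remember to clear the default graph before rebuilding it.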
g_now = tf.get_default_graph()
g_now.get_operations()
[<tf.Operation 'Variable/initial_value' type=Const>,
<tf.Operation 'Variable' type=VariableV2>,
<tf.Operation 'Variable/Assign' type=Assign>,
<tf.Operation 'Variable/read' type=Identity>]
tf.reset_default_graph()
g_now = tf.get_default_graph()
g_now.get_operations()
[]
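Further note
TensorFlow 1.x is a static-graph deep learning framework, whereas PyTorch is a dynamic-graph framework. TensorFlow 2.0 introduced a dynamic-graph mechanism (eager execution). Frameworks are moving toward supporting both static and dynamic graphs, with the ability to switch between the two.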
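2.2 What Is a Session
A session handles memory allocation and optimization so that the computations specified by the graph can actually be executed.
The computation graph is a template for the computations to perform; the session executes it by allocating computing resources.
Building a graph consumes almost no resources, but a session can consume a lot.
Here is a simple example: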
# Build the graph (on the default graph)
w = tf.constant(3)
x = w + 2
y = x + 5
z = x * 3
# Run the session
with tf.Session(graph=tf.get_default_graph()) as sess:
    print(sess.run([x]))
    print(sess.run([z]))
    print(x.eval())
[5]
[15]
5
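Notes:
- While a session is the default session (for example inside a with tf.Session() block), t.eval() is equivalent to sess.run(t).
- TensorFlow detects dependencies automatically.
- Apart from variables, computed values are released after each run. Variable values are kept until the session is closed.
- Separate sess.run calls do not share intermediate results, so fetching x and z in two calls computes x twice.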
# Avoid the repeated computation
# Build the graph
w = tf.constant(3)
x = w + 2
y = x + 5
z = x * 3
# Run the session, fetching x and z in a single call
with tf.Session(graph=tf.get_default_graph()) as sess:
    print(sess.run([x, z]))
    # print(sess.run(x))
    # print(sess.run(z))
[5, 15]
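The repeated-computation note can be made concrete with a small counter. This is a sketch under my own assumptions (run, compute_x, and compute_z are hypothetical helpers, not TensorFlow functions): intermediate results are cached only within a single run call, so two separate calls evaluate x twice while one combined call evaluates it once.

```python
# Toy model of per-call caching; names here are illustrative, not TensorFlow APIs.
eval_count = {"x": 0}


def compute_x():
    eval_count["x"] += 1
    return 3 + 2


def compute_z(x):
    return x * 3


def run(fetches):
    """Evaluate the requested names; results are cached only within this call."""
    cache = {}

    def get(name):
        if name not in cache:
            if name == "x":
                cache[name] = compute_x()
            elif name == "z":
                cache[name] = compute_z(get("x"))
            else:
                raise KeyError(name)
        return cache[name]

    return [get(name) for name in fetches]


# Two separate calls: x is computed twice.
separate = run(["x"]) + run(["z"])
count_separate = eval_count["x"]

# One combined call: x is computed once and reused for z.
eval_count["x"] = 0
combined = run(["x", "z"])
count_combined = eval_count["x"]

print(separate, count_separate)  # [5, 15] 2
print(combined, count_combined)  # [5, 15] 1
```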
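Using a context manager every time is cumbersome, so there is also an interactive session. Like IPython, it gives immediate feedback, which makes quick debugging easier.
# Note: an interactive session must be closed manually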
sess = tf.InteractiveSession()
print(x.eval())
print(sess.run(x))
5
5
sess.close()
# print(sess.run([x]))
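2.3 What Is a Tensor
The inputs and outputs of operations are tensors; the name TensorFlow literally means "tensors flowing". By increasing rank, tensors are known as:
- Scalar
- Vector
- Matrix
- Tensor
In TensorFlow, tensors mainly come from three sources:
- constant
- variable
- placeholder
This section covers only constant; the other two are discussed in detail in later sections.
A constant lives for the duration of the session. tf.constant creates a constant, that is, a value that cannot be changed.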
import numpy as np
a = tf.constant(np.arange(12).reshape(3,4),dtype=tf.float32)
with tf.Session(graph=tf.get_default_graph()) as sess:
    print(sess.run([a]))
tf.reset_default_graph()
[array([[ 0., 1., 2., 3.],
[ 4., 5., 6., 7.],
[ 8., 9., 10., 11.]], dtype=float32)]
a = tf.constant(0.0,shape=(3,2,6),dtype=tf.float32)
with tf.Session() as sess:
    print(sess.run([a]))
tf.reset_default_graph()
[array([[[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.]],
[[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.]],
[[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.]]], dtype=float32)]
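2.4 What Is a Variable
A neural network needs something that can play the role of its parameters: something that can be saved and updated. This is called a variable. A variable is also a kind of tensor. Its value persists in the session without being released, and it can be modified, until the session is closed.
Note: variables must be initialized. (Neural network parameters need initialization too!)
Variables are usually initialized with the code below, so that they do not have to be initialized one by one.
# This is a single op that initializes all variables.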
......
init = tf.global_variables_initializer()
......
sess.run(init)
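- init is built in the graph.
- sess.run(init) is executed in the session.
TensorFlow provides two main interfaces for creating variables: tf.Variable and tf.get_variable. They differ considerably, and tf.get_variable is recommended wherever possible.
Using tf.Variable
1. Every call creates a different variable. Even if the same name is passed, the underlying implementation gives each variable a distinct alias.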
var1 = tf.Variable(tf.random_uniform([1], -1.0, 1.0), name='var', dtype=tf.float32)
var2 = tf.Variable(initial_value=[2], name='var', dtype=tf.float32)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    print(var1.name, sess.run(var1))
    print(var2.name, sess.run(var2))
var:0 [0.85226274]
var_1:0 [2.]
2. It is affected by tf.name_scope: the name_scope prefix is prepended to the variable name.
tf.reset_default_graph()
with tf.name_scope('var_b_scope'):
    var1 = tf.Variable(name='var', initial_value=[2], dtype=tf.float32)
    var2 = tf.Variable(name='var', initial_value=[2], dtype=tf.float32)
with tf.name_scope('var_a_scope'):
    var3 = tf.Variable(name='var', initial_value=[2], dtype=tf.float32)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    print(var1.name, sess.run(var1))
    print(var2.name, sess.run(var2))
    print(var3.name, sess.run(var3))
var_b_scope/var:0 [2.]
var_b_scope/var_1:0 [2.]
var_a_scope/var:0 [2.]
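3. Variable() specifies how the variable is initialized at creation time; the initial value of another variable can also be used as the initial value.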
var2 = tf.Variable(var1.initialized_value())
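The renaming described in point 1 (var, then var_1) can be imitated in a few lines. This is only an illustration of the naming pattern seen in the printed output; make_unique_namer and unique_name are hypothetical names of my own, and TensorFlow's internal logic handles more cases than this.

```python
# Illustrative only: mimics the 'var', 'var_1', 'var_2' naming pattern shown above.
def make_unique_namer():
    counts = {}

    def unique_name(name):
        index = counts.get(name, 0)
        counts[name] = index + 1
        return name if index == 0 else "%s_%d" % (name, index)

    return unique_name


unique_name = make_unique_namer()
names = [unique_name("var"), unique_name("var"), unique_name("theta")]
print(names)  # ['var', 'var_1', 'theta']
```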
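Using tf.get_variable
1. Only one variable with a given name is created. To share a variable, specify reuse=True; otherwise creating it a second time raises an error. Leave reuse unset for the first creation and declare it when sharing later. The sharing behavior of a scope can be changed dynamically.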
tf.reset_default_graph()
def func(x):
    weight = tf.get_variable(name = "weight",initializer = tf.random_normal([1]))
    bias = tf.get_variable(name="bias",initializer = tf.zeros([1]))
    return tf.add(tf.multiply(weight, x), bias)
result1 = func(1)
# A second call without reuse would raise an error because the variables already exist
#result2 = func(2)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    print(sess.run([result1]))
    # print(sess.run(result2))
[array([0.12032716], dtype=float32)]
tf.reset_default_graph()
def func(x,reuse):
    with tf.variable_scope('neuron',reuse=reuse):
        weight = tf.get_variable(name = "weight",initializer = tf.random_normal([1]))
        bias = tf.get_variable(name="bias",initializer = tf.zeros([1]))
        return tf.add(tf.multiply(weight, x), bias)
result1 = func(1,reuse=False)
result2 = func(2,reuse=True)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    print(sess.run([result1]))
    print(sess.run([result2]))
[array([0.14011075], dtype=float32)]
[array([0.2802215], dtype=float32)]
2. It is not affected by with tf.name_scope. (Note: this refers to name_scope, not variable_scope; both tf.Variable and tf.get_variable are affected by variable_scope.)
tf.reset_default_graph()
with tf.name_scope('var_a_scope'):
    var1 = tf.get_variable(name='var', shape=[1], dtype=tf.float32)
    var2 = tf.get_variable(name='var1', shape=[1], dtype=tf.float32)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(var1.name, sess.run(var1))
    print(var2.name, sess.run(var2))
var:0 [-0.96074855]
var1:0 [0.46584237]
3. Initialization methods
conv1_weights = tf.get_variable(name="conv1_weights", shape=[5, 5, 3, 3], dtype=tf.float32, initializer=tf.truncated_normal_initializer())
conv1_biases = tf.get_variable(name='conv1_biases', shape=[3], dtype=tf.float32, initializer=tf.zeros_initializer())
4. with tf.variable_scope('scope_name') accumulates: each enclosing scope adds its prefix to all variables created inside it, with outer scopes first and inner scopes after.
tf.reset_default_graph()
def my_image_filter(input_images):
    with tf.variable_scope('scope_a'):
        conv1_weights = tf.get_variable(name="conv1_weights", shape=[5, 5, 3, 3], dtype=tf.float32, \
                                        initializer=tf.truncated_normal_initializer())
        conv1_biases = tf.get_variable(name='conv1_biases', shape=[3], dtype=tf.float32, \
                                       initializer=tf.zeros_initializer())
        conv1 = tf.nn.conv2d(input_images, conv1_weights, strides=[1, 1, 1, 1], padding='SAME')
        print(conv1_weights.name)
        return tf.nn.relu(conv1 + conv1_biases)
image1 = np.random.random(3*5*5).reshape(1, 5, 5, 3).astype(np.float32)
image2 = np.random.random(3*5*5).reshape(1, 5, 5, 3).astype(np.float32)
with tf.variable_scope("image_filters") as scope:
    result1 = my_image_filter(image1)
print(result1.name)
image_filters/scope_a/conv1_weights:0
image_filters/scope_a/Relu:0
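The sharing rule from point 1 of the tf.get_variable list (create once, then share with reuse=True) can be pictured as a lookup in a name-keyed store. The sketch below is a simplified model under my own assumptions; VariableStore is a hypothetical class, not TensorFlow's internal implementation, and the full variable name is assumed to already include its scope prefix.

```python
# Illustrative only: a dictionary-backed store that imitates the reuse rule.
class VariableStore:
    def __init__(self):
        self._vars = {}

    def get_variable(self, name, initial_value=None, reuse=False):
        if reuse:
            # Sharing: the variable must already exist.
            if name not in self._vars:
                raise ValueError("variable %s does not exist" % name)
            return self._vars[name]
        # Creation: the name must not be taken yet.
        if name in self._vars:
            raise ValueError("variable %s already exists" % name)
        self._vars[name] = [initial_value]  # a mutable cell holding the value
        return self._vars[name]


store = VariableStore()
w_first = store.get_variable("neuron/weight", initial_value=0.5)
w_shared = store.get_variable("neuron/weight", reuse=True)

# Creating the same name again without reuse fails.
try:
    store.get_variable("neuron/weight", initial_value=1.0)
    duplicate_failed = False
except ValueError:
    duplicate_failed = True

print(w_first is w_shared, duplicate_failed)  # True True
```

Because both handles point at the same cell, an update made through one is visible through the other, which is the point of sharing parameters between two calls of the same function.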
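2.5 What Is a Placeholder
The computation graph needs a way to accept data from outside, and placeholders provide it. Put simply, the input data of a neural network is fed in through placeholders. They are called placeholders because they reserve a slot (for example, one whose batch size is left open); once the slot is reserved, the data fed into it can change on every run.
Note: a placeholder is also a kind of tensor. The data fed in is usually a numpy ndarray.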
import numpy as np
ph = tf.placeholder(dtype=tf.float32,shape =(None,3))
add_op = tf.add(ph,1)
with tf.Session() as sess:
    #print(sess.run(ph,feed_dict={ph:np.random.rand(4,3)}))
    print(sess.run(add_op,feed_dict={ph:np.random.rand(5,3)}))
[[1.6978512 1.4810288 1.4138539]
[1.0948298 1.2795463 1.7776579]
[1.8180617 1.4552476 1.1511333]
[1.1958804 1.0192685 1.2253966]
[1.0613252 1.0539216 1.3431617]]
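2.6 What Is a Computation Path
A computation path is the set of nodes that a fetched node depends on: TensorFlow follows the dependencies back from the requested node through its parent nodes and evaluates only those nodes.
The fact that TensorFlow automatically computes only the necessary nodes is a major advantage of the framework. If the graph is very large and contains many unneeded nodes, this saves a great deal of run time. It also allows us to build large, multi-purpose graphs that share a single core set of nodes and do different things depending on the computation path taken.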
# TensorFlow finds the dependencies automatically
# Fetching sum_node without feed_dict raises an error
tf.reset_default_graph()
ph = tf.placeholder(tf.int32)
three_node = tf.constant(3)
sum_node = ph + three_node
with tf.Session() as sess:
    #print(sess.run(three_node))
    print(sess.run(sum_node,feed_dict={ph:15}))
18
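The idea of a computation path can be sketched as a graph traversal. In the example below, the dictionary graph and the extra unused_node entry are assumptions added for illustration; required_nodes is a hypothetical helper that collects everything a fetched node depends on.

```python
# Hypothetical graph description: each node maps to the nodes it depends on.
graph = {
    "ph": [],
    "three_node": [],
    "sum_node": ["ph", "three_node"],
    "unused_node": ["three_node"],
}


def required_nodes(target, graph):
    """Return every node that must be evaluated to compute `target`."""
    needed = set()
    stack = [target]
    while stack:
        node = stack.pop()
        if node not in needed:
            needed.add(node)
            stack.extend(graph[node])
    return needed


path_for_sum = required_nodes("sum_node", graph)
path_for_three = required_nodes("three_node", graph)
print(sorted(path_for_sum))    # ['ph', 'sum_node', 'three_node']
print(sorted(path_for_three))  # ['three_node']
```

This also suggests why fetching three_node needs no feed_dict while fetching sum_node does: the placeholder ph lies on the path of sum_node only.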
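2.7 An Important Operation: tf.assign
tf.assign(target, value) assigns value to target. target must be a mutable tensor (a variable) and may be uninitialized. value must have the same data type and shape as target.
Think about whether the following update needs tf.assign. If it does, what should it be applied to?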
$$\theta = \theta - \beta \nabla L(\theta)$$
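One way to approach the question is to write the update in plain Python. The sketch below assumes a made-up loss L(theta) = (theta - 3)^2, chosen only so the gradient is easy to write; the point is that each step overwrites the stored value of theta, which is the role tf.assign plays for a variable.

```python
# Assumed example loss: L(theta) = (theta - 3)^2, so grad L(theta) = 2 * (theta - 3).
def grad_loss(theta):
    return 2.0 * (theta - 3.0)


theta = 0.0  # plays the role of the variable
beta = 0.1   # learning rate

for _ in range(100):
    # Each step overwrites theta with the updated value.
    theta = theta - beta * grad_loss(theta)

print(round(theta, 4))  # 3.0
```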
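3. Implementing Linear Regression
Linear regression can be viewed as the simplest neural network. We implement it in four ways:
- Analytic solution.
- Manually derived gradients.
- Gradients computed with the low-level API.
- Gradients computed with the high-level API.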
import tensorflow as tf
import numpy as np
from sklearn.datasets import fetch_california_housing
housing = fetch_california_housing()
m, n = housing.data.shape
housing_data_plus_bias = np.c_[np.ones((m, 1)), housing.data]
X = tf.constant(housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")
# Analytic solution (normal equation): theta = (X^T X)^-1 X^T y
XT = tf.transpose(X)
theta = tf.matmul(tf.matmul(tf.matrix_inverse(tf.matmul(XT, X)), XT), y)
with tf.Session() as sess:
    theta_value = theta.eval()
print(theta_value)
Downloading Cal. housing from https://ndownloader.figshare.com/files/5976036 to C:\Users\小金毛\scikit_learn_data
[[-3.6959320e+01]
[ 4.3698898e-01]
[ 9.4245886e-03]
[-1.0791138e-01]
[ 6.4842808e-01]
[-3.9986235e-06]
[-3.7866351e-03]
[-4.2142656e-01]
[-4.3467718e-01]]
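The analytic solution above is the normal equation, theta = (X^T X)^-1 X^T y. For a single feature plus a bias column, X^T X is only 2x2 and can be inverted by hand. The sketch below uses a tiny made-up dataset generated from y = 1 + 2x (an assumption for illustration, unrelated to the housing data).

```python
# Tiny made-up dataset generated from y = 1 + 2x (an assumption for illustration).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

# With rows of X equal to [1, x], X^T X is a 2x2 matrix and X^T y a 2-vector.
m = len(xs)
sx = sum(xs)
sxx = sum(x * x for x in xs)
sy = sum(ys)
sxy = sum(x * y for x, y in zip(xs, ys))

# Invert the 2x2 matrix [[m, sx], [sx, sxx]] in closed form.
det = m * sxx - sx * sx
bias = (sxx * sy - sx * sxy) / det
slope = (m * sxy - sx * sy) / det

print(bias, slope)  # 1.0 2.0
```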
import time
tf.reset_default_graph()
n_epochs = 1000
learning_rate = 0.01
data = housing.data
scaled_housing_data_plus_bias = (data-np.mean(data,axis=0))/np.std(data,axis=0)
scaled_housing_data_plus_bias = np.c_[np.ones((m, 1)), scaled_housing_data_plus_bias]
X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
# Gradient of the MSE derived by hand: 2/m * X^T (X theta - y)
gradients = 2./m * tf.matmul(tf.transpose(X), error)
# One gradient descent step, written back into theta with tf.assign
training_op = tf.assign(theta, theta - learning_rate * gradients)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        if epoch%100==0:
            print("Epoch", epoch, "MSE =", mse.eval())
            time.sleep(1)
        sess.run(training_op)
    best_theta = theta.eval()
print(best_theta)
Epoch 0 MSE = 3.2933912
Epoch 100 MSE = 0.7488009
Epoch 200 MSE = 0.6507948
Epoch 300 MSE = 0.61479896
Epoch 400 MSE = 0.58986986
Epoch 500 MSE = 0.5718851
Epoch 600 MSE = 0.5588681
Epoch 700 MSE = 0.5494392
Epoch 800 MSE = 0.5426048
Epoch 900 MSE = 0.5376474
[[ 2.0685523 ]
[ 0.8070116 ]
[ 0.15161629]
[-0.15341702]
[ 0.18247263]
[ 0.00774657]
[-0.04162445]
[-0.6815473 ]
[-0.64621437]]
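The hand-derived gradient used above, 2/m * X^T (X theta - y), can be sanity-checked against finite differences. The sketch below uses a few made-up numbers rather than the housing data; all function names are hypothetical helpers written for this check.

```python
# Made-up data for illustration only.
X = [[1.0, 0.5], [1.0, -1.0], [1.0, 2.0]]
y = [1.0, 0.0, 3.0]
theta = [0.2, -0.4]
m = len(X)


def predict(theta):
    return [sum(xi * ti for xi, ti in zip(row, theta)) for row in X]


def mse(theta):
    return sum((p - t) ** 2 for p, t in zip(predict(theta), y)) / m


def analytic_gradient(theta):
    """2/m * X^T (X theta - y), the same expression used in the training loop."""
    error = [p - t for p, t in zip(predict(theta), y)]
    return [2.0 / m * sum(row[j] * e for row, e in zip(X, error))
            for j in range(len(theta))]


def numeric_gradient(theta, eps=1e-6):
    """Central finite differences, used only as a sanity check."""
    grads = []
    for j in range(len(theta)):
        up = list(theta)
        up[j] += eps
        down = list(theta)
        down[j] -= eps
        grads.append((mse(up) - mse(down)) / (2 * eps))
    return grads


analytic = analytic_gradient(theta)
numeric = numeric_gradient(theta)
max_diff = max(abs(a - n) for a, n in zip(analytic, numeric))
print(max_diff < 1e-6)  # True
```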
tf.reset_default_graph()
n_epochs = 1000
learning_rate = 0.01
data = housing.data
scaled_housing_data_plus_bias = (data-np.mean(data,axis=0))/np.std(data,axis=0)
scaled_housing_data_plus_bias = np.c_[np.ones((m, 1)), scaled_housing_data_plus_bias]
X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
# Low-level API: let TensorFlow derive the gradient of mse with respect to theta
gradients = tf.gradients(mse,theta)
training_op = tf.assign(theta, theta - learning_rate * gradients[0])
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        if epoch%100==0:
            print("Epoch", epoch, "MSE =", mse.eval())
            time.sleep(1)
        sess.run(training_op)
    best_theta = theta.eval()
print(best_theta)
Epoch 0 MSE = 7.6010137
Epoch 100 MSE = 0.6682596
Epoch 200 MSE = 0.55786854
Epoch 300 MSE = 0.5489261
Epoch 400 MSE = 0.54366136
Epoch 500 MSE = 0.5396379
Epoch 600 MSE = 0.5365099
Epoch 700 MSE = 0.53406346
Epoch 800 MSE = 0.5321404
Epoch 900 MSE = 0.5306212
[[ 2.0685523 ]
[ 0.88347405]
[ 0.14116442]
[-0.34424198]
[ 0.36065626]
[ 0.00295106]
[-0.04237042]
[-0.6859274 ]
[-0.6617364 ]]
tf.reset_default_graph()
n_epochs = 1000
learning_rate = 0.01
data = housing.data
scaled_housing_data_plus_bias = (data-np.mean(data,axis=0))/np.std(data,axis=0)
scaled_housing_data_plus_bias = np.c_[np.ones((m, 1)), scaled_housing_data_plus_bias]
X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
# High-level API: an optimizer computes the gradients and applies the update
#optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
optimizer = tf.train.MomentumOptimizer(learning_rate=learning_rate,momentum=0.9)
training_op = optimizer.minimize(mse)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        if epoch%100==0:
            print("Epoch", epoch, "MSE =", mse.eval())
            time.sleep(1)
        sess.run(training_op)
    best_theta = theta.eval()
print(best_theta)
Epoch 0 MSE = 8.72543
Epoch 100 MSE = 0.5405436
Epoch 200 MSE = 0.5251301
Epoch 300 MSE = 0.52440524
Epoch 400 MSE = 0.52433175
Epoch 500 MSE = 0.5243224
Epoch 600 MSE = 0.52432114
Epoch 700 MSE = 0.524321
Epoch 800 MSE = 0.524321
Epoch 900 MSE = 0.524321
[[ 2.0685577 ]
[ 0.82962817]
[ 0.11875326]
[-0.26554358]
[ 0.30571008]
[-0.00450252]
[-0.0393266 ]
[-0.89986557]
[-0.8705218 ]]
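The momentum run above converges faster than plain gradient descent. A rough sketch of the momentum rule, as I understand it, is to accumulate past gradients and step along the accumulated direction. The loss below is a made-up one-dimensional example, and velocity is a hypothetical name for the accumulator.

```python
# Assumed example loss: L(theta) = (theta - 3)^2, with gradient 2 * (theta - 3).
def grad_loss(theta):
    return 2.0 * (theta - 3.0)


learning_rate = 0.01
momentum = 0.9

theta = 0.0
velocity = 0.0  # running accumulation of past gradients

for _ in range(1000):
    velocity = momentum * velocity + grad_loss(theta)
    theta = theta - learning_rate * velocity

print(round(theta, 4))  # 3.0
```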
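4. Saving and Restoring Models
While a model's parameters are being learned, we need to save the model at suitable points. As explained earlier, a neural network's parameters are stored in variables, and variable values are released once the session is closed. We therefore have to save the parameters while the session is still open.
Saving model parameters works like a checkpoint (a snapshot): at some iteration during training, take a snapshot and save it to disk.
Note: the saved model consists of four files on disk.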
tf.reset_default_graph()
theta = tf.Variable(tf.random_uniform([3, 1], -1.0, 1.0), name="theta")
init = tf.global_variables_initializer()
saver = tf.train.Saver()
n_epochs = 1000
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        # checkpoint every 100 epochs
        if epoch % 100 == 0:
            saver.save(sess, save_path="./model/my_model.ckpt")
    print(theta.eval())
[[-0.0147934 ]
[ 0.38159657]
[ 0.25122333]]
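Looking at the saved files in ./model, we typically find checkpoint, my_model.ckpt.data-00000-of-00001, my_model.ckpt.index, and my_model.ckpt.meta.
The files mainly store two kinds of things:
- The computation graph (stored in the meta file)
- The variable values (stored in the data file)
So there are two ways to load a model back from disk:
- Copy the earlier code to build an identical computation graph, then load the parameters.
- Load the meta file to restore the computation graph, then load the parameters.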
# Method 1: rebuild the same graph, then restore the parameters
tf.reset_default_graph()
theta = tf.Variable(tf.random_uniform([3, 1], -1.0, 1.0), name="theta")
init = tf.global_variables_initializer()
saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess,save_path="./model/my_model.ckpt")
    print(theta.eval())
INFO:tensorflow:Restoring parameters from ./model/my_model.ckpt
[[-0.0147934 ]
[ 0.38159657]
[ 0.25122333]]
# Method 2: restore the graph from the meta file, then restore the parameters
tf.reset_default_graph()
saver = tf.train.import_meta_graph('./model/my_model.ckpt.meta')
with tf.Session() as sess:
    saver.restore(sess,'./model/my_model.ckpt')
INFO:tensorflow:Restoring parameters from ./model/my_model.ckpt
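The save-then-restore cycle can be imitated with the standard library alone. This sketch is only an analogy: it writes made-up parameter values to a JSON file in a temporary directory, whereas tf.train.Saver uses its own checkpoint format. The file name my_model.json and the params dictionary are assumptions for illustration.

```python
import json
import os
import tempfile

# Hypothetical parameters standing in for the variable theta.
params = {"theta": [[-0.01], [0.38], [0.25]]}

checkpoint_dir = tempfile.mkdtemp()
checkpoint_path = os.path.join(checkpoint_dir, "my_model.json")

# "Save": write the current parameter values while they are still in memory.
with open(checkpoint_path, "w") as f:
    json.dump(params, f)

# Simulate the session closing: the in-memory values are gone.
del params

# "Restore": read the values back from disk.
with open(checkpoint_path) as f:
    restored = json.load(f)

print(restored["theta"])  # [[-0.01], [0.38], [0.25]]
```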
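4.1 tf.get_collection and tf.add_to_collection
To make it easy to retrieve specific operations and tensors, for example after restoring a graph from a meta file, we use tf.add_to_collection and tf.get_collection.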
tf.reset_default_graph()
theta = tf.Variable(tf.random_uniform([3, 1], -1.0, 1.0), name="theta")
tf.add_to_collection('my_op',theta)
init = tf.global_variables_initializer()
saver = tf.train.Saver()
n_epochs = 1000
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        # checkpoint every 100 epochs
        if epoch % 100 == 0:
            saver.save(sess, save_path="./model/my_model.ckpt")
    print(theta.eval())
[[-0.52151966]
[-0.21205497]
[-0.6032307 ]]
tf.reset_default_graph()
saver = tf.train.import_meta_graph('./model/my_model.ckpt.meta')
my_op = tf.get_collection('my_op')
with tf.Session() as sess:
    saver.restore(sess,'./model/my_model.ckpt')
    print(sess.run(my_op[0]))
INFO:tensorflow:Restoring parameters from ./model/my_model.ckpt
[[-0.52151966]
[-0.21205497]
[-0.6032307 ]]
my_op
[<tf.Tensor 'theta:0' shape=(3, 1) dtype=float32_ref>]
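5. Monitoring with TensorBoard
TensorBoard is a neural network visualization tool that accompanies TensorFlow.
The general workflow is:
- In the graph you built, choose the nodes you want to summarize.
- Because every summary operation has to be run with sess.run, merge them into a single operation for convenience (tf.summary.merge_all()).
- Run the merged operation in the session.
- Use tf.summary.FileWriter to write the results to a file.
- Launch TensorBoard with tensorboard --logdir='path'.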
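5.1 Summary Operations
# Record a scalar such as loss or accuracy; produces a time-series plot
tf.summary.scalar(name,tensor)
# Record a tensor as a histogram, e.g. to view the distribution of weights and biases
tf.summary.histogram(name,tensor)
# Merge summary operations
# inputs is a list
# merge combines the given summaries; merge_all combines all of them
merge_some=tf.summary.merge(inputs,collections=None,name=None)
merge_summary=tf.summary.merge_all(key=tf.GraphKeys.SUMMARIES)
Note: all of the operations above are part of the graph definition.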
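5.2 Writing to File
# Writer for the event file on disk
# Passing a graph also writes the graph to the file
file_writer=tf.summary.FileWriter(logdir,graph,flush_secs)
merge=sess.run(merge_some)
file_writer.add_summary(merge,step)
Note: these operations run inside a session.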
[...]
for batch_index in range(n_batches):
    X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
    if batch_index % 10 == 0:
        summary_str = mse_summary.eval(feed_dict={X: X_batch, y: y_batch})
        step = epoch * n_batches + batch_index
        file_writer.add_summary(summary_str, step)
    sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
[...]
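The step passed to add_summary in the fragment above is a global batch counter. The sketch below replays just that bookkeeping with made-up sizes (n_epochs = 2 and n_batches = 25 are assumptions) to show which steps get logged when a summary is written every 10 batches.

```python
# Hypothetical sizes, chosen only for illustration.
n_epochs = 2
n_batches = 25

logged_steps = []
for epoch in range(n_epochs):
    for batch_index in range(n_batches):
        if batch_index % 10 == 0:
            # Global step: number of batches processed before this one.
            step = epoch * n_batches + batch_index
            logged_steps.append(step)

print(logged_steps)  # [0, 10, 20, 25, 35, 45]
```

Using a global step keeps the x-axis in TensorBoard increasing across epochs instead of restarting at zero each epoch.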
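5.3 Running TensorBoard
# path is the log directory
tensorboard --logdir='path'
# If the port is already in use
# TensorBoard uses port 6006 by default
lsof -i:6006
# Kill the process holding the port (4969 is the PID reported by lsof in this example)
kill -9 4969
# Then open in a browser
http://0.0.0.0:6006/ (or http://localhost:6006/)
6006 is "goog" written upside down.
# TensorBoard example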
import tensorflow as tf
import numpy as np
from datetime import datetime
from sklearn.datasets import fetch_california_housing
housing = fetch_california_housing()
m, n = housing.data.shape
tf.reset_default_graph()
n_epochs = 1000
learning_rate = 0.01
data = housing.data
scaled_housing_data_plus_bias = (data-np.mean(data,axis=0))/np.std(data,axis=0)
scaled_housing_data_plus_bias = np.c_[np.ones((m, 1)), scaled_housing_data_plus_bias]
X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
# Define what to record
tf.summary.scalar('mse',mse)
tf.summary.histogram('theta',theta)
# Merge the summaries
merge_summary=tf.summary.merge_all(key=tf.GraphKeys.SUMMARIES)
# Define the log directory
now = datetime.utcnow().strftime("%Y%m%d%H%M%S")
root_logdir = "tf_logs"
logdir = "{}/run-{}/".format(root_logdir, now)
# Print the log directory so it can be passed to tensorboard
print(logdir)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    # Create (open) the event file writer
    file_writer=tf.summary.FileWriter(logdir,sess.graph)
    for epoch in range(n_epochs):
        if epoch%100==0:
            print("Epoch", epoch, "MSE =", mse.eval())
        sess.run(training_op)
        # Evaluate the merged summary and write it
        summary_str = merge_summary.eval()
        file_writer.add_summary(summary_str, epoch)
    best_theta = theta.eval()
print(best_theta)
file_writer.close()
tf_logs/run-20190914065748/
Epoch 0 MSE = 6.463257
Epoch 100 MSE = 0.64500874
Epoch 200 MSE = 0.583449
Epoch 300 MSE = 0.5705363
Epoch 400 MSE = 0.56115687
Epoch 500 MSE = 0.55382764
Epoch 600 MSE = 0.54805243
Epoch 700 MSE = 0.54347855
Epoch 800 MSE = 0.53983927
Epoch 900 MSE = 0.53693074
[[ 2.0685523 ]
[ 0.9130923 ]
[ 0.14939232]
[-0.39519867]
[ 0.40059316]
[ 0.00555986]
[-0.04369798]
[-0.59813654]
[-0.5771508 ]]
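6. Implementing a First Neural Network
Task: classify images from the MNIST dataset.
The input is an image of a handwritten digit, which is equivalent to a matrix of pixel values.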
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import sys
tf.reset_default_graph()
epochs = 15
batch_size = 100
total_sum = 0
epoch = 0
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
train_num = mnist.train.num_examples
input_data = tf.placeholder(tf.float32,shape=(None,784))
input_label = tf.placeholder(tf.float32,shape=(None,10))
# Two hidden layers (64 and 32 units) followed by a 10-way output layer
w1 = tf.get_variable(shape=(784,64),name='hidden_1_w')
b1 = tf.get_variable(shape=(64,),initializer=tf.zeros_initializer(),name='hidden_1_b')
w2 = tf.get_variable(shape=(64,32),name='hidden_2_w')
b2 = tf.get_variable(shape=(32,),initializer=tf.zeros_initializer(),name='hidden_2_b')
w3 = tf.get_variable(shape=(32,10),name='layer_output')
# Logits layer
output = tf.matmul(tf.nn.relu(tf.matmul(tf.nn.relu(tf.matmul(input_data,w1)+b1),w2)+b2),w3)
loss = tf.losses.softmax_cross_entropy(input_label,output)
#opt = tf.train.GradientDescentOptimizer(learning_rate=0.1)
opt = tf.train.AdamOptimizer()
train_op = opt.minimize(loss)
# Evaluation on the test set
correct_pred = tf.equal(tf.argmax(input_label,axis=1),tf.argmax(output,axis=1))
acc = tf.reduce_mean(tf.cast(correct_pred,tf.float32))
tf.add_to_collection('my_op',input_data)
tf.add_to_collection('my_op',output)
tf.add_to_collection('my_op',loss)
init = tf.global_variables_initializer()
saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run([init])
    test_data = mnist.test.images
    test_label = mnist.test.labels
    while epoch<epochs:
        data,label=mnist.train.next_batch(batch_size)
        data = data.reshape(-1,784)
        total_sum+=batch_size
        sess.run([train_op],feed_dict={input_data:data,input_label:label})
        if total_sum//train_num>epoch:
            epoch = total_sum//train_num
            loss_val = sess.run([loss],feed_dict={input_data:data,input_label:label})
            acc_test = sess.run([acc],feed_dict={input_data:test_data,input_label:test_label})
            saver.save(sess, save_path="./model/my_model.ckpt")
            print('epoch:{},train_loss:{:.4f},test_acc:{:.4f}'.format(epoch,loss_val[0],acc_test[0]))
WARNING:tensorflow:From <ipython-input-41-0eb8b311b19f>:12: read_data_sets (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.
Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.
WARNING:tensorflow:From D:\Anaconda3\lib\site-packages\tensorflow\contrib\learn\python\learn\datasets\mnist.py:260: maybe_download (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Please write your own downloading logic.
WARNING:tensorflow:From D:\Anaconda3\lib\site-packages\tensorflow\contrib\learn\python\learn\datasets\mnist.py:262: extract_images (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting MNIST_data\train-images-idx3-ubyte.gz
WARNING:tensorflow:From D:\Anaconda3\lib\site-packages\tensorflow\contrib\learn\python\learn\datasets\mnist.py:267: extract_labels (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting MNIST_data\train-labels-idx1-ubyte.gz
WARNING:tensorflow:From D:\Anaconda3\lib\site-packages\tensorflow\contrib\learn\python\learn\datasets\mnist.py:110: dense_to_one_hot (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.
Instructions for updating:
Please use tf.one_hot on tensors.
Extracting MNIST_data\t10k-images-idx3-ubyte.gz
Extracting MNIST_data\t10k-labels-idx1-ubyte.gz
WARNING:tensorflow:From D:\Anaconda3\lib\site-packages\tensorflow\contrib\learn\python\learn\datasets\mnist.py:290: DataSet.__init__ (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.
Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.
WARNING:tensorflow:From D:\Anaconda3\lib\site-packages\tensorflow\python\ops\losses\losses_impl.py:209: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
epoch:1,train_loss:0.1884,test_acc:0.9359
epoch:2,train_loss:0.2500,test_acc:0.9569
epoch:3,train_loss:0.0915,test_acc:0.9633
epoch:4,train_loss:0.0659,test_acc:0.9684
epoch:5,train_loss:0.0868,test_acc:0.9709
epoch:6,train_loss:0.0569,test_acc:0.9723
epoch:7,train_loss:0.0746,test_acc:0.9722
epoch:8,train_loss:0.0403,test_acc:0.9737
epoch:9,train_loss:0.0406,test_acc:0.9746
epoch:10,train_loss:0.0288,test_acc:0.9711
epoch:11,train_loss:0.0054,test_acc:0.9717
epoch:12,train_loss:0.0241,test_acc:0.9745
epoch:13,train_loss:0.0478,test_acc:0.9731
epoch:14,train_loss:0.0061,test_acc:0.9734
epoch:15,train_loss:0.0157,test_acc:0.9731
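The loss and accuracy reported above come from a softmax cross-entropy on the logits and an argmax comparison. The sketch below shows both calculations in plain Python on made-up three-class logits; the helper names are hypothetical, and averaging the loss over the batch is my assumption about the default reduction rather than something the code above states.

```python
import math


def softmax_cross_entropy(one_hot_label, logits):
    """Cross-entropy between a one-hot label and softmax(logits)."""
    shift = max(logits)  # subtract the max for numerical stability
    log_sum_exp = shift + math.log(sum(math.exp(v - shift) for v in logits))
    log_probs = [v - log_sum_exp for v in logits]
    return -sum(l * lp for l, lp in zip(one_hot_label, log_probs))


def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])


# Made-up labels and logits for a 3-class problem.
labels = [[0, 0, 1], [1, 0, 0]]
logits = [[0.1, 0.2, 3.0], [0.5, 2.0, 0.1]]

losses = [softmax_cross_entropy(l, z) for l, z in zip(labels, logits)]
mean_loss = sum(losses) / len(losses)
accuracy = sum(argmax(l) == argmax(z) for l, z in zip(labels, logits)) / len(labels)

print(accuracy)  # 0.5
```

The first example is classified correctly and the second is not, so the accuracy is 0.5; the misclassified example also contributes the larger share of the loss.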
from matplotlib import pyplot as plt
%matplotlib inline
index = 666
plt.imshow(test_data[index].reshape(28,28),cmap='gray')
<matplotlib.image.AxesImage at 0x27199b6f400>
tf.reset_default_graph()
sess = tf.InteractiveSession()
saver = tf.train.import_meta_graph('./model/my_model.ckpt.meta')
saver.restore(sess,"./model/my_model.ckpt")
input_tensor = tf.get_collection('my_op')[0]
output_tensor = tf.get_collection('my_op')[1]
INFO:tensorflow:Restoring parameters from ./model/my_model.ckpt
np.argmax(sess.run(output_tensor,feed_dict={input_tensor:np.expand_dims(test_data[index],axis=0)}))
7