First, an introduction to TensorFlow: it is a powerful open source software library for numerical computation, well suited to large-scale Machine Learning.
Basic principle: you define a graph of computations in Python, and TensorFlow takes that graph and runs it efficiently using optimized C++ code.
TensorFlow's other advantages are not covered here.
Installation
Installation notes specific to the book Hands-On Machine Learning: https://blog.csdn.net/qq_40594395/article/details/111161561
Creating Your First Graph and Running It in a Session
Build a computation graph:
import tensorflow as tf
reset_graph()
x = tf.Variable(3, name="x")
y = tf.Variable(4, name="y")
f = x*x*y + y + 2
**Method 1:** The code above does not actually perform any computation; it only builds the graph, and the variables are not even initialized yet. To evaluate the graph you need to open a TensorFlow session, initialize all the variables, evaluate f, and close the session (to free resources):
sess = tf.Session()
sess.run(x.initializer)
sess.run(y.initializer)
result = sess.run(f)
print(result)
sess.close()
**Method 2:** Repeating sess.run() every time is clumsy and verbose; an improvement:
with tf.Session() as sess:
    x.initializer.run()
    y.initializer.run()
    result = f.eval()
In this code, the with block installs the session as the default session: x.initializer.run() is equivalent to tf.get_default_session().run(x.initializer), and f.eval() is equivalent to tf.get_default_session().run(f).
This style is more readable, and the session is automatically closed at the end of the block.
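For illustration, the same block written with explicit calls to the default session (a minimal sketch, equivalent to Method 2 above):
with tf.Session() as sess:
    tf.get_default_session().run(x.initializer)  # same as x.initializer.run()
    tf.get_default_session().run(y.initializer)  # same as y.initializer.run()
    result = tf.get_default_session().run(f)     # same as f.eval()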
**Method 3:** Instead of manually running the initializer for every single variable, you can use global_variables_initializer(). Note that it does not perform the initialization immediately; it creates a node in the graph that will initialize all the variables when it is run in a session:
init = tf.global_variables_initializer()
with tf.Session() as sess:
    init.run()
    result = f.eval()
**Method 4:** Yet another option is to create an InteractiveSession. The only difference from a regular Session is that it automatically sets itself as the default session, so you don't need a with block (but you do need to close the session manually):
sess = tf.InteractiveSession()
init.run()
result = f.eval()
print(result)
sess.close()
A TensorFlow program is typically split into two parts:
- Building the computation graph (construction phase): represents the ML model and the computations required to train it.
- Running the graph (execution phase): repeatedly evaluates a training step, gradually improving the model parameters.
Let's look at an example:
Managing Graphs
Any node you create is automatically added to the default graph:
reset_graph()
x1 = tf.Variable(1)
x1.graph is tf.get_default_graph()#True
If you want to manage multiple independent graphs, you can create a new Graph and temporarily make it the default graph inside a with block:
graph = tf.Graph()
with graph.as_default():
    x2 = tf.Variable(2)
Inside the block, x2 belongs to the new graph: x2.graph is graph evaluates to True, and it is not on the default graph: x2.graph is tf.get_default_graph() evaluates to False.
In Jupyter or a Python shell you often run the same commands more than once, which ends up adding many duplicate nodes to the same graph. There are two ways around this:
- restart the kernel/shell
- reset the default graph with tf.reset_default_graph() (recommended)
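The reset_graph() helper used throughout these notes is not part of TensorFlow; a minimal sketch of it, assuming the book's convention of also fixing the random seeds for reproducibility:
import numpy as np
import tensorflow as tf

def reset_graph(seed=42):
    # drop the current default graph and fix the seeds so runs are reproducible
    tf.reset_default_graph()
    tf.set_random_seed(seed)
    np.random.seed(seed)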
Lifecycle of a Node Value
When you evaluate a node, TensorFlow automatically determines which nodes it depends on and evaluates those first. For example:
w = tf.constant(3)
x = w + 2
y = x + 5
z = x * 3
with tf.Session() as sess:
    print(y.eval())  # 10
    print(z.eval())  # 15
Once the session starts, evaluating y requires x, which requires w, so TensorFlow evaluates w, then x, then y. Evaluating z repeats the same chain, so in this session w and x are evaluated twice.
All node values are dropped between graph runs, except variable values, which are maintained by the session across runs; a variable's lifetime starts when its initializer is run and ends when the session is closed.
If you want to evaluate y and z without repeating that work, you must tell TensorFlow to evaluate both of them in a single graph run:
with tf.Session() as sess:
    y_val, z_val = sess.run([y, z])
    print(y_val)  # 10
    print(z_val)  # 15
In single-process TensorFlow, multiple sessions do not share any state, even if they reuse the same graph (each session has its own copy of every variable); in distributed TensorFlow, variable state is stored on servers, so multiple sessions can share the same variables.
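A minimal sketch (using a simple counter variable, not from the book's code) illustrating that the session keeps a variable's value across graph runs:
v = tf.Variable(0, name="v")
inc_v = tf.assign(v, v + 1)   # node that adds 1 to v each time it is run
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    print(sess.run(inc_v))  # 1
    print(sess.run(inc_v))  # 2: the session kept v's value between the two runs
# a second session would have its own state and would need to run init again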
Linear Regression with TensorFlow
TensorFlow operations (ops) can take any number of inputs and produce any number of outputs. The inputs and outputs are multidimensional arrays called tensors; a tensor has a shape and a data type. In the Python API tensors are represented by NumPy ndarrays, and they typically contain floats, although they can also carry strings.
Tensors can hold arrays of any shape. The following code shows how to manipulate 2D arrays to perform Linear Regression on the California housing dataset.
import numpy as np
from sklearn.datasets import fetch_california_housing
reset_graph()
housing = fetch_california_housing()  # fetch the dataset
m, n = housing.data.shape
housing_data_plus_bias = np.c_[np.ones((m, 1)), housing.data]  # add the bias input feature (x0 = 1); this is a NumPy operation, so it runs immediately
# create two TF constant nodes, X and the target y (housing.target is a 1D array; reshape accepts -1, meaning "unspecified", and infers that dimension from the array length and the remaining dimensions)
X = tf.constant(housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")
XT = tf.transpose(X)
theta = tf.matmul(tf.matmul(tf.matrix_inverse(tf.matmul(XT, X)), XT), y)  # theta via the Normal Equation; this only defines nodes in the graph, nothing is computed until the graph is run
with tf.Session() as sess:  # evaluate theta
    theta_value = theta.eval()
The pure NumPy version:
X = housing_data_plus_bias
y = housing.target.reshape(-1, 1)
theta_numpy = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)
print(theta_numpy)
The Scikit-Learn version:
from sklearn.linear_model import LinearRegression
lin_reg = LinearRegression()
lin_reg.fit(housing.data, housing.target.reshape(-1, 1))
print(np.r_[lin_reg.intercept_.reshape(-1, 1), lin_reg.coef_.T])
The main advantage over computing this directly with NumPy is that if you have a GPU card, TensorFlow will automatically run the computation on the GPU (provided you installed a TensorFlow build with GPU support).
Implementing Gradient Descent
We will first compute the gradients manually, then use TensorFlow's autodiff feature to compute them automatically, and finally use a couple of TensorFlow's built-in optimizers.
When using Gradient Descent, remember to normalize the input feature vectors first, or training may be much slower; you can do this with TensorFlow, NumPy, or Scikit-Learn's StandardScaler.
Normalize the data:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaled_housing_data = scaler.fit_transform(housing.data)
scaled_housing_data_plus_bias = np.c_[np.ones((m, 1)), scaled_housing_data]
Manually Computing the Gradients
reset_graph()
n_epochs = 1000
learning_rate = 0.01
X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")
# tf.random_uniform() creates a node in the graph that generates a tensor filled with random values, given its shape and value range, much like NumPy's rand()
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0, seed=42), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
gradients = 2/m * tf.matmul(tf.transpose(X), error)
# assign() creates a node that assigns a new value to a variable; here it implements the Batch Gradient Descent step: theta_next_step = theta - eta * gradient_of_MSE(theta)
training_op = tf.assign(theta, theta - learning_rate * gradients)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):  # repeatedly run the training step, printing the MSE every 100 iterations
        if epoch % 100 == 0:
            print("Epoch", epoch, "MSE =", mse.eval())
        sess.run(training_op)
    best_theta = theta.eval()
Using autodiff
For deep neural networks, deriving the gradients of the cost function (here the MSE) by hand is tedious and error-prone. Symbolic differentiation could derive the partial-derivative equations automatically, but the resulting code is not necessarily efficient.
TensorFlow's autodiff feature can compute the gradients automatically and efficiently:
reset_graph()
n_epochs = 1000
learning_rate = 0.01
X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0, seed=42), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
# tf.gradients() takes an op (mse) and a list of variables (here just theta) and creates a list of ops, one per variable, that compute the gradient of the op with respect to that variable; the gradients node below computes the gradient vector of the MSE with respect to theta
gradients = tf.gradients(mse, [theta])[0]  # replaces: gradients = 2/m * tf.matmul(tf.transpose(X), error)
training_op = tf.assign(theta, theta - learning_rate * gradients)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        if epoch % 100 == 0:
            print("Epoch", epoch, "MSE =", mse.eval())
        sess.run(training_op)
    best_theta = theta.eval()
print("Best theta:")
print(best_theta)
TensorFlow uses reverse-mode autodiff, which is well suited to cases with many inputs and few outputs: it needs only n_outputs + 1 graph traversals to compute the partial derivatives of all the outputs with respect to all the inputs.
The main approaches to computing gradients automatically are numerical differentiation, symbolic differentiation, forward-mode autodiff, and reverse-mode autodiff (the one TensorFlow implements).
Using an Optimizer
TensorFlow also provides a number of built-in optimizers, including a Gradient Descent optimizer:
reset_graph()
n_epochs = 1000
learning_rate = 0.01
X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0, seed=42), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
# only the following two lines change
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        if epoch % 100 == 0:
            print("Epoch", epoch, "MSE =", mse.eval())
        sess.run(training_op)
    best_theta = theta.eval()
print("Best theta:")
print(best_theta)
To use a momentum optimizer (which often converges much faster than GradientDescentOptimizer), simply swap out the optimizer definition, as shown below.
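Concretely, only these two lines change (the training_op line is unchanged from the previous version):
optimizer = tf.train.MomentumOptimizer(learning_rate=learning_rate, momentum=0.9)
training_op = optimizer.minimize(mse)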
Feeding Data to the Training Algorithm
Now let's modify the code above to implement Mini-batch Gradient Descent, which requires a way to replace X and y with the next mini-batch at every iteration. The simplest way is to use placeholder nodes. These nodes are special: they don't perform any computation, they just output the data you tell them to output at run time. They are typically used to pass training data to TensorFlow during training; if you don't feed a value for a placeholder at run time, you get an exception.
To create a placeholder node, call placeholder() and specify the output tensor's data type and, optionally, its shape (None for a dimension means "any size"). The following code creates a placeholder node A and a node B = A + 5; when evaluating B, we pass a feed_dict to eval() that specifies the value of A (A must have 3 columns but can have any number of rows):
reset_graph()
A = tf.placeholder(tf.float32, shape=(None, 3))
B = A + 5
with tf.Session() as sess:
    B_val_1 = B.eval(feed_dict={A: [[1, 2, 3]]})
    B_val_2 = B.eval(feed_dict={A: [[4, 5, 6], [7, 8, 9]]})
print(B_val_1)  # [[6. 7. 8.]]
print(B_val_2)  # [[ 9. 10. 11.] [12. 13. 14.]]
In fact you can feed the output of any operation, not just placeholders; in that case TensorFlow does not evaluate those operations, it simply uses the values you feed it.
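A small sketch of this, using throwaway constants (the names c, d, e are illustrative, not from the original code):
c = tf.constant(3.0)
d = c + 2.0
e = d * 10.0
with tf.Session() as sess:
    print(e.eval())                      # 50.0: d is computed from c
    print(e.eval(feed_dict={d: 100.0}))  # 1000.0: d is not evaluated, the fed value is used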
To implement Mini-batch Gradient Descent, we define X and y as placeholder nodes in the construction phase:
n_epochs = 1000
learning_rate = 0.01
reset_graph()
X = tf.placeholder(tf.float32, shape=(None, n + 1), name="X")
y = tf.placeholder(tf.float32, shape=(None, 1), name="y")
Define theta, the MSE, the optimizer, and the init node as before:
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0, seed=42), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)
init = tf.global_variables_initializer()
Define the number of epochs: n_epochs = 10
Define the batch size and the number of batches: batch_size = 100; n_batches = int(np.ceil(m / batch_size))
Define a function that fetches the next mini-batch:
def fetch_batch(epoch, batch_index, batch_size):
    np.random.seed(epoch * n_batches + batch_index)  # not shown in the book
    indices = np.random.randint(m, size=batch_size)  # not shown
    X_batch = scaled_housing_data_plus_bias[indices]  # not shown
    y_batch = housing.target.reshape(-1, 1)[indices]  # not shown
    return X_batch, y_batch
Execution phase:
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
    best_theta = theta.eval()
Note that we don't need to feed X or y when evaluating theta at the end, since theta does not depend on them.
Saving and Restoring Models
Once you have trained your model, you should save its parameters to disk so you can come back to them whenever you want or use them elsewhere. You may also want to save checkpoints at regular intervals during training (think of them as save points in a game), so that if the machine crashes you can resume from the last checkpoint instead of starting over.
In TensorFlow, create a Saver node at the end of the construction phase (after all variable nodes have been created); then, in the execution phase, call its save() method, passing it the session and the checkpoint file path.
reset_graph()
n_epochs = 1000 # not shown in the book
learning_rate = 0.01 # not shown
X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name="X") # not shown
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y") # not shown
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0, seed=42), name="theta")
y_pred = tf.matmul(X, theta, name="predictions") # not shown
error = y_pred - y # not shown
mse = tf.reduce_mean(tf.square(error), name="mse") # not shown
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate) # not shown
training_op = optimizer.minimize(mse) # not shown
init = tf.global_variables_initializer()
saver = tf.train.Saver()  # create the Saver node after all the variable nodes
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        if epoch % 100 == 0:
            print("Epoch", epoch, "MSE =", mse.eval())  # not shown
            save_path = saver.save(sess, "/tmp/my_model.ckpt")  # make sure the target directory (/tmp here) exists first
        sess.run(training_op)
    best_theta = theta.eval()
    save_path = saver.save(sess, "/tmp/my_model_final.ckpt")
To restore a model: create a Saver at the end of the construction phase as before, but at the beginning of the execution phase call the Saver's restore() method instead of initializing the variables:
with tf.Session() as sess:
    saver.restore(sess, "/tmp/my_model_final.ckpt")
    best_theta_restored = theta.eval()  # not shown in the book
By default a Saver saves and restores all variables under their own names, but if you need more control you can specify which variables to save or restore and what names to use, e.g. saver = tf.train.Saver({"weights": theta}).
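A minimal sketch of this, assuming theta already exists in the current graph and the path below (a hypothetical example) is writable:
saver = tf.train.Saver({"weights": theta})  # save/restore only theta, under the name "weights"
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.save(sess, "/tmp/my_weights.ckpt")
with tf.Session() as sess:
    saver.restore(sess, "/tmp/my_weights.ckpt")
    print(theta.eval())  # same values as when it was saved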
The save() method also saves the structure of the graph itself in a second file with a .meta extension. You can load that structure with tf.train.import_meta_graph(), which adds the saved graph to the default graph and returns a Saver that can then restore the graph's state:
reset_graph()
# notice that we start with an empty graph.
saver = tf.train.import_meta_graph("/tmp/my_model_final.ckpt.meta") # this loads the graph structure
theta = tf.get_default_graph().get_tensor_by_name("theta:0") # not shown in the book
with tf.Session() as sess:
    saver.restore(sess, "/tmp/my_model_final.ckpt")  # this restores the graph's state
    best_theta_restored = theta.eval()  # not shown in the book
Visualizing the Graph and Training Curves Using TensorBoard
We now have a computation graph that trains a Linear Regression model using Mini-batch Gradient Descent, and we save checkpoints periodically. TensorBoard (TB below) can display training statistics (such as learning curves) interactively in the browser; it can also show you the graph itself, which is very useful for spotting errors and bottlenecks.
The first step is to tweak the program so it writes the graph definition and some training statistics (such as the MSE) to a log directory that TensorBoard will read from. You need a different log directory every time you run the program, otherwise TensorBoard will merge statistics from different runs and the visualizations will become a mess; the usual solution is to include a timestamp in the log directory name, as sketched below.
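A small sketch of building a per-run log directory from the timestamp (the root_logdir name is an assumption; the code further below uses a fixed path instead):
from datetime import datetime

now = datetime.utcnow().strftime("%Y%m%d%H%M%S")
root_logdir = "tf_logs"                          # assumed root log directory
logdir = "{}/run-{}/".format(root_logdir, now)   # one fresh directory per run
# later: file_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())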
First, let's view the graph definition with tfgraphviz:
#Visualizing the graph
import os
os.environ["PATH"] += os.pathsep + "D:\Graphviz\bin"
import tensorflow as tf
import tfgraphviz as tfg
g = tfg.board(tf.get_default_graph())
g.view()
Then the TensorBoard version:
#Using TensorBoard
reset_graph()
from datetime import datetime
now = datetime.utcnow().strftime("%Y%m%d%H%M%S")  # timestamp
# the usual construction phase follows
n_epochs = 1000
learning_rate = 0.01
X = tf.placeholder(tf.float32, shape=(None, n + 1), name="X")
y = tf.placeholder(tf.float32, shape=(None, 1), name="y")
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0, seed=42), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)
init = tf.global_variables_initializer()
mse_summary = tf.summary.scalar('MSE', mse)
file_writer = tf.summary.FileWriter('E:\Annconda\log', tf.get_default_graph())  # specify the log directory
# mini-batch parameters
n_epochs = 10
batch_size = 100
n_batches = int(np.ceil(m / batch_size))
with tf.Session() as sess:  # not shown in the book
    sess.run(init)  # not shown
    for epoch in range(n_epochs):  # not shown
        for batch_index in range(n_batches):
            X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
            if batch_index % 10 == 0:
                summary_str = mse_summary.eval(feed_dict={X: X_batch, y: y_batch})
                step = epoch * n_batches + batch_index
                file_writer.add_summary(summary_str, step)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
    best_theta = theta.eval()
file_writer.close()  # close the file writer
(Figure not captured here; see the book for now.)
Usage details: https://blog.csdn.net/qq_40594395/article/details/111312539
Name Scopes
When dealing with more complex models such as neural networks, the graph can easily become cluttered and huge. To keep it manageable you can use name scopes to group related nodes. Let's modify the previous code to define the error and mse ops inside a name scope called "loss":
reset_graph()
n_epochs = 1000
learning_rate = 0.01
X = tf.placeholder(tf.float32, shape=(None, n + 1), name="X")
y = tf.placeholder(tf.float32, shape=(None, 1), name="y")
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0, seed=42), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
with tf.name_scope("loss") as scope:
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)
init = tf.global_variables_initializer()
mse_summary = tf.summary.scalar('MSE', mse)
file_writer = tf.summary.FileWriter('E:\Annconda\log', tf.get_default_graph())
n_epochs = 10
batch_size = 100
n_batches = int(np.ceil(m / batch_size))
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
            if batch_index % 10 == 0:
                summary_str = mse_summary.eval(feed_dict={X: X_batch, y: y_batch})
                step = epoch * n_batches + batch_index
                file_writer.add_summary(summary_str, step)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
    best_theta = theta.eval()
file_writer.flush()
file_writer.close()
print("Best theta:")
print(best_theta)
Every op created inside the scope gets the "loss/" prefix; for example, the subtraction op becomes loss/sub.
In TensorBoard, the mse and error nodes now appear inside the loss namespace, which is collapsed by default (screenshot missing again; see the figure in the book).
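A quick way to check the prefixed names from Python (the printed values assume the name-scoped graph above):
print(error.op.name)  # loss/sub
print(mse.op.name)    # loss/mse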
Modularity
Suppose you want to create a graph that adds the output of two rectified linear units (ReLUs). A ReLU computes a linear function of its inputs and outputs the result if it is positive, and 0 otherwise: $h_{\mathbf{w},b}(\mathbf{X}) = \max(\mathbf{X} \cdot \mathbf{w} + b, 0)$.
Rather than copy-pasting that construction code, TensorFlow lets you define a function to build a ReLU; the following code creates five ReLUs and outputs their sum:
#Modularity
reset_graph()
def relu(X):
    w_shape = (int(X.get_shape()[1]), 1)
    w = tf.Variable(tf.random_normal(w_shape), name="weights")
    b = tf.Variable(0.0, name="bias")
    z = tf.add(tf.matmul(X, w), b, name="z")
    return tf.maximum(z, 0., name="relu")
n_features = 3
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = [relu(X) for i in range(5)]
output = tf.add_n(relus, name="output")#add_n()创建了一个计算一个tensor列表的和的操作
To view the graph in TensorBoard: file_writer = tf.summary.FileWriter("E:/Annconda/log/", tf.get_default_graph())
Launch it from the Anaconda prompt: tensorboard --logdir=<path>  # (mine is E:/Annconda/log/)
When you create a node, TensorFlow checks whether its name already exists; if it does, it appends an underscore followed by an index to make the name unique, as the small sketch below shows.
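A small sketch of this name de-duplication (throwaway constants in a fresh graph, not part of the ReLU example):
reset_graph()
a = tf.constant(1, name="a")
b = tf.constant(2, name="a")   # same requested name
print(a.op.name)  # a
print(b.op.name)  # a_1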
Using name scopes, we can make the graph much clearer by placing the content of each relu() call inside its own scope:
reset_graph()
def relu(X):
    with tf.name_scope("relu"):
        w_shape = (int(X.get_shape()[1]), 1)  # not shown in the book
        w = tf.Variable(tf.random_normal(w_shape), name="weights")  # not shown
        b = tf.Variable(0.0, name="bias")  # not shown
        z = tf.add(tf.matmul(X, w), b, name="z")  # not shown
        return tf.maximum(z, 0., name="max")  # not shown
n_features = 3
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = [relu(X) for i in range(5)]
output = tf.add_n(relus, name="output")
file_writer = tf.summary.FileWriter("E:\Annconda\log", tf.get_default_graph())
file_writer.close()
Then run tensorboard --logdir=<path> again to refresh TensorBoard.
Sharing Variables
If you want to share a variable between different components of your graph, the simplest option is to create it first and then pass it as a parameter to the functions that need it. For example, to control the threshold of all ReLUs with a single shared threshold variable, create the variable first and then pass it to relu():
reset_graph()
def relu(X, threshold):
    with tf.name_scope("relu"):
        w_shape = (int(X.get_shape()[1]), 1)  # not shown in the book
        w = tf.Variable(tf.random_normal(w_shape), name="weights")  # not shown
        b = tf.Variable(0.0, name="bias")  # not shown
        z = tf.add(tf.matmul(X, w), b, name="z")  # not shown
        return tf.maximum(z, threshold, name="max")
threshold = tf.Variable(0.0, name="threshold")#设为标量
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = [relu(X, threshold) for i in range(5)]  # pass the shared threshold to each relu()
output = tf.add_n(relus, name="output")
This works fine, but when there are many shared parameters, passing them around everywhere becomes painful. A few alternatives:
- Create a Python dictionary containing all the variables and pass it to every function.
- Create a class for each module (e.g. a ReLU class that holds the shared parameters as class variables).
- Set the shared variable as an attribute of the relu() function on the first call.
Using option 3:
# create the threshold as an attribute of relu() on the first call
reset_graph()
def relu(X):
    with tf.name_scope("relu"):
        if not hasattr(relu, "threshold"):  # hasattr() checks whether relu already has a threshold attribute; create it on the first call only
            relu.threshold = tf.Variable(0.0, name="threshold")
        w_shape = int(X.get_shape()[1]), 1  # not shown in the book
        w = tf.Variable(tf.random_normal(w_shape), name="weights")  # not shown
        b = tf.Variable(0.0, name="bias")  # not shown
        z = tf.add(tf.matmul(X, w), b, name="z")  # not shown
        return tf.maximum(z, relu.threshold, name="max")
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = [relu(X) for i in range(5)]
output = tf.add_n(relus, name="output")
TensorFlow offers another option (option 4) that leads to cleaner and more modular code, and it is the one used most often: use get_variable() to create the shared variable if it does not exist yet, or reuse it if it does. The desired behavior (creating or reusing) is controlled by an attribute of the current variable_scope(). The following code creates a variable named "relu/threshold" (shape=() makes it a scalar, initialized to 0.0):
reset_graph()
with tf.variable_scope("relu"):
threshold = tf.get_variable("threshold", shape=(),
initializer=tf.constant_initializer(0.0))
If the variable has already been created by an earlier call to get_variable(), this code raises an exception, which prevents reusing a variable by mistake. To reuse it, you must explicitly set the scope's reuse attribute to True:
with tf.variable_scope("relu", reuse=True):
threshold = tf.get_variable("threshold")
This raises an exception if the variable does not already exist (or was not created with get_variable()). Alternatively, you can set reuse inside the block:
with tf.variable_scope("relu") as scope:
scope.reuse_variables()
threshold = tf.get_variable("threshold")
Once reuse is set to True it cannot be set back to False within the block, and variable scopes created inside it inherit reuse=True automatically. Also, only variables created by get_variable() can be reused this way.
The following code creates five ReLUs that all reuse the shared threshold variable:
reset_graph()
def relu(X):
    with tf.variable_scope("relu", reuse=True):
        threshold = tf.get_variable("threshold")
        w_shape = int(X.get_shape()[1]), 1  # not shown
        w = tf.Variable(tf.random_normal(w_shape), name="weights")  # not shown
        b = tf.Variable(0.0, name="bias")  # not shown
        z = tf.add(tf.matmul(X, w), b, name="z")  # not shown
        return tf.maximum(z, threshold, name="max")
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
with tf.variable_scope("relu"):
threshold = tf.get_variable("threshold", shape=(),
initializer=tf.constant_initializer(0.0))
relus = [relu(X) for relu_index in range(5)]
output = tf.add_n(relus, name="output")
file_writer = tf.summary.FileWriter("E:\Annconda\log", tf.get_default_graph())
file_writer.close()
# Variant: define threshold inside relu() instead of outside it. The first call creates the variable and later calls reuse it; the caller controls this through variable_scope's reuse argument, so relu() itself does not need to care whether it is creating or reusing.
reset_graph()
def relu(X):
    threshold = tf.get_variable("threshold", shape=(),
                                initializer=tf.constant_initializer(0.0))
    w_shape = (int(X.get_shape()[1]), 1)  # not shown in the book
    w = tf.Variable(tf.random_normal(w_shape), name="weights")  # not shown
    b = tf.Variable(0.0, name="bias")  # not shown
    z = tf.add(tf.matmul(X, w), b, name="z")  # not shown
    return tf.maximum(z, threshold, name="max")
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = []
for relu_index in range(5):
    with tf.variable_scope("relu", reuse=(relu_index >= 1)) as scope:
        relus.append(relu(X))
output = tf.add_n(relus, name="output")
file_writer = tf.summary.FileWriter("E:\Annconda\log", tf.get_default_graph())
file_writer.close()
Later chapters will cover deep neural networks, convolutional neural networks, and recurrent neural networks, as well as how to scale TensorFlow up with distributed computing.