First, an introduction to TensorFlow: it is a powerful open source software library for numerical computation, well suited to large-scale Machine Learning.
Basic principle: you define a graph of computations in Python, and TensorFlow takes that graph and runs it efficiently using optimized C++ code.
TensorFlow's other advantages are not covered here.
Installation
Installation notes specific to the book Hands-On Machine Learning: https://blog.csdn.net/qq_40594395/article/details/111161561
Creating Your First Graph and Running It in a Session
Build a computation graph:
import tensorflow as tf
reset_graph()
x = tf.Variable(3, name="x")
y = tf.Variable(4, name="y")
f = x*x*y + y + 2
**Method 1:** The code above does not actually perform any computation; it only builds the graph, and the variables are not even initialized yet. To evaluate the graph you need to open a TensorFlow session, initialize all the variables, evaluate f, and close the session (to free resources):
sess = tf.Session()
sess.run(x.initializer)
sess.run(y.initializer)
result = sess.run(f)
print(result)
sess.close()
**Method 2:** Repeating sess.run() every time is clumsy and verbose; an improvement:
with tf.Session() as sess:
    x.initializer.run()
    y.initializer.run()
    result = f.eval()
In this code, the with block installs the session as the default session: x.initializer.run() is equivalent to tf.get_default_session().run(x.initializer), and f.eval() is equivalent to tf.get_default_session().run(f).
This style is more readable, and the session is automatically closed at the end of the block.
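For illustration, the same block written with explicit calls to the default session (a minimal sketch, equivalent to Method 2 above):
with tf.Session() as sess:
    tf.get_default_session().run(x.initializer)  # same as x.initializer.run()
    tf.get_default_session().run(y.initializer)  # same as y.initializer.run()
    result = tf.get_default_session().run(f)     # same as f.eval()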
**Method 3:** Instead of manually running the initializer for every single variable, you can use global_variables_initializer(). Note that it does not perform the initialization immediately; it creates a node in the graph that will initialize all the variables when it is run in a session:
init = tf.global_variables_initializer()
with tf.Session() as sess:
    init.run()
    result = f.eval()
**Method 4:** Yet another option is to create an InteractiveSession. The only difference from a regular Session is that it automatically sets itself as the default session, so you don't need a with block (but you do need to close the session manually):
sess = tf.InteractiveSession()
init.run()
result = f.eval()
print(result)
sess.close()
A TensorFlow program is typically split into two parts:
- Building the computation graph (construction phase): represents the ML model and the computations required to train it.
- Running the graph (execution phase): repeatedly evaluates a training step, gradually improving the model parameters.
Let's look at an example:
Managing Graphs
Any node you create is automatically added to the default graph:
reset_graph()
x1 = tf.Variable(1)
x1.graph is tf.get_default_graph()#True
If you want to manage multiple independent graphs, you can create a new Graph and temporarily make it the default graph inside a with block:
graph = tf.Graph()
with graph.as_default():
    x2 = tf.Variable(2)
Inside the block, x2 belongs to the new graph: x2.graph is graph evaluates to True, and it is not on the default graph: x2.graph is tf.get_default_graph() evaluates to False.
In Jupyter or a Python shell you often run the same commands more than once, which ends up adding many duplicate nodes to the same graph. There are two ways around this:
- restart the kernel/shell
- reset the default graph with tf.reset_default_graph() (recommended)
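The reset_graph() helper used throughout these notes is not part of TensorFlow; a minimal sketch of it, assuming the book's convention of also fixing the random seeds for reproducibility:
import numpy as np
import tensorflow as tf

def reset_graph(seed=42):
    # drop the current default graph and fix the seeds so runs are reproducible
    tf.reset_default_graph()
    tf.set_random_seed(seed)
    np.random.seed(seed)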
Lifecycle of a Node Value
When you evaluate a node, TensorFlow automatically determines which nodes it depends on and evaluates those first. For example:
w = tf.constant(3)
x = w + 2
y = x + 5
z = x * 3
with tf.Session() as sess:
    print(y.eval())  # 10
    print(z.eval())  # 15
Once the session starts, evaluating y requires x, which requires w, so TensorFlow evaluates w, then x, then y. Evaluating z repeats the same chain, so in this session w and x are evaluated twice.
All node values are dropped between graph runs, except variable values, which are maintained by the session across runs; a variable's lifetime starts when its initializer is run and ends when the session is closed.
If you want to evaluate y and z without repeating that work, you must tell TensorFlow to evaluate both of them in a single graph run:
with tf.Session() as sess:
    y_val, z_val = sess.run([y, z])
    print(y_val)  # 10
    print(z_val)  # 15
In single-process TensorFlow, multiple sessions do not share any state, even if they reuse the same graph (each session has its own copy of every variable); in distributed TensorFlow, variable state is stored on servers, so multiple sessions can share the same variables.
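A minimal sketch (using a simple counter variable, not from the book's code) illustrating that the session keeps a variable's value across graph runs:
v = tf.Variable(0, name="v")
inc_v = tf.assign(v, v + 1)   # node that adds 1 to v each time it is run
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    print(sess.run(inc_v))  # 1
    print(sess.run(inc_v))  # 2: the session kept v's value between the two runs
# a second session would have its own state and would need to run init again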
Linear Regression with TensorFlow
TensorFlow operations (ops) can take any number of inputs and produce any number of outputs. The inputs and outputs are multidimensional arrays called tensors; a tensor has a shape and a data type. In the Python API tensors are represented by NumPy ndarrays, and they typically contain floats, although they can also carry strings.
Tensors can hold arrays of any shape. The following code shows how to manipulate 2D arrays to perform Linear Regression on the California housing dataset.
import numpy as np
from sklearn.datasets import fetch_california_housing
reset_graph()
housing = fetch_california_housing()  # fetch the dataset
m, n = housing.data.shape
housing_data_plus_bias = np.c_[np.ones((m, 1)), housing.data]  # add the bias input feature (x0 = 1); this is a NumPy operation, so it runs immediately
# create two TF constant nodes, X and the target y (housing.target is a 1D array; reshape accepts -1, meaning "unspecified", and infers that dimension from the array length and the remaining dimensions)
X = tf.constant(housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")
XT = tf.transpose(X)
theta = tf.matmul(tf.matmul(tf.matrix_inverse(tf.matmul(XT, X)), XT), y)  # theta via the Normal Equation; this only defines nodes in the graph, nothing is computed until the graph is run
with tf.Session() as sess:  # evaluate theta
    theta_value = theta.eval()
The pure NumPy version:
X = housing_data_plus_bias
y = housing.target.reshape(-1, 1)
theta_numpy = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)
print(theta_numpy)
The Scikit-Learn version:
from sklearn.linear_model import LinearRegression
lin_reg = LinearRegression()
lin_reg.fit(housing.data, housing.target.reshape(-1, 1))
print(np.r_[lin_reg.intercept_.reshape(-1, 1), lin_reg.coef_.T])
The main advantage over computing this directly with NumPy is that if you have a GPU card, TensorFlow will automatically run the computation on the GPU (provided you installed a TensorFlow build with GPU support).
Implementing Gradient Descent
We will first compute the gradients manually, then use TensorFlow's autodiff feature to compute them automatically, and finally use a couple of TensorFlow's built-in optimizers.
When using Gradient Descent, remember to normalize the input feature vectors first, or training may be much slower; you can do this with TensorFlow, NumPy, or Scikit-Learn's StandardScaler.
Normalize the data:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaled_housing_data = scaler.fit_transform(housing.data)
scaled_housing_data_plus_bias = np.c_[np.ones((m, 1)), scaled_housing_data]
Manually Computing the Gradients
reset_graph()
n_epochs = 1000
learning_rate = 0.01
X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")
# tf.random_uniform() creates a node in the graph that generates a tensor filled with random values, given its shape and value range, much like NumPy's rand()
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0, seed=42), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
gradients = 2/m * tf.matmul(tf.transpose(X), error)
# assign() creates a node that assigns a new value to a variable; here it implements the Batch Gradient Descent step: theta_next_step = theta - eta * gradient_of_MSE(theta)
training_op = tf.assign(theta, theta - learning_rate * gradients)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):  # repeatedly run the training step, printing the MSE every 100 iterations
        if epoch % 100 == 0:
            print("Epoch", epoch, "MSE =", mse.eval())
        sess.run(training_op)
    best_theta = theta.eval()
Using autodiff
For deep neural networks, deriving the gradients of the cost function (here the MSE) by hand is tedious and error-prone. Symbolic differentiation could derive the partial-derivative equations automatically, but the resulting code is not necessarily efficient.
TensorFlow's autodiff feature can compute the gradients automatically and efficiently:
reset_graph()
n_epochs = 1000
learning_rate = 0.01
X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0, seed=42), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
# tf.gradients() takes an op (mse) and a list of variables (here just theta) and creates a list of ops, one per variable, that compute the gradient of the op with respect to that variable; the gradients node below computes the gradient vector of the MSE with respect to theta
gradients = tf.gradients(mse, [theta])[0]  # replaces: gradients = 2/m * tf.matmul(tf.transpose(X), error)
training_op = tf.assign(theta, theta - learning_rate * gradients)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        if epoch % 100 == 0:
            print("Epoch", epoch, "MSE =", mse.eval())
        sess.run(training_op)
    best_theta = theta.eval()
print("Best theta:")
print(best_theta)
TensorFlow uses reverse-mode autodiff, which is well suited to cases with many inputs and few outputs: it needs only n_outputs + 1 graph traversals to compute the partial derivatives of all the outputs with respect to all the inputs.
The main approaches to computing gradients automatically are numerical differentiation, symbolic differentiation, forward-mode autodiff, and reverse-mode autodiff (the one TensorFlow implements).
Using an Optimizer
TensorFlow also provides a number of built-in optimizers, including a Gradient Descent optimizer:
reset_graph()
n_epochs = 1000
learning_rate = 0.01
X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0, seed=42), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
# only the following two lines change
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        if epoch % 100 == 0:
            print("Epoch", epoch, "MSE =", mse.eval())
        sess.run(training_op)
    best_theta = theta.eval()
print("Best theta:")
print(best_theta)
To use a momentum optimizer (which often converges much faster than GradientDescentOptimizer), simply swap out the optimizer definition, as shown below.
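Concretely, only these two lines change (the training_op line is unchanged from the previous version):
optimizer = tf.train.MomentumOptimizer(learning_rate=learning_rate, momentum=0.9)
training_op = optimizer.minimize(mse)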
Feeding Data to the Training Algorithm
Now let's modify the code above to implement Mini-batch Gradient Descent, which requires a way to replace X and y with the next mini-batch at every iteration. The simplest way is to use placeholder nodes. These nodes are special: they don't perform any computation, they just output the data you tell them to output at run time. They are typically used to pass training data to TensorFlow during training; if you don't feed a value for a placeholder at run time, you get an exception.
To create a placeholder node, call placeholder() and specify the output tensor's data type and, optionally, its shape (None for a dimension means "any size"). The following code creates a placeholder node A and a node B = A + 5; when evaluating B, we pass a feed_dict to eval() that specifies the value of A (A must have 3 columns but can have any number of rows):
reset_graph()
A = tf.placeholder(tf.float32, shape=(None, 3))
B = A + 5
with tf.Session() as sess:
    B_val_1 = B.eval(feed_dict={A: [[1, 2, 3]]})
    B_val_2 = B.eval(feed_dict={A: [[4, 5, 6], [7, 8, 9]]})
print(B_val_1)  # [[6. 7. 8.]]
print(B_val_2)  # [[ 9. 10. 11.] [12. 13. 14.]]
In fact you can feed the output of any operation, not just placeholders; in that case TensorFlow does not evaluate those operations, it simply uses the values you feed it.
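A small sketch of this, using throwaway constants (the names c, d, e are illustrative, not from the original code):
c = tf.constant(3.0)
d = c + 2.0
e = d * 10.0
with tf.Session() as sess:
    print(e.eval())                      # 50.0: d is computed from c
    print(e.eval(feed_dict={d: 100.0}))  # 1000.0: d is not evaluated, the fed value is used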
To implement Mini-batch Gradient Descent, we define X and y as placeholder nodes in the construction phase:
n_epochs = 1000
learning_rate = 0.01
reset_graph()
X = tf.placeholder(tf.float32, shape=(None, n + 1), name="X")
y = tf.placeholder(tf.float32, shape=(None, 1), name="y")
Define theta, the MSE, the optimizer, and the init node as before:
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0, seed=42), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)
init = tf.global_variables_initializer()
Define the number of epochs: n_epochs = 10
Define the batch size and the number of batches: batch_size = 100; n_batches = int(np.ceil(m / batch_size))
Define a function that fetches the next mini-batch:
def fetch_batch(epoch, batch_index, batch_size):
    np.random.seed(epoch * n_batches + batch_index)  # not shown in the book
    indices = np.random.randint(m, size=batch_size)  # not shown
    X_batch = scaled_housing_data_plus_bias[indices]  # not shown
    y_batch = housing.target.reshape(-1, 1)[indices]  # not shown
    return X_batch, y_batch
Execution phase:
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
    best_theta = theta.eval()
Note that we don't need to feed X or y when evaluating theta at the end, since theta does not depend on them.
Saving and Restoring Models
Once you have trained your model, you should save its parameters to disk so you can come back to them whenever you want or use them elsewhere. You may also want to save checkpoints at regular intervals during training (think of them as save points in a game), so that if the machine crashes you can resume from the last checkpoint instead of starting over.
In TensorFlow, create a Saver node at the end of the construction phase (after all variable nodes have been created); then, in the execution phase, call its save() method, passing it the session and the checkpoint file path.
reset_graph()
n_epochs = 1000 # not shown in the book
learning_rate = 0.01 # not shown
X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name="X") # not shown
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y") # not shown
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0, seed=42), name="theta")
y_pred = tf.matmul(X, theta, name="predictions") # not shown
error = y_pred - y # not shown
mse = tf.reduce_mean(tf.square(error), name="mse") # not shown
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate) # not shown
training_op = optimizer.minimize(mse) # not shown
init = tf.global_variables_initializer()
saver = tf.train.Saver()  # create the Saver node after all the variable nodes
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        if epoch % 100 == 0:
            print("Epoch", epoch, "MSE =", mse.eval())  # not shown
            save_path = saver.save(sess, "/tmp/my_model.ckpt")  # make sure the target directory (/tmp here) exists first
        sess.run(training_op)
    best_theta = theta.eval()
    save_path = saver.save(sess, "/tmp/my_model_final.ckpt")
To restore a model: create a Saver at the end of the construction phase as before, but at the beginning of the execution phase call the Saver's restore() method instead of initializing the variables:
with tf.Session() as sess:
    saver.restore(sess, "/tmp/my_model_final.ckpt")
    best_theta_restored = theta.eval()  # not shown in the book
By default a Saver saves and restores all variables under their own names, but if you need more control you can specify which variables to save or restore and what names to use, e.g. saver = tf.train.Saver({"weights": theta}).
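A minimal sketch of this, assuming theta already exists in the current graph and the path below (a hypothetical example) is writable:
saver = tf.train.Saver({"weights": theta})  # save/restore only theta, under the name "weights"
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.save(sess, "/tmp/my_weights.ckpt")
with tf.Session() as sess:
    saver.restore(sess, "/tmp/my_weights.ckpt")
    print(theta.eval())  # same values as when it was saved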
The save() method also saves the structure of the graph itself in a second file with a .meta extension. You can load that structure with tf.train.import_meta_graph(), which adds the saved graph to the default graph and returns a Saver that can then restore the graph's state:
reset_graph()
# notice that we start with an empty graph.
saver = tf.train.import_meta_graph("/tmp/my_model_final.ckpt.meta") # this loads the graph structure
theta = tf.get_default_graph().get_tensor_by_name("theta:0") # not shown in the book
with tf.Session() as sess:
    saver.restore(sess, "/tmp/my_model_final.ckpt")  # this restores the graph's state
    best_theta_restored = theta.eval()  # not shown in the book
Visualizing the Graph and Training Curves Using TensorBoard
We now have a computation graph that trains a Linear Regression model using Mini-batch Gradient Descent, and we save checkpoints periodically. TensorBoard (TB below) can display training statistics (such as learning curves) interactively in the browser; it can also show you the graph itself, which is very useful for spotting errors and bottlenecks.
The first step is to tweak the program so it writes the graph definition and some training statistics (such as the MSE) to a log directory that TensorBoard will read from. You need a different log directory every time you run the program, otherwise TensorBoard will merge statistics from different runs and the visualizations will become a mess; the usual solution is to include a timestamp in the log directory name, as sketched below.
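A small sketch of building a per-run log directory from the timestamp (the root_logdir name is an assumption; the code further below uses a fixed path instead):
from datetime import datetime

now = datetime.utcnow().strftime("%Y%m%d%H%M%S")
root_logdir = "tf_logs"                          # assumed root log directory
logdir = "{}/run-{}/".format(root_logdir, now)   # one fresh directory per run
# later: file_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())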
First, let's view the graph definition with tfgraphviz:
#Visualizing the graph
import os
os.environ["PATH"] += os.pathsep + "D:\Graphviz\bin"
import tensorflow as tf
import tfgraphviz as tfg
g = tfg.board(tf.get_default_graph())
g.view()
Then the TensorBoard version:
#Using TensorBoard
reset_graph()
from datetime import datetime
now = datetime.utcnow().strftime("%Y%m%d%H%M%S")  # timestamp
# the usual construction phase follows
n_epochs = 1000
learning_rate = 0.01
X = tf.placeholder(tf.float32, shape=(None, n + 1), name="X")
y = tf.placeholder(tf.float32, shape=(None, 1), name="y")
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0, seed=42), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)
init = tf.global_variables_initializer()
mse_summary = tf.summary.scalar('MSE', mse)
file_writer = tf.summary.FileWriter('E:\Annconda\log', tf.get_default_graph())  # specify the log directory
# mini-batch parameters
n_epochs = 10
batch_size = 100
n_batches = int(np.ceil(m / batch_size))
with tf.Session() as sess:  # not shown in the book
    sess.run(init)  # not shown
    for epoch in range(n_epochs):  # not shown
        for batch_index in range(n_batches):
            X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
            if batch_index % 10 == 0:
                summary_str = mse_summary.eval(feed_dict={X: X_batch, y: y_batch})
                step = epoch * n_batches + batch_index
                file_writer.add_summary(summary_str, step)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
    best_theta = theta.eval()
file_writer.close()  # close the file writer
(Figure not captured here; see the book for now.)
Usage details: https://blog.csdn.net/qq_40594395/article/details/111312539
Name Scopes
When dealing with more complex models such as neural networks, the graph can easily become cluttered and huge. To keep it manageable you can use name scopes to group related nodes. Let's modify the previous code to define the error and mse ops inside a name scope called "loss":
reset_graph()
n_epochs = 1000
learning_rate = 0.01
X = tf.placeholder(tf.float32, shape=(None, n + 1), name="X")
y = tf.placeholder(tf.float32, shape=(None, 1), name="y")
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0, seed=42), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
with tf.name_scope("loss") as scope:
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)
init = tf.global_variables_initializer()
mse_summary = tf.summary.scalar('MSE', mse)
file_writer = tf.summary.FileWriter('E:\Annconda\log', tf.get_default_graph())
n_epochs = 10
batch_size = 100
n_batches = int(np.ceil(m / batch_size))
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
            if batch_index % 10 == 0:
                summary_str = mse_summary.eval(feed_dict={X: X_batch, y: y_batch})
                step = epoch * n_batches + batch_index
                file_writer.add_summary(summary_str, step)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
    best_theta = theta.eval()
file_writer.flush()
file_writer.close()
print("Best theta:")
print(best_theta)
Every op created inside the scope gets the "loss/" prefix; for example, the subtraction op becomes loss/sub.
In TensorBoard, the mse and error nodes now appear inside the loss namespace, which is collapsed by default (screenshot missing again; see the figure in the book).
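A quick way to check the prefixed names from Python (the printed values assume the name-scoped graph above):
print(error.op.name)  # loss/sub
print(mse.op.name)    # loss/mse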
Modularity
Suppose you want to create a graph that adds the output of two rectified linear units (ReLUs). A ReLU computes a linear function of its inputs and outputs the result if it is positive, and 0 otherwise: $h_{\mathbf{w},b}(\mathbf{X}) = \max(\mathbf{X} \cdot \mathbf{w} + b, 0)$.
Rather than copy-pasting that construction code, TensorFlow lets you define a function to build a ReLU; the following code creates five ReLUs and outputs their sum:
#Modularity
reset_graph()
def relu(X):
    w_shape = (int(X.get_shape()[1]), 1)
    w = tf.Variable(tf.random_normal(w_shape), name="weights")
    b = tf.Variable(0.0, name="bias")
    z = tf.add(tf.matmul(X, w), b, name="z")
    return tf.maximum(z, 0., name="relu")
n_features = 3
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = [relu(X) for i in range(5)]
output = tf.add_n(relus, name="output")#add_n()创建了一个计算一个tensor列表的和的操作
To view the graph in TensorBoard: file_writer = tf.summary.FileWriter("E:/Annconda/log/", tf.get_default_graph())
Launch it from the Anaconda prompt: tensorboard --logdir=<path>  # (mine is E:/Annconda/log/)
When you create a node, TensorFlow checks whether its name already exists; if it does, it appends an underscore followed by an index to make the name unique, as the small sketch below shows.
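A small sketch of this name de-duplication (throwaway constants in a fresh graph, not part of the ReLU example):
reset_graph()
a = tf.constant(1, name="a")
b = tf.constant(2, name="a")   # same requested name
print(a.op.name)  # a
print(b.op.name)  # a_1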
Using name scopes, we can make the graph much clearer by placing the content of each relu() call inside its own scope:
reset_graph()
def relu(X):
    with tf.name_scope("relu"):
        w_shape = (int(X.get_shape()[1]), 1)  # not shown in the book
        w = tf.Variable(tf.random_normal(w_shape), name="weights")  # not shown
        b = tf.Variable(0.0, name="bias")  # not shown
        z = tf.add(tf.matmul(X, w), b, name="z")  # not shown
        return tf.maximum(z, 0., name="max")  # not shown
n_features = 3
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = [relu(X) for i in range(5)]
output = tf.add_n(relus, name="output")
file_writer = tf.summary.FileWriter("E:\Annconda\log", tf.get_default_graph())
file_writer.close()
Then run tensorboard --logdir=<path> again to refresh TensorBoard.
Sharing Variables
If you want to share a variable between different components of your graph, the simplest option is to create it first and then pass it as a parameter to the functions that need it. For example, to control the threshold of all ReLUs with a single shared threshold variable, create the variable first and then pass it to relu():
reset_graph()
def relu(X, threshold):
    with tf.name_scope("relu"):
        w_shape = (int(X.get_shape()[1]), 1)  # not shown in the book
        w = tf.Variable(tf.random_normal(w_shape), name="weights")  # not shown
        b = tf.Variable(0.0, name="bias")  # not shown
        z = tf.add(tf.matmul(X, w), b, name="z")  # not shown
        return tf.maximum(z, threshold, name="max")
threshold = tf.Variable(0.0, name="threshold")#设为标量
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = [relu(X, threshold) for i in range(5)]  # pass the shared threshold to each relu()
output = tf.add_n(relus, name="output")
This works fine, but when there are many shared parameters, passing them around everywhere becomes painful. A few alternatives:
- Create a Python dictionary containing all the variables and pass it to every function.
- Create a class for each module (e.g. a ReLU class that holds the shared parameters as class variables).
- Set the shared variable as an attribute of the relu() function on the first call.
Using option 3:
# create the threshold as an attribute of relu() on the first call
reset_graph()
def relu(X):
    with tf.name_scope("relu"):
        if not hasattr(relu, "threshold"):  # hasattr() checks whether relu already has a threshold attribute; create it on the first call only
            relu.threshold = tf.Variable(0.0, name="threshold")
        w_shape = int(X.get_shape()[1]), 1  # not shown in the book
        w = tf.Variable(tf.random_normal(w_shape), name="weights")  # not shown
        b = tf.Variable(0.0, name="bias")  # not shown
        z = tf.add(tf.matmul(X, w), b, name="z")  # not shown
        return tf.maximum(z, relu.threshold, name="max")
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = [relu(X) for i in range(5)]
output = tf.add_n(relus, name="output")
TensorFlow offers another option (option 4) that leads to cleaner and more modular code, and it is the one used most often: use get_variable() to create the shared variable if it does not exist yet, or reuse it if it does. The desired behavior (creating or reusing) is controlled by an attribute of the current variable_scope(). The following code creates a variable named "relu/threshold" (shape=() makes it a scalar, initialized to 0.0):
reset_graph()
with tf.variable_scope("relu"):
threshold = tf.get_variable("threshold", shape=(),
initializer=tf.constant_initializer(0.0))
If the variable has already been created by an earlier call to get_variable(), this code raises an exception, which prevents reusing a variable by mistake. To reuse it, you must explicitly set the scope's reuse attribute to True:
with tf.variable_scope("relu", reuse=True):
threshold = tf.get_variable("threshold")
This raises an exception if the variable does not already exist (or was not created with get_variable()). Alternatively, you can set reuse inside the block:
with tf.variable_scope("relu") as scope:
scope.reuse_variables()
threshold = tf.get_variable("threshold")
Once reuse is set to True it cannot be set back to False within the block, and variable scopes created inside it inherit reuse=True automatically. Also, only variables created by get_variable() can be reused this way.
The following code creates five ReLUs that all reuse the shared threshold variable:
reset_graph()
def relu(X):
    with tf.variable_scope("relu", reuse=True):
        threshold = tf.get_variable("threshold")
        w_shape = int(X.get_shape()[1]), 1  # not shown
        w = tf.Variable(tf.random_normal(w_shape), name="weights")  # not shown
        b = tf.Variable(0.0, name="bias")  # not shown
        z = tf.add(tf.matmul(X, w), b, name="z")  # not shown
        return tf.maximum(z, threshold, name="max")
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
with tf.variable_scope("relu"):
threshold = tf.get_variable("threshold", shape=(),
initializer=tf.constant_initializer(0.0))
relus = [relu(X) for relu_index in range(5)]
output = tf.add_n(relus, name="output")
file_writer = tf.summary.FileWriter("E:\Annconda\log", tf.get_default_graph())
file_writer.close()
# Variant: define threshold inside relu() instead of outside it. The first call creates the variable and later calls reuse it; the caller controls this through variable_scope's reuse argument, so relu() itself does not need to care whether it is creating or reusing.
reset_graph()
def relu(X):
    threshold = tf.get_variable("threshold", shape=(),
                                initializer=tf.constant_initializer(0.0))
    w_shape = (int(X.get_shape()[1]), 1)  # not shown in the book
    w = tf.Variable(tf.random_normal(w_shape), name="weights")  # not shown
    b = tf.Variable(0.0, name="bias")  # not shown
    z = tf.add(tf.matmul(X, w), b, name="z")  # not shown
    return tf.maximum(z, threshold, name="max")
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = []
for relu_index in range(5):
    with tf.variable_scope("relu", reuse=(relu_index >= 1)) as scope:
        relus.append(relu(X))
output = tf.add_n(relus, name="output")
file_writer = tf.summary.FileWriter("E:\Annconda\log", tf.get_default_graph())
file_writer.close()
Later chapters will cover deep neural networks, convolutional neural networks, and recurrent neural networks, as well as how to scale TensorFlow up with distributed computing.