Introduction to modules, layers, and models

北京王老师

Tags: python, tensorflow, deep learning, java, artificial intelligence

Article link: https://blog.csdn.net/SE_JW/article/details/122110621

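To do machine learning in TensorFlow, you will likely need to define, save, and restore a model.

Abstractly, a model is:

A function that computes something on tensors (a forward pass)
Some variables that can be updated in response to training

In this guide, you will look beneath the surface of Keras to see how TensorFlow models are defined. It covers how TensorFlow collects variables and models, and how they are saved and restored.

Note: If you would rather get started with Keras right away, see the collection of Keras guides.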

Setup

import tensorflow as tf
from datetime import datetime
%load_ext tensorboard

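Defining models and layers in TensorFlow

Most models are made of layers. Layers are functions with a known mathematical structure that can be reused and that have trainable variables. In TensorFlow, most high-level implementations of layers and models, such as Keras or Sonnet, are built on the same foundational class: tf.Module.

Here is an example of a very simple tf.Module that operates on a scalar tensor: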

class SimpleModule(tf.Module):
  def __init__(self, name=None):
    super().__init__(name=name)
    self.a_variable = tf.Variable(5.0, name="train_me")
    self.non_trainable_variable = tf.Variable(5.0, trainable=False, name="do_not_train_me")
  def __call__(self, x):
    return self.a_variable * x + self.non_trainable_variable
simple_module = SimpleModule(name="simple")
simple_module(tf.constant(5.0))

<tf.Tensor: shape=(), dtype=float32, numpy=30.0>

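Modules and, by extension, layers are deep-learning terminology for "objects": they have internal state, and methods that use that state.

There is nothing special about __call__ except that it makes the object behave like a Python callable; you can invoke your models through whatever methods you wish.

You can turn the trainability of variables on and off for any reason, including freezing layers and variables during fine-tuning.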
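As a minimal sketch of what the trainable flag changes in practice, the snippet below reuses the SimpleModule class from above and asks tf.GradientTape for gradients with respect to both variables. It assumes TensorFlow 2.x with eager execution; the input value 3.0 is arbitrary.

```python
import tensorflow as tf


class SimpleModule(tf.Module):
    def __init__(self, name=None):
        super().__init__(name=name)
        self.a_variable = tf.Variable(5.0, name="train_me")
        self.non_trainable_variable = tf.Variable(
            5.0, trainable=False, name="do_not_train_me")

    def __call__(self, x):
        return self.a_variable * x + self.non_trainable_variable


module = SimpleModule(name="simple")
x = tf.constant(3.0)

# GradientTape watches trainable variables automatically. The
# non-trainable variable is not watched, so its gradient is None.
with tf.GradientTape() as tape:
    y = module(x)

grad_trainable, grad_frozen = tape.gradient(
    y, [module.a_variable, module.non_trainable_variable])

print("gradient for a_variable:", grad_trainable)
print("gradient for non_trainable_variable:", grad_frozen)
```

An optimizer that is only handed module.trainable_variables would therefore leave the frozen variable untouched.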

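Note: tf.Module is the base class for both tf.keras.layers.Layer and tf.keras.Model, so everything you see here also applies to Keras. For historical compatibility reasons, Keras layers do not collect variables from modules, so your models should use only modules or only Keras layers. However, the methods shown below for inspecting variables are the same in both cases.

When you subclass tf.Module, any tf.Variable or tf.Module instances assigned to the object's attributes are collected automatically. This lets you save and load variables, and also create collections of tf.Modules.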

# All trainable variables
print("trainable variables:", simple_module.trainable_variables)
# Every variable
print("all variables:", simple_module.variables)

trainable variables: (<tf.Variable 'train_me:0' shape=() dtype=float32, numpy=5.0>,)
all variables: (<tf.Variable 'train_me:0' shape=() dtype=float32, numpy=5.0>, <tf.Variable 'do_not_train_me:0' shape=() dtype=float32, numpy=5.0>)

Here is an example of a two-layer linear model built from modules.

First, a dense (linear) layer:

class Dense(tf.Module):
  def __init__(self, in_features, out_features, name=None):
    super().__init__(name=name)
    self.w = tf.Variable(
      tf.random.normal([in_features, out_features]), name='w')
    self.b = tf.Variable(tf.zeros([out_features]), name='b')
  def __call__(self, x):
    y = tf.matmul(x, self.w) + self.b
    return tf.nn.relu(y)

Then the complete model, which creates two layer instances and applies them.

class SequentialModule(tf.Module):
  def __init__(self, name=None):
    super().__init__(name=name)
    self.dense_1 = Dense(in_features=3, out_features=3)
    self.dense_2 = Dense(in_features=3, out_features=2)
  def __call__(self, x):
    x = self.dense_1(x)
    return self.dense_2(x)
# You have made a model!
my_model = SequentialModule(name="the_model")
# Call it, with random results
print("Model results:", my_model(tf.constant([[2.0, 2.0, 2.0]])))

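A tf.Module instance automatically and recursively collects any tf.Variable or tf.Module instances assigned to it. This lets you manage collections of tf.Modules with a single model instance, and save and load whole models.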

print("Submodules:", my_model.submodules)

Submodules: (<__main__.Dense object at 0x7f1c39e5ca20>, <__main__.Dense object at 0x7f1c39e5ca90>)

for var in my_model.variables:
  print(var, "\n")

<tf.Variable 'b:0' shape=(3,) dtype=float32, numpy=array([0., 0., 0.], dtype=float32)> 
<tf.Variable 'w:0' shape=(3, 3) dtype=float32, numpy=
array([[-0.3808288 , -0.31930852, -0.8358304 ],
       [ 1.2048029 ,  0.30923682,  0.40288115],
       [ 1.5672269 , -1.160634  , -0.44552004]], dtype=float32)> 
<tf.Variable 'b:0' shape=(2,) dtype=float32, numpy=array([0., 0.], dtype=float32)> 
<tf.Variable 'w:0' shape=(3, 2) dtype=float32, numpy=
array([[ 0.9950175 , -1.077783  ],
       [-2.0974526 ,  0.92524624],
       [ 0.9254379 , -1.3660892 ]], dtype=float32)>
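Because collection is recursive, a single pass over the top-level module is enough to count every parameter in the nested layers. The following is a small sketch under that assumption; it repeats the Dense and SequentialModule definitions from above so that it runs on its own.

```python
import tensorflow as tf


class Dense(tf.Module):
    def __init__(self, in_features, out_features, name=None):
        super().__init__(name=name)
        self.w = tf.Variable(
            tf.random.normal([in_features, out_features]), name="w")
        self.b = tf.Variable(tf.zeros([out_features]), name="b")

    def __call__(self, x):
        return tf.nn.relu(tf.matmul(x, self.w) + self.b)


class SequentialModule(tf.Module):
    def __init__(self, name=None):
        super().__init__(name=name)
        self.dense_1 = Dense(in_features=3, out_features=3)
        self.dense_2 = Dense(in_features=3, out_features=2)

    def __call__(self, x):
        return self.dense_2(self.dense_1(x))


model = SequentialModule(name="the_model")

# Variables of both nested Dense modules are visible from the top level.
total_params = sum(v.shape.num_elements() for v in model.trainable_variables)

print("submodules:", len(model.submodules))
print("variables:", len(model.variables))
print("total parameters:", total_params)  # (3*3 + 3) + (3*2 + 2) = 20
```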

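Waiting to create variables

You may have noticed that you have to define both the input and output sizes of the layer. This is so that the w variable has a known shape and can be allocated.

By deferring variable creation until the first time the module is called with a specific input shape, you no longer need to specify the input size up front.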

class FlexibleDenseModule(tf.Module):
  # Note: No need for `in_features`
  def __init__(self, out_features, name=None):
    super().__init__(name=name)
    self.is_built = False
    self.out_features = out_features
  def __call__(self, x):
    # Create variables on first call.
    if not self.is_built:
      self.w = tf.Variable(
        tf.random.normal([x.shape[-1], self.out_features]), name='w')
      self.b = tf.Variable(tf.zeros([self.out_features]), name='b')
      self.is_built = True
    y = tf.matmul(x, self.w) + self.b
    return tf.nn.relu(y)

# Used in a module
class MySequentialModule(tf.Module):
  def __init__(self, name=None):
    super().__init__(name=name)
    self.dense_1 = FlexibleDenseModule(out_features=3)
    self.dense_2 = FlexibleDenseModule(out_features=2)
  def __call__(self, x):
    x = self.dense_1(x)
    return self.dense_2(x)
my_model = MySequentialModule(name="the_model")
print("Model results:", my_model(tf.constant([[2.0, 2.0, 2.0]])))

This flexibility is why TensorFlow layers often only need you to specify the shape of their outputs, as in tf.keras.layers.Dense, rather than both the input and output sizes.
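To make the deferred creation visible, the sketch below repeats FlexibleDenseModule and inspects it before and after the first call. The input width of 5 and output width of 4 are arbitrary values chosen for illustration.

```python
import tensorflow as tf


class FlexibleDenseModule(tf.Module):
    def __init__(self, out_features, name=None):
        super().__init__(name=name)
        self.is_built = False
        self.out_features = out_features

    def __call__(self, x):
        # Create variables on the first call, once the input width is known.
        if not self.is_built:
            self.w = tf.Variable(
                tf.random.normal([x.shape[-1], self.out_features]), name="w")
            self.b = tf.Variable(tf.zeros([self.out_features]), name="b")
            self.is_built = True
        return tf.nn.relu(tf.matmul(x, self.w) + self.b)


layer = FlexibleDenseModule(out_features=4)
variables_before = len(layer.variables)  # nothing allocated yet

output = layer(tf.ones([2, 5]))  # the first call fixes the input width at 5
variables_after = len(layer.variables)

print("variables before first call:", variables_before)
print("variables after first call:", variables_after)
print("shape of w:", layer.w.shape)
```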

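Saving weights

You can save a tf.Module as both a checkpoint and a SavedModel.

Checkpoints are just the weights, that is, the values of the set of variables inside the module and its submodules.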

chkp_path = "my_checkpoint"
checkpoint = tf.train.Checkpoint(model=my_model)
checkpoint.write(chkp_path)

'my_checkpoint'

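Checkpoints consist of two kinds of files: the data itself and an index file for metadata. The index file keeps track of what was actually saved and the numbering of checkpoints, while the checkpoint data contains the variable values and their attribute lookup paths.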

ls my_checkpoint*

my_checkpoint.data-00000-of-00001  my_checkpoint.index

You can look inside a checkpoint to make sure the whole collection of variables was saved, sorted by the Python object that contains them.

tf.train.list_variables(chkp_path)

[('_CHECKPOINTABLE_OBJECT_GRAPH', []),
 ('model/dense_1/b/.ATTRIBUTES/VARIABLE_VALUE', [3]),
 ('model/dense_1/w/.ATTRIBUTES/VARIABLE_VALUE', [3, 3]),
 ('model/dense_2/b/.ATTRIBUTES/VARIABLE_VALUE', [2]),
 ('model/dense_2/w/.ATTRIBUTES/VARIABLE_VALUE', [3, 2])]
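The keys listed above follow the attribute path from the tf.train.Checkpoint object down to each variable. As a small sketch of that naming scheme, the snippet below writes a checkpoint for a one-variable module and reads the stored value back by key with tf.train.load_variable. The Affine class and the temporary directory are hypothetical and used only for illustration.

```python
import os
import tempfile

import tensorflow as tf


class Affine(tf.Module):  # hypothetical example module
    def __init__(self, name=None):
        super().__init__(name=name)
        self.scale = tf.Variable([1.0, 2.0, 3.0], name="scale")


module = Affine()
prefix = os.path.join(tempfile.mkdtemp(), "affine_checkpoint")
tf.train.Checkpoint(model=module).write(prefix)

# Each key is "<checkpoint attribute>/<module attribute>/.ATTRIBUTES/VARIABLE_VALUE".
names = [name for name, _shape in tf.train.list_variables(prefix)]
key = "model/scale/.ATTRIBUTES/VARIABLE_VALUE"

# Read a single stored value directly, without rebuilding the module.
stored = tf.train.load_variable(prefix, key)

print(names)
print(stored)
```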

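During distributed (multi-machine) training, checkpoints can be sharded, which is why they are numbered (for example, '00000-of-00001'). In this case, though, there is only one shard.

When you load a model back in, you overwrite the values in your Python object.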

new_model = MySequentialModule()
new_checkpoint = tf.train.Checkpoint(model=new_model)
new_checkpoint.restore("my_checkpoint")
# Should be the same result as above
new_model(tf.constant([[2.0, 2.0, 2.0]]))

<tf.Tensor: shape=(1, 2), dtype=float32, numpy=array([[0., 0.]], dtype=float32)>

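Note: Because checkpoints are at the heart of long training workflows, tf.train.CheckpointManager is a helper class that makes checkpoint management much easier. See the checkpoint guide for more details.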
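A rough sketch of how tf.train.CheckpointManager might be used is shown below: it saves a numbered checkpoint at each step, keeps only the two most recent ones, and restores from the latest. The Counter module, the temporary directory, and the max_to_keep value are assumptions made for illustration.

```python
import tempfile

import tensorflow as tf


class Counter(tf.Module):  # hypothetical example module
    def __init__(self, name=None):
        super().__init__(name=name)
        self.value = tf.Variable(0.0, name="value")


module = Counter()
checkpoint = tf.train.Checkpoint(model=module)
manager = tf.train.CheckpointManager(
    checkpoint, directory=tempfile.mkdtemp(), max_to_keep=2)

# Save a numbered checkpoint after each simulated training step.
for step in range(3):
    module.value.assign_add(1.0)
    manager.save(checkpoint_number=step)

kept = list(manager.checkpoints)  # older checkpoints were deleted

# Clobber the variable, then restore it from the most recent checkpoint.
module.value.assign(-1.0)
checkpoint.restore(manager.latest_checkpoint)
restored_value = float(module.value.numpy())

print("checkpoints kept:", kept)
print("restored value:", restored_value)
```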

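Saving functions

TensorFlow can run models without the original Python objects, as TensorFlow Serving and TensorFlow Lite demonstrate, and as happens when you download a trained model from TensorFlow Hub.

TensorFlow needs to know how to perform the computations described in Python, but without the original code. To do this, you can create a graph, as described in the previous guide.

This graph contains the operations (ops) that implement the function.

You can define a graph in the model above by adding the @tf.function decorator, which indicates that this code should run as a graph.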

class MySequentialModule(tf.Module):
  def __init__(self, name=None):
    super().__init__(name=name)
    self.dense_1 = Dense(in_features=3, out_features=3)
    self.dense_2 = Dense(in_features=3, out_features=2)
  @tf.function
  def __call__(self, x):
    x = self.dense_1(x)
    return self.dense_2(x)
# You have made a model with a graph!
my_model = MySequentialModule(name="the_model")

The module you have created works exactly as before. Each unique signature passed into the function creates a separate graph. See the graphs guide for details.

print(my_model([[2.0, 2.0, 2.0]]))
print(my_model([[[2.0, 2.0, 2.0], [2.0, 2.0, 2.0]]]))

tf.Tensor(
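One way to observe the "one graph per signature" behavior is to record a Python side effect, which runs only while a function is being traced. The sketch below assumes tensor inputs, where a signature is determined by shape and dtype; the Doubler module and the traced_shapes list are hypothetical names.

```python
import tensorflow as tf

traced_shapes = []  # filled only while tf.function traces a new graph


class Doubler(tf.Module):  # hypothetical example module
    def __init__(self, name=None):
        super().__init__(name=name)
        self.factor = tf.Variable(2.0, name="factor")

    @tf.function
    def __call__(self, x):
        # Python side effect: executed during tracing, not on every call.
        traced_shapes.append(tuple(x.shape.as_list()))
        return self.factor * x


doubler = Doubler()
doubler(tf.constant([1.0, 2.0]))    # first signature: traces a graph
doubler(tf.constant([3.0, 4.0]))    # same shape and dtype: reuses the graph
doubler(tf.constant([[1.0, 2.0]]))  # new shape: traces a second graph

print("shapes that triggered tracing:", traced_shapes)
```

Note that Python lists of numbers, as used in the calls above, are treated as Python values rather than tensors, so passing different list contents can trigger additional traces.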

You can visualize the graph by tracing it within a TensorBoard summary.

# Set up logging.
stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
logdir = "logs/func/%s" % stamp
writer = tf.summary.create_file_writer(logdir)
# Create a new model to get a fresh trace
# Otherwise the summary will not see the graph.
new_model = MySequentialModule()
# Bracket the function call with
# tf.summary.trace_on() and tf.summary.trace_export().
tf.summary.trace_on(graph=True, profiler=True)
# Call only one tf.function when tracing.
z = new_model(tf.constant([[2.0, 2.0, 2.0]]))
print(z)
with writer.as_default():
  tf.summary.trace_export(
      name="my_func_trace",
      step=0,
      profiler_outdir=logdir)

WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/ops/summary_ops_v2.py:1297: start (from tensorflow.python.eager.profiler) is deprecated and will be removed after 2020-07-01.
Instructions for updating:
use `tf.profiler.experimental.start` instead.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/ops/summary_ops_v2.py:1353: stop (from tensorflow.python.eager.profiler) is deprecated and will be removed after 2020-07-01.
Instructions for updating:
use `tf.profiler.experimental.stop` instead.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/ops/summary_ops_v2.py:1353: save (from tensorflow.python.eager.profiler) is deprecated and will be removed after 2020-07-01.
Instructions for updating:
`tf.python.eager.profiler` has deprecated, use `tf.profiler` instead.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/eager/profiler.py:151: maybe_create_event_file (from tensorflow.python.eager.profiler) is deprecated and will be removed after 2020-07-01.
Instructions for updating:
`tf.python.eager.profiler` has deprecated, use `tf.profiler` instead.

Launch TensorBoard to view the resulting trace:

%tensorboard --logdir logs/func

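Creating a `SavedModel`

The recommended way to share fully trained models is to use SavedModel. A SavedModel contains both a collection of functions and a collection of weights.

You can save the model you have just created.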

tf.saved_model.save(my_model, "the_saved_model")

INFO:tensorflow:Assets written to: the_saved_model/assets

# Inspect the SavedModel in the directory

ls -l the_saved_model

total 24
drwxr-sr-x 2 kbuilder kokoro  4096 Feb 11 19:07 assets
-rw-rw-r-- 1 kbuilder kokoro 14140 Feb 11 19:07 saved_model.pb
drwxr-sr-x 2 kbuilder kokoro  4096 Feb 11 19:07 variables

# The variables/ directory contains a checkpoint of the variables

ls -l the_saved_model/variables

total 8
-rw-rw-r-- 1 kbuilder kokoro 408 Feb 11 19:07 variables.data-00000-of-00001
-rw-rw-r-- 1 kbuilder kokoro 356 Feb 11 19:07 variables.index

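The saved_model.pb file is a protocol buffer describing the functional tf.Graph.

Models and layers can be loaded from this representation without actually creating an instance of the class that produced it. This is ideal when you do not have (or do not want) a Python interpreter, such as when serving at scale or on an edge device, or when the original Python code is unavailable or impractical to use.

You can load the model as a new object: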

new_model = tf.saved_model.load("the_saved_model")

new_model, created by loading a saved model, is an internal TensorFlow user object with no knowledge of the original class. It is not an object of type SequentialModule.

isinstance(new_model, SequentialModule)

False

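This new model works on the input signatures that have already been defined. You cannot add more signatures to a model restored this way.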

print(new_model([[2.0, 2.0, 2.0]]))
print(new_model([[[2.0, 2.0, 2.0], [2.0, 2.0, 2.0]]]))

tf.Tensor(

Thus, using SavedModel, you can save TensorFlow weights and graphs with tf.Module, and then load them again.
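Since signatures cannot be added after loading, one option is to declare a flexible signature before saving. The sketch below is an illustration rather than the approach taken in the text above: it passes an input_signature with an unknown batch dimension to tf.function, so the restored object accepts any batch size. The Projection module and the temporary export directory are hypothetical.

```python
import tempfile

import tensorflow as tf


class Projection(tf.Module):  # hypothetical example module
    def __init__(self, name=None):
        super().__init__(name=name)
        self.w = tf.Variable(tf.ones([3, 2]), name="w")

    # The None batch dimension lets one saved graph serve any batch size.
    @tf.function(input_signature=[
        tf.TensorSpec(shape=[None, 3], dtype=tf.float32)])
    def __call__(self, x):
        return tf.matmul(x, self.w)


export_dir = tempfile.mkdtemp()
tf.saved_model.save(Projection(), export_dir)

# The loaded object is not a Projection instance, but it can still be called.
loaded = tf.saved_model.load(export_dir)
out_one = loaded(tf.ones([1, 3]))
out_four = loaded(tf.ones([4, 3]))

print(out_one.shape, out_four.shape)
```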

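Keras models and layers

Note that up to this point, Keras has not been mentioned. You can build your own high-level API on top of tf.Module, and people have.

In this section, you will examine how Keras uses tf.Module. A complete user guide to Keras models can be found in the Keras guide.

Keras layers

tf.keras.layers.Layer is the base class of all Keras layers, and it inherits from tf.Module.

You can convert a module into a Keras layer just by swapping out the parent class and changing __call__ to call: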

class MyDense(tf.keras.layers.Layer):
  # Adding **kwargs to support base Keras layer arguments
  def __init__(self, in_features, out_features, **kwargs):
    super().__init__(**kwargs)
    # This will soon move to the build step; see below
    self.w = tf.Variable(
      tf.random.normal([in_features, out_features]), name='w')
    self.b = tf.Variable(tf.zeros([out_features]), name='b')
  def call(self, x):
    y = tf.matmul(x, self.w) + self.b
    return tf.nn.relu(y)
simple_layer = MyDense(name="simple", in_features=3, out_features=3)

Keras layers have their own __call__, which does some bookkeeping described in the next section and then calls call(). You should notice no change in functionality.

simple_layer([[2.0, 2.0, 2.0]])

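The `build` step

As noted above, in many cases it is convenient to wait to create variables until you are sure of the input shape.

Keras layers come with an extra lifecycle step that gives you more flexibility in how you define your layers. This step is defined in the build() function.

build is called exactly once, and it is called with the shape of the input. It is usually used to create variables (weights).

You can rewrite the MyDense layer above so that it adapts to the size of its inputs.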

class FlexibleDense(tf.keras.layers.Layer):
  # Note the added `**kwargs`, as Keras supports many arguments
  def __init__(self, out_features, **kwargs):
    super().__init__(**kwargs)
    self.out_features = out_features
  def build(self, input_shape):  # Create the state of the layer (weights)
    self.w = tf.Variable(
      tf.random.normal([input_shape[-1], self.out_features]), name='w')
    self.b = tf.Variable(tf.zeros([self.out_features]), name='b')
  def call(self, inputs):  # Defines the computation from inputs to outputs
    return tf.matmul(inputs, self.w) + self.b
# Create the instance of the layer
flexible_dense = FlexibleDense(out_features=3)

At this point, the model has not been built, so it has no variables.

flexible_dense.variables

[]

Calling the layer allocates appropriately sized variables.

# Call it, with predictably random results
print("Model results:", flexible_dense(tf.constant([[2.0, 2.0, 2.0], [3.0, 3.0, 3.0]])))

Model results: tf.Tensor(
[[ 2.5384042  -4.430261   -0.39644408]
 [ 3.8076067  -6.6453915  -0.594666  ]], shape=(2, 3), dtype=float32)

flexible_dense.variables

[<tf.Variable 'flexible_dense/w:0' shape=(3, 3) dtype=float32, numpy=
 array([[-0.22142771, -0.58348554,  0.07171401],
        [ 0.4220801 , -1.597373  , -0.8835892 ],
        [ 1.0685498 , -0.03427194,  0.6136532 ]], dtype=float32)>,
 <tf.Variable 'flexible_dense/b:0' shape=(3,) dtype=float32, numpy=array([0., 0., 0.], dtype=float32)>]

Since build is called only once, inputs are rejected if their shape is not compatible with the layer's variables.

try:
  print("Model results:", flexible_dense(tf.constant([[2.0, 2.0, 2.0, 2.0]])))
except tf.errors.InvalidArgumentError as e:
  print("Failed:", e)

Failed: Matrix size-incompatible: In[0]: [1,4], In[1]: [3,3] [Op:MatMul]
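To check that build really runs only once, the sketch below counts calls to it. As an assumption that differs from the text above, it creates its weights with the Keras add_weight helper instead of tf.Variable; the CountingDense name and the build_calls counter are hypothetical.

```python
import tensorflow as tf


class CountingDense(tf.keras.layers.Layer):  # hypothetical example layer
    def __init__(self, out_features, **kwargs):
        super().__init__(**kwargs)
        self.out_features = out_features
        self.build_calls = 0  # plain Python counter, for illustration only

    def build(self, input_shape):
        self.build_calls += 1
        # add_weight is the Keras helper for creating layer weights.
        self.w = self.add_weight(
            shape=(input_shape[-1], self.out_features),
            initializer="random_normal", trainable=True, name="w")
        self.b = self.add_weight(
            shape=(self.out_features,),
            initializer="zeros", trainable=True, name="b")

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b


layer = CountingDense(out_features=3)
built_before = layer.built

layer(tf.ones([2, 4]))  # the first call triggers build with the input shape
layer(tf.ones([5, 4]))  # same last dimension: build is not called again

print("built before first call:", built_before)
print("number of build calls:", layer.build_calls)
print("shape of w:", layer.w.shape)
```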

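Keras layers have many additional features, including:

Optional losses
Support for metrics
Built-in support for an optional training argument to differentiate between training and inference use
get_config and from_config methods that let you store configurations accurately so that models can be cloned in Python

Read about them in the full guide to custom layers.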
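As an illustration of the optional training argument from the list above, the following sketch defines a layer that behaves differently in training and inference. The TrainingOnlyScale layer and its doubling behavior are hypothetical and chosen only to make the difference easy to see.

```python
import tensorflow as tf


class TrainingOnlyScale(tf.keras.layers.Layer):  # hypothetical example layer
    def call(self, inputs, training=False):
        # Doubles the inputs in training mode; passes them through otherwise.
        if training:
            return inputs * 2.0
        return inputs


layer = TrainingOnlyScale()
x = tf.constant([[1.0, 2.0]])

train_out = layer(x, training=True)
infer_out = layer(x, training=False)

print("training:", train_out.numpy())
print("inference:", infer_out.numpy())
```

Built-in layers such as dropout and batch normalization rely on the same argument to switch behavior.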

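Keras models

You can define your model as nested Keras layers.

However, Keras also provides a full-featured model class called tf.keras.Model. It inherits from tf.keras.layers.Layer, so a Keras model is a kind of Keras layer and can be used, nested, and saved in the same way. Keras models also come with extra functionality that makes them easy to train, evaluate, load, save, and even train on multiple machines.

You can define the SequentialModule from above with nearly identical code, again converting __call__ to call() and changing the parent class.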

class MySequentialModel(tf.keras.Model):
  def __init__(self, name=None, **kwargs):
    super().__init__(**kwargs)
    self.dense_1 = FlexibleDense(out_features=3)
    self.dense_2 = FlexibleDense(out_features=2)
  def call(self, x):
    x = self.dense_1(x)
    return self.dense_2(x)
# You have made a Keras model!
my_sequential_model = MySequentialModel(name="the_model")
# Call it on a tensor, with random results
print("Model results:", my_sequential_model(tf.constant([[2.0, 2.0, 2.0]])))

Model results: tf.Tensor([[-12.241047    2.6145513]], shape=(1, 2), dtype=float32)

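All the same features are available, including tracking variables and submodules.

Note: To emphasize the note above, a raw tf.Module nested inside a Keras layer or model will not have its variables collected for training or saving. Instead, nest Keras layers inside Keras layers.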

my_sequential_model.variables

[<tf.Variable 'my_sequential_model/flexible_dense_1/w:0' shape=(3, 3) dtype=float32, numpy=
 array([[ 2.0393457 , -0.26025367, -0.45268586],
        [-0.29897013,  0.18454784, -0.586858  ],
        [ 1.5166968 ,  0.3722855 , -0.200079  ]], dtype=float32)>,
 <tf.Variable 'my_sequential_model/flexible_dense_1/b:0' shape=(3,) dtype=float32, numpy=array([0., 0., 0.], dtype=float32)>,
 <tf.Variable 'my_sequential_model/flexible_dense_2/w:0' shape=(3, 2) dtype=float32, numpy=
 array([[-2.5438178 ,  0.2093776 ],
        [ 0.63271874, -0.33846512],
        [-1.5950202 , -0.5854196 ]], dtype=float32)>,
 <tf.Variable 'my_sequential_model/flexible_dense_2/b:0' shape=(2,) dtype=float32, numpy=array([0., 0.], dtype=float32)>]

my_sequential_model.submodules

(<__main__.FlexibleDense at 0x7f1ccc738dd8>,
 <__main__.FlexibleDense at 0x7f1ccc738588>)

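Overriding tf.keras.Model is a very Pythonic approach to building TensorFlow models. If you are migrating models from other frameworks, this can be very straightforward.

If the models you are constructing are simple assemblages of existing layers and inputs, you can save time and space by using the functional API, which comes with additional features around model reconstruction and architecture.

Here is the same model built with the functional API: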

inputs = tf.keras.Input(shape=[3,])
x = FlexibleDense(3)(inputs)
x = FlexibleDense(2)(x)
my_functional_model = tf.keras.Model(inputs=inputs, outputs=x)
my_functional_model.summary()

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 3)]               0         
_________________________________________________________________
flexible_dense_3 (FlexibleDe (None, 3)                 12        
_________________________________________________________________
flexible_dense_4 (FlexibleDe (None, 2)                 8         
=================================================================
Total params: 20
Trainable params: 20
Non-trainable params: 0
_________________________________________________________________

my_functional_model(tf.constant([[2.0, 2.0, 2.0]]))

<tf.Tensor: shape=(1, 2), dtype=float32, numpy=array([[-17.974905 ,   2.8019438]], dtype=float32)>

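The major difference here is that the input shape is specified up front as part of the functional construction process. The input_shape argument in this case does not have to be completely specified; you can leave some dimensions as None.

Note: You do not need to specify input_shape or an InputLayer in a subclassed model; these arguments and layers will be ignored.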
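The sketch below illustrates the partially specified input shape: tf.keras.Input(shape=(3,)) fixes only the feature dimension and leaves the batch dimension as None, so the same model accepts batches of different sizes. As an assumption for brevity, it uses the built-in tf.keras.layers.Dense instead of the FlexibleDense layer defined above.

```python
import tensorflow as tf

# Only the feature dimension is fixed; the batch dimension stays None.
inputs = tf.keras.Input(shape=(3,))
hidden = tf.keras.layers.Dense(3, activation="relu")(inputs)
outputs = tf.keras.layers.Dense(2)(hidden)
model = tf.keras.Model(inputs=inputs, outputs=outputs)

out_single = model(tf.ones([1, 3]))
out_batch = model(tf.ones([5, 3]))
param_count = model.count_params()  # (3*3 + 3) + (3*2 + 2) = 20

print(out_single.shape, out_batch.shape, param_count)
```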

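Saving Keras models

Keras models can be checkpointed, and this looks the same as for tf.Module.

Keras models can also be saved with tf.saved_model.save(), since they are modules. However, Keras models have more convenient methods and other functionality.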

my_sequential_model.save("exname_of_file")

INFO:tensorflow:Assets written to: exname_of_file/assets

Just as easily, they can be loaded back in.

reconstructed_model = tf.keras.models.load_model("exname_of_file")

Keras SavedModels also save metric, loss, and optimizer states.

This reconstructed model can be used and will produce the same result when called on the same data.

reconstructed_model(tf.constant([[2.0, 2.0, 2.0]]))

<tf.Tensor: shape=(1, 2), dtype=float32, numpy=array([[-12.241047 ,   2.6145513]], dtype=float32)>

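For more on saving and serializing Keras models, including providing configuration methods for custom layers to support these features, see the guide to saving and serialization.

Next steps

If you want to know more about Keras, see the existing Keras guides.

Another example of a high-level API built on tf.Module is Sonnet from DeepMind, which is covered in detail on its website.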
