Stage 2: TensorFlow Programs Explained with Illustrations (Part 3): Variables

Alun_Sun

Category: tensorflow1.4. Tags: tensorflow, variables, cluster learning, deep learning

Article link: https://blog.csdn.net/jk981811667/article/details/78901933

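This article describes in detail how to use Variables in TensorFlow: ways to create a Variable, the default variable collections, device placement, initialization, using variables, and sharing variables. It focuses on the tf.get_variable function, variable initialization, and how to place variables correctly in a distributed environment.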

A TensorFlow variable is the best way to represent shared, persistent state manipulated by your program.

Variables are manipulated via the tf.Variable class. A tf.Variable represents a tensor whose value can be changed by running ops on it. Unlike tf.Tensor objects, a tf.Variable exists outside the context of a single session.run call.

Internally, a tf.Variable stores a persistent tensor. Specific ops allow you to read and modify the values of this tensor. These modifications are visible across multiple tf.Sessions, so multiple workers can see the same values for a tf.Variable.

Creating a Variable

The best way to create a variable is to call the tf.get_variable function. This function requires you to specify the Variable’s name. This name will be used by other replicas to access the same variable, as well as to name this variable’s value when checkpointing and exporting models. tf.get_variable also allows you to reuse a previously created variable of the same name, making it easy to define models which reuse layers.
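In short: tf.get_variable is the preferred way to create a variable. You must give the variable a name; other replicas use that name to access the same variable, and tf.get_variable can reuse a previously created variable of the same name.

To create a variable with tf.get_variable, simply provide the name and shape: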

my_variable = tf.get_variable("my_variable", [1, 2, 3])

This creates a variable named “my_variable” which is a three-dimensional tensor with shape [1, 2, 3]. This variable will, by default, have the dtype tf.float32 and its initial value will be randomized via tf.glorot_uniform_initializer.
That is, the variable holds a tensor of shape [1, 2, 3] whose values default to dtype tf.float32.
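
To make the default initializer more concrete, here is a minimal pure-Python sketch of the range that Glorot (Xavier) uniform initialization samples from, assuming the usual definition limit = sqrt(6 / (fan_in + fan_out)). The helper names compute_fans and glorot_uniform_limit are made up for this illustration and are not TensorFlow functions. The fan rule for tensors of rank 3 or higher (treating the leading dimensions as a receptive field) is an assumption modelled on how convolution kernels are usually handled.

```python
import math


def compute_fans(shape):
    """Return (fan_in, fan_out) for a weight shape (illustrative helper)."""
    if len(shape) == 0:
        return 1, 1
    if len(shape) == 1:
        return shape[0], shape[0]
    if len(shape) == 2:
        return shape[0], shape[1]
    # Rank >= 3: treat the leading dimensions as a receptive field.
    receptive_field = 1
    for dim in shape[:-2]:
        receptive_field *= dim
    return shape[-2] * receptive_field, shape[-1] * receptive_field


def glorot_uniform_limit(shape):
    """Values are drawn uniformly from [-limit, limit]."""
    fan_in, fan_out = compute_fans(shape)
    return math.sqrt(6.0 / (fan_in + fan_out))


limit = glorot_uniform_limit([1, 2, 3])
print(round(limit, 4))  # 1.0954
```

Under these assumptions, the values of my_variable would start somewhere in roughly [-1.0954, 1.0954].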

You may optionally specify the dtype and initializer to tf.get_variable. For example:
The following sets the initial value to zero:

my_int_variable = tf.get_variable("my_int_variable", [1, 2, 3], dtype=tf.int32, 
  initializer=tf.zeros_initializer)

TensorFlow provides many convenient initializers. Alternatively, you may initialize a tf.Variable to have the value of a tf.Tensor. For example:
A variable can also be initialized from a tensor:

other_variable = tf.get_variable("other_variable", dtype=tf.int32, 
  initializer=tf.constant([23, 42]))

Note that when the initializer is a tf.Tensor you should not specify the variable’s shape, as the shape of the initializer tensor will be used.
Note: do not pass a shape in this case, because the initializer tensor already determines it.

Variable collections

Because disconnected parts of a TensorFlow program might want to create variables, it is sometimes useful to have a single way to access all of them. For this reason TensorFlow provides collections, which are named lists of tensors or other objects, such as tf.Variable instances.
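Because parts of a TensorFlow program may be disconnected from one another, it helps to have a single way to reach all variables. TensorFlow provides collections for this purpose: named lists that can hold tensors and other objects.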

By default every tf.Variable gets placed in the following two collections:

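tf.GraphKeys.GLOBAL_VARIABLES: variables that can be shared across multiple devices,
tf.GraphKeys.TRAINABLE_VARIABLES: variables for which TensorFlow will calculate gradients.

In other words, by default each tf.Variable is added to two collections: GLOBAL_VARIABLES for variables shared across devices, and TRAINABLE_VARIABLES for the variables whose gradients TensorFlow computes (the collection holds the variables themselves, not the gradient values).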

If you don’t want a variable to be trainable, add it to the tf.GraphKeys.LOCAL_VARIABLES collection instead. For example, the following snippet demonstrates how to add a variable named my_local to this collection:
If you do not want a variable to be trained, use tf.GraphKeys.LOCAL_VARIABLES; the snippet below shows how.

my_local = tf.get_variable("my_local", shape=(),
                           collections=[tf.GraphKeys.LOCAL_VARIABLES])

Alternatively, you can specify trainable=False as an argument to tf.get_variable:
Another option is simply to pass trainable=False.

my_non_trainable = tf.get_variable("my_non_trainable", 
                                   shape=(), 
                                   trainable=False)

You can also use your own collections. Any string is a valid collection name, and there is no need to explicitly create a collection. To add a variable (or any other object) to a collection after creating the variable, call tf.add_to_collection. For example, the following code adds an existing variable named my_local to a collection named my_collection_name:

We can also use our own collections. Calling tf.add_to_collection is enough; the collection is created implicitly.

tf.add_to_collection("my_collection_name", my_local)

And to retrieve a list of all the variables (or other objects) you’ve placed in a collection you can use:
The following statement retrieves the list of everything in the collection:

tf.get_collection("my_collection_name")
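
The collection mechanism described above can be pictured as a dictionary of named lists. The sketch below is a toy model in plain Python, not TensorFlow's implementation; the ToyGraph class is hypothetical and only mirrors the two calls shown in the text, with strings standing in for variables.

```python
from collections import defaultdict


class ToyGraph:
    """Toy stand-in for a graph that keeps named collections."""

    def __init__(self):
        self._collections = defaultdict(list)

    def add_to_collection(self, name, value):
        # Any string works as a name; the list appears on first use.
        self._collections[name].append(value)

    def get_collection(self, name):
        # Return a copy so callers cannot mutate the stored list.
        return list(self._collections.get(name, []))


graph = ToyGraph()
graph.add_to_collection("my_collection_name", "my_local")
graph.add_to_collection("my_collection_name", "my_non_trainable")
print(graph.get_collection("my_collection_name"))  # ['my_local', 'my_non_trainable']
print(graph.get_collection("never_used"))          # []
```

The point of the toy is that a collection needs no declaration: adding to a new name creates it, and asking for an unknown name just yields an empty list.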

Device placement

Just like any other TensorFlow operation, you can place variables on particular devices. For example, the following snippet creates a variable named v and places it on the second GPU device:
Like other TensorFlow operations, a variable can be assigned to a specific device, for example a GPU:

with tf.device("/device:GPU:1"):
  v = tf.get_variable("v", [1])

It is particularly important for variables to be in the correct device in distributed settings. Accidentally putting variables on workers instead of parameter servers, for example, can severely slow down training or, in the worst case, let each worker blithely forge ahead with its own independent copy of each variable. For this reason we provide tf.train.replica_device_setter, which can automatically place variables in parameter servers. For example:

cluster_spec = {
    "ps": ["ps0:2222", "ps1:2222"],
    "worker": ["worker0:2222", "worker1:2222", "worker2:2222"]}
with tf.device(tf.train.replica_device_setter(cluster=cluster_spec)):
  v = tf.get_variable("v", shape=[20, 20])  
  # replica_device_setter decides which server the variable v is placed on.
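
When no placement strategy is supplied, tf.train.replica_device_setter spreads variables over the parameter-server tasks in round-robin order. The following plain-Python sketch imitates that idea only; round_robin_placer is a hypothetical helper written for this note, and the "/job:ps/task:N" strings merely follow the usual device-name format.

```python
import itertools

cluster_spec = {
    "ps": ["ps0:2222", "ps1:2222"],
    "worker": ["worker0:2222", "worker1:2222", "worker2:2222"]}


def round_robin_placer(cluster):
    """Return a function that maps each new variable to a ps device."""
    num_ps = len(cluster["ps"])
    counter = itertools.count()

    def place(var_name):
        # Each call takes the next ps task, wrapping around at the end.
        task = next(counter) % num_ps
        return "/job:ps/task:%d" % task

    return place


place = round_robin_placer(cluster_spec)
for name in ["v", "weights", "biases"]:
    print(name, "->", place(name))
# v -> /job:ps/task:0
# weights -> /job:ps/task:1
# biases -> /job:ps/task:0
```

With two parameter servers, consecutive variables alternate between them, which spreads the load without any per-variable configuration.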

Initializing variables

Before you can use a variable, it must be initialized. If you are programming in the low-level TensorFlow API (that is, you are explicitly creating your own graphs and sessions), you must explicitly initialize the variables. Most high-level frameworks such as tf.contrib.slim, tf.estimator.Estimator and Keras automatically initialize variables for you before training a model.
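In short: variables must be initialized before use. With the low-level API you must initialize them explicitly; with high-level APIs such as tf.contrib.slim, tf.estimator.Estimator, and Keras, initialization happens automatically before training.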

Explicit initialization is otherwise useful because it allows you not to rerun potentially expensive initializers when reloading a model from a checkpoint as well as allowing determinism when randomly-initialized variables are shared in a distributed setting.
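Explicit initialization pays off in two cases: when restoring a model from a checkpoint, you avoid rerunning potentially expensive initializers; and in a distributed setting, shared variables that are randomly initialized get deterministic values.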

To initialize all global variables in one go, before training starts, call tf.global_variables_initializer(). This function returns a single operation responsible for initializing all variables in the tf.GraphKeys.GLOBAL_VARIABLES collection. Running this operation initializes all of them; variables that live only in the LOCAL_VARIABLES collection are handled separately by tf.local_variables_initializer(). For example:

session.run(tf.global_variables_initializer())
# Now all variables are initialized.

If you do need to initialize variables yourself, you can run the variable’s initializer operation. For example:
To initialize one of your own variables:

session.run(my_variable.initializer)

You can also ask which variables have still not been initialized. For example, the following code prints the names of all variables which have not yet been initialized:
That is, you can query which variables are still uninitialized; the code below prints their names.

print(session.run(tf.report_uninitialized_variables()))

Note that by default tf.global_variables_initializer does not specify the order in which variables are initialized. Therefore, if the initial value of a variable depends on another variable’s value, it’s likely that you’ll get an error. Any time you use the value of a variable in a context in which not all variables are initialized (say, if you use a variable’s value while initializing another variable), it is best to use variable.initialized_value() instead of variable:
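Note: by default tf.global_variables_initializer does not specify the order in which variables are initialized. If the initial value of one variable depends on another variable, you may get an error; use variable.initialized_value() for the dependency instead, as follows: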

v = tf.get_variable("v", shape=(), initializer=tf.zeros_initializer())
w = tf.get_variable("w", initializer=v.initialized_value() + 1)
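
The ordering problem can be reproduced without TensorFlow. The toy class below is a deliberately simplified model written for this note (ToyVariable and its methods are hypothetical, and no claim is made about how TensorFlow implements initialized_value internally). It shows why reading a dependency through initialized_value() is safe even when the dependency has not been initialized yet, while reading its plain value is not.

```python
class ToyVariable:
    """Toy variable: an initializer callable plus, once run, a value."""

    def __init__(self, name, initializer):
        self.name = name
        self._initializer = initializer  # zero-argument callable
        self._value = None
        self.initialized = False

    def run_initializer(self):
        self._value = self._initializer()
        self.initialized = True

    def value(self):
        if not self.initialized:
            raise RuntimeError(
                "Attempting to use uninitialized value " + self.name)
        return self._value

    def initialized_value(self):
        # Stored value if available, otherwise the initial value itself.
        return self._value if self.initialized else self._initializer()


v = ToyVariable("v", lambda: 0)
w_unsafe = ToyVariable("w_unsafe", lambda: v.value() + 1)
w_safe = ToyVariable("w_safe", lambda: v.initialized_value() + 1)

# Deliberately initialize the dependents before v.
w_safe.run_initializer()
print(w_safe.value())  # 1
try:
    w_unsafe.run_initializer()
except RuntimeError as err:
    print(err)  # Attempting to use uninitialized value v
```

Because the initialization order is not guaranteed, the safe variant is the one that works in every order.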

Using variables

To use the value of a tf.Variable in a TensorFlow graph, simply treat it like a normal tf.Tensor:

v = tf.get_variable("v", shape=(), initializer=tf.zeros_initializer())
w = v + 1  # w is computed from the value of v.
           # A variable can be used freely in an expression;
           # it is converted to a tensor holding its value.

To assign a value to a variable, use the methods assign, assign_add, and friends in the tf.Variable class. For example, here is how you can call these methods:
To assign a value to a variable, use a method such as assign_add, as in the code below.

v = tf.get_variable("v", shape=(), initializer=tf.zeros_initializer())
assignment = v.assign_add(1)
tf.global_variables_initializer().run()
assignment.run()

Most TensorFlow optimizers have specialized ops that efficiently update the values of variables according to some gradient descent-like algorithm. See tf.train.Optimizer for an explanation of how to use optimizers.
In short: most TensorFlow optimizers repeatedly update variable values using algorithms similar to gradient descent.

Because variables are mutable it’s sometimes useful to know what version of a variable’s value is being used at any point in time. To force a re-read of the value of a variable after something has happened, you can use tf.Variable.read_value. For example:
Because a variable's value changes over time, use tf.Variable.read_value to read its current value.

v = tf.get_variable("v", shape=(), initializer=tf.zeros_initializer())
assignment = v.assign_add(1)
with tf.control_dependencies([assignment]):
  w = v.read_value()  # w is guaranteed to reflect v's value after the
                      # assign_add operation.

Sharing variables

TensorFlow supports two ways of sharing variables:

Explicitly passing tf.Variable objects around.
Implicitly wrapping tf.Variable objects within tf.variable_scope objects.

While code which explicitly passes variables around is very clear, it is sometimes convenient to write TensorFlow functions that implicitly use variables in their implementations. Most of the functional layers from tf.layers use this approach, as well as all tf.metrics, and a few other library utilities.

Variable scopes allow you to control variable reuse when calling functions which implicitly create and use variables. They also allow you to name your variables in a hierarchical and understandable way.

For example, let’s say we write a function to create a convolutional / relu layer:

def conv_relu(input, kernel_shape, bias_shape):
    # Create variable named "weights".
    weights = tf.get_variable("weights", kernel_shape,
        initializer=tf.random_normal_initializer())
    # Create variable named "biases".
    biases = tf.get_variable("biases", bias_shape,
        initializer=tf.constant_initializer(0.0))
    conv = tf.nn.conv2d(input, weights,
        strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv + biases)

This function uses short names weights and biases, which is good for clarity. In a real model, however, we want many such convolutional layers, and calling this function repeatedly would not work:

input1 = tf.random_normal([1,10,10,32])
input2 = tf.random_normal([1,20,20,32])
x = conv_relu(input1, kernel_shape=[5, 5, 32, 32], bias_shape=[32])
x = conv_relu(x, kernel_shape=[5, 5, 32, 32], bias_shape = [32])  # This fails.

Since the desired behavior is unclear (create new variables or reuse the existing ones?) TensorFlow will fail. Calling conv_relu in different scopes, however, clarifies that we want to create new variables:

def my_image_filter(input_images):
    with tf.variable_scope("conv1"):
        # Variables created here will be named "conv1/weights", "conv1/biases".
        relu1 = conv_relu(input_images, [5, 5, 32, 32], [32])
    with tf.variable_scope("conv2"):
        # Variables created here will be named "conv2/weights", "conv2/biases".
        return conv_relu(relu1, [5, 5, 32, 32], [32])

If you do want the variables to be shared, you have two options. First, you can create a scope with the same name using reuse=True:

with tf.variable_scope("model"):
  output1 = my_image_filter(input1)
with tf.variable_scope("model", reuse=True):
  output2 = my_image_filter(input2)

You can also call scope.reuse_variables() to trigger a reuse:

with tf.variable_scope("model") as scope:
  output1 = my_image_filter(input1)
  scope.reuse_variables()
  output2 = my_image_filter(input2)

Since depending on exact string names of scopes can feel dangerous, it’s also possible to initialize a variable scope based on another one:

with tf.variable_scope("model") as scope:
  output1 = my_image_filter(input1)
with tf.variable_scope(scope, reuse=True):
  output2 = my_image_filter(input2)
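
The reuse rules above can be summarized with a small plain-Python model. This is a sketch under stated assumptions, not TensorFlow's implementation: ToyVariableStore is a hypothetical class, variables are plain dictionaries, and the only behaviours modelled are hierarchical naming, the error on an accidental second creation, the error on reusing something that does not exist, and the inheritance of reuse=True by nested scopes.

```python
import contextlib


class ToyVariableStore:
    """Toy model of get_variable with scoped names and a reuse flag."""

    def __init__(self):
        self._vars = {}
        self._scopes = []    # stack of active scope names
        self._reuse = False

    @contextlib.contextmanager
    def variable_scope(self, name, reuse=False):
        previous_reuse = self._reuse
        self._scopes.append(name)
        # Once reuse is on, nested scopes inherit it.
        self._reuse = previous_reuse or reuse
        try:
            yield
        finally:
            self._scopes.pop()
            self._reuse = previous_reuse

    def get_variable(self, name, shape):
        full_name = "/".join(self._scopes + [name])
        if full_name in self._vars:
            if not self._reuse:
                raise ValueError("Variable %s already exists" % full_name)
            return self._vars[full_name]
        if self._reuse:
            raise ValueError("Variable %s does not exist" % full_name)
        self._vars[full_name] = {"name": full_name, "shape": list(shape)}
        return self._vars[full_name]


store = ToyVariableStore()
with store.variable_scope("model"):
    with store.variable_scope("conv1"):
        first = store.get_variable("weights", [5, 5, 32, 32])
with store.variable_scope("model", reuse=True):
    with store.variable_scope("conv1"):
        second = store.get_variable("weights", [5, 5, 32, 32])

print(first["name"])    # model/conv1/weights
print(first is second)  # True
```

In this model, calling get_variable twice with the same full name and no reuse raises ValueError, which corresponds to the failing conv_relu example earlier in this section.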