A Deep Dive into tf.Variable, tf.get_variable, tf.variable_scope, and tf.name_scope

edward_zcl

Article link: https://blog.csdn.net/edward_zcl/article/details/99082679

Many blog posts cover tf.Variable, tf.get_variable, tf.variable_scope, and tf.name_scope, but they tend to be either incomplete or one-sided. This post tries to examine their usage both broadly and in detail.

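One correction first. Many posts call tf.variable_scope a "variable scope". That name is acceptable, but it is not a scope in the strict sense: it only affects how variables are named and looked up. It does not make variables locally visible and globally invisible the way a function, object, or file does.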

Let's start with a few write-ups that are fairly detailed but not comprehensive.

TensorFlow function notes (4): variable_scope/name_scope

Reference: https://blog.csdn.net/qq_19918373/article/details/69499091
This part introduces variable sharing, mainly through tf.get_variable.

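tf.get_variable and tf.Variable behave differently. The former checks the name when creating a variable: if the name is already taken by another variable and reuse is not enabled, it raises an error instead of creating the variable. The latter does no such check; on a name collision it silently renames the new variable by appending a number. So you cannot share a variable simply by calling tf.get_variable or tf.Variable several times with the same name.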

Consider the following code:

def my_image_filter(input_images):
    conv1_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]),
        name="conv1_weights")
    conv1_biases = tf.Variable(tf.zeros([32]), name="conv1_biases")
    conv1 = tf.nn.conv2d(input_images, conv1_weights,
        strides=[1, 1, 1, 1], padding='SAME')
    relu1 = tf.nn.relu(conv1 + conv1_biases)
 
    conv2_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]),
        name="conv2_weights")
    conv2_biases = tf.Variable(tf.zeros([32]), name="conv2_biases")
    conv2 = tf.nn.conv2d(relu1, conv2_weights,
        strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv2 + conv2_biases)

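This function has four variables: 'conv1_weights', 'conv1_biases', 'conv2_weights', and 'conv2_biases'. If we call the function more than once, each call creates its own set of variables rather than reusing the existing ones, as in the calls below: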

# First call creates one set of variables.
result1 = my_image_filter(image1)
# Another set is created in the second call.
result2 = my_image_filter(image2)

Although the same function is used, image1 and image2 are actually filtered by two different filters. That is the problem; the following shows how to share variables.

We use with tf.variable_scope to share them. For example:

def conv_relu(input, kernel_shape, bias_shape):
    # Create variable named "weights".
    weights = tf.get_variable("weights", kernel_shape,
        initializer=tf.random_normal_initializer())
    # Create variable named "biases".
    biases = tf.get_variable("biases", bias_shape,
        initializer=tf.constant_initializer(0.0))
    conv = tf.nn.conv2d(input, weights,
        strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv + biases)
def my_image_filter(input_images):
    with tf.variable_scope("conv1"):
        # Variables created here will be named "conv1/weights", "conv1/biases".
        relu1 = conv_relu(input_images, [5, 5, 32, 32], [32])
    with tf.variable_scope("conv2"):
        # Variables created here will be named "conv2/weights", "conv2/biases".
        return conv_relu(relu1, [5, 5, 32, 32], [32])

To call my_image_filter twice and use the same variables, do the following:

with tf.variable_scope("image_filters") as scope:
    result1 = my_image_filter(image1)
    scope.reuse_variables()
    result2 = my_image_filter(image2)

reuse_variables() turns on variable reuse. The following code shows how tf.get_variable works:

with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1])
with tf.variable_scope("foo", reuse=True):
    v1 = tf.get_variable("v", [1])
assert v1 == v

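When reuse is enabled and a variable with the same name already exists, tf.get_variable returns that existing variable directly instead of defining a new one and copying the value.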

Points to watch out for:

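1. A variable_scope nested inside another variable_scope inherits the outer reuse value: if the outer scope has reuse enabled, so does the inner one. You cannot manually set reuse back to False; only exiting the variable_scope returns reuse to False: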

with tf.variable_scope("root"):
    # At start, the scope is not reusing.
    assert tf.get_variable_scope().reuse == False
    with tf.variable_scope("foo"):
        # Opened a sub-scope, still not reusing.
        assert tf.get_variable_scope().reuse == False
    with tf.variable_scope("foo", reuse=True):
        # Explicitly opened a reusing scope.
        assert tf.get_variable_scope().reuse == True
        with tf.variable_scope("bar"):
            # Now sub-scope inherits the reuse flag.
            assert tf.get_variable_scope().reuse == True
    # Exited the reusing scope, back to a non-reusing one.
    assert tf.get_variable_scope().reuse == False

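2. When a variable_scope is opened with a previously captured scope object (rather than a string name), the current nesting is ignored and the scope keeps the name of the captured scope: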

with tf.variable_scope("foo") as foo_scope:
    assert foo_scope.name == "foo"
with tf.variable_scope("bar"):
    with tf.variable_scope("baz") as other_scope:
        assert other_scope.name == "bar/baz"
        with tf.variable_scope(foo_scope) as foo_scope2:
            assert foo_scope2.name == "foo"  # Not changed.

3. name_scope differs slightly from variable_scope: name_scope only affects the names of ops, not the names of variables created with tf.get_variable.

with tf.variable_scope("foo"):
    with tf.name_scope("bar"):
        v = tf.get_variable("v", [1])
        x = 1.0 + v
assert v.name == "foo/v:0"
assert x.op.name == "foo/bar/add"
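
One way to picture point 3 is as two separate prefixes, one used for op names and one used by tf.get_variable. The plain-Python sketch below is a toy model under that assumption, not TensorFlow code; `ToyNaming` and its methods are hypothetical names invented for illustration, and the automatic renaming of duplicate names is ignored.

```python
import contextlib


class ToyNaming:
    """Toy model: two name prefixes, one for ops and one for get_variable."""

    def __init__(self):
        self.op_prefix = []   # extended by name_scope AND variable_scope
        self.var_prefix = []  # extended by variable_scope only

    @contextlib.contextmanager
    def name_scope(self, name):
        self.op_prefix.append(name)
        try:
            yield
        finally:
            self.op_prefix.pop()

    @contextlib.contextmanager
    def variable_scope(self, name):
        self.op_prefix.append(name)
        self.var_prefix.append(name)
        try:
            yield
        finally:
            self.var_prefix.pop()
            self.op_prefix.pop()

    def op_name(self, name):
        # How an op would be named in this model.
        return "/".join(self.op_prefix + [name])

    def get_variable_name(self, name):
        # How a get_variable-style variable would be named in this model.
        return "/".join(self.var_prefix + [name])


g = ToyNaming()
with g.variable_scope("foo"):
    with g.name_scope("bar"):
        v_name = g.get_variable_name("v")
        x_name = g.op_name("add")
print(v_name, x_name)
```

In this model the variable name picks up only the variable_scope prefix, while the op name picks up both, which mirrors the two assertions in the snippet above (minus the `:0` output suffix).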

The difference and relationship between tf.Variable() and tf.get_variable() in TensorFlow

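Reference: https://blog.csdn.net/kevindree/article/details/86936476
Judging by the names alone, Variable defines a variable, while get_variable gets a variable (and defines a new one if none can be found). That reading already captures most of the difference between the two.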

The code below explores the differences and the characteristics of each in more depth.

Look at this code first (try to predict the result before reading the output in the comments below).

import tensorflow as tf
 
v1 = tf.Variable(1,name="V1")                     # statement 1
v2 = tf.Variable(2,name="V1")                     # statement 2
v3 = tf.Variable(3,name="V1")                     # statement 3
v4 = tf.Variable(4,name="V1_1")                   # statement 4
 
print ("v1:",v1.name)
print ("v2:",v2.name)
print ("v3:",v3.name)
print ("v4:",v4.name)
 
v1 = tf.Variable(1,name="V1")                     # statement 5
print ("v1:",v1.name)
 



### Output ###
# v1: V1:0
# v2: V1_1:0
# v3: V1_2:0
# v4: V1_1_1:0
# v1: V1_3:0

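This code shows that when variables are defined with the same name argument, the system automatically appends "_n" to the name, with n starting at 1 and increasing in order.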

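For example, statements 2 and 3 pass the same name as statement 1 (V1), so "_1" and "_2" are appended to "V1" respectively. In other words, even though the same name was specified, the system allocates new storage for each variable and generates a new name.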

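The name passed in statement 4 differs from those in the first three statements, but it equals the name the system generated for v2 in statement 2. The same rule therefore applies again and "_1" is appended, giving v4.name as V1_1_1.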

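Statement 5 is identical to statement 1, yet the system still follows the rule above: it allocates new storage and generates a new name. V1, V1_1, and V1_2 already exist, so the generated name is V1_3. The Python name is still v1, so v1 now refers to the new variable.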
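
The renaming rule described above can be modeled in a few lines of plain Python. This is only a sketch of the observable behavior, not TensorFlow's implementation; `NameRegistry` and `unique_name` are hypothetical names made up for illustration, and the `:0` output suffix that real variable names carry is left out.

```python
class NameRegistry:
    """Toy stand-in for the table of names already in use in a graph."""

    def __init__(self):
        self._in_use = {}  # name -> number of times it has been requested

    def unique_name(self, name):
        count = self._in_use.get(name, 0)
        self._in_use[name] = count + 1
        if count == 0:
            return name  # first request keeps the name unchanged
        # Otherwise append "_<n>", skipping suffixes that are already taken.
        candidate = name
        while candidate in self._in_use:
            candidate = "%s_%d" % (name, count)
            count += 1
        self._in_use[candidate] = 1
        return candidate


registry = NameRegistry()
# Same sequence of requested names as statements 1 to 5 above.
names = [registry.unique_name(n) for n in ["V1", "V1", "V1", "V1_1", "V1"]]
print(names)
```

Under this model the five requests come back as V1, V1_1, V1_2, V1_1_1, and V1_3, the same sequence as the output listed earlier. The key detail is that a generated name such as V1_1 is itself recorded as taken, which is why statement 4 ends up as V1_1_1.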

Now see what happens with tf.get_variable (run this as a new program so it does not interfere with the code above).

import tensorflow as tf
 
v5 = tf.get_variable(name="V1",initializer=1)
v6 = tf.get_variable(name="V1",initializer=1)
print ("v5:",v5.name)
print ("v6:",v6.name)

As you may have guessed, this raises an error: ValueError: Variable V1 already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope?

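Outside a reusing variable_scope (covered later), get_variable does not automatically resolve name clashes between variables created by get_variable(). Note the qualifier "created by get_variable()". Does that mean a clash with a variable created by Variable() is handled? Let's try: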

import tensorflow as tf
 
v1 = tf.Variable(1,name="V1")
print ("v1:",v1.name)
 
v5 = tf.get_variable(name="V1",initializer=1)
v6 = tf.get_variable(name="V1_1",initializer=1)
print ("v5:",v5.name)
print ("v6:",v6.name)





### Output ###
# v1: V1:0
# v5: V1_1:0
# v6: V1_1_1:0

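A little surprising? There is no error. Following the rule above, the system created new storage for v5 and generated a name for it. Why is there no error for v6 either, given that the system already generated name=V1_1 for v5? Go back to the qualifier "created by get_variable()": the name V1_1 was generated by the system, not passed to get_variable(), so the system can still handle the clash and appends "_1".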

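This may be a bit convoluted, but it becomes clear with some thought. Knowing these behaviors helps you avoid many strange problems when writing programs.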

Next, variable_scope and shared variables. Look at the code first:

import tensorflow as tf
 
with tf.variable_scope("scope1"):
    v1 = tf.Variable(1, name="V1")
    v2 = tf.get_variable(name="V2", initializer=1.0)
 
with tf.variable_scope("scope1", reuse=True):
    v3 = tf.Variable(1, name="V1")
    v4 = tf.get_variable(name="V2", initializer=1.0)
 
print(v1.name)
print(v2.name)
print(v3.name)
print(v4.name)
 
print(v1 is v3, v2 is v4)
 




### Output ###
# scope1/V1:0
# scope1/V2:0
# scope1_1/V1:0
# scope1/V2:0
# False True

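As the code shows, v1 and v3 are defined with Variable() under the same scope name, with reuse set to True and the same name V1, yet the system still creates a new variable. The variable's own name is kept, but the scope prefix is automatically changed to scope1_1.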

v2 and v4, defined with get_variable(), are in the same scope with reuse set to True and the same name V2, so the system does not create a new variable. It shares the existing one, i.e. v2 is v4.

I hope this is helpful!

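As for the post above, I think the names themselves matter little: at bottom, Python works by reference (name binding), and TensorFlow is a computation graph, so the code is not hard to follow. What puzzled me most is the last snippet, which mixes tf.Variable and tf.get_variable under a variable_scope. tf.Variable behaves oddly, changing the enclosing scope prefix outright, while tf.get_variable keeps working normally. This shows that variable_scope is essentially a naming qualifier and not a complete notion of variable scoping. It behaves differently depending on which function is used; it is a TensorFlow naming mechanism rather than a real scope. In fact, if you set reuse to False, or even use tf.Variable to create a different Variable under a different name, the scope prefix is still changed automatically for tf.Variable. The official guidance also says that variable_scope only has its intended effect when paired with tf.get_variable; pairing variable_scope with tf.Variable is not what it was designed for.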

That said, the post above is well written, though not comprehensive. Similar posts include:
https://blog.csdn.net/Jerr__y/article/details/70809528
https://blog.csdn.net/u012223913/article/details/78533910
https://blog.csdn.net/u012436149/article/details/53018924
https://www.cnblogs.com/MY0213/p/9208503.html
https://www.jianshu.com/p/ab0d38725f88

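There are also plenty of posts that aim to be comprehensive, but they are not careful enough and miss many details, for example:
the use of tf.AUTO_REUSE, the use of get_variable_scope().reuse_variables(), and so on.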

Differences in usage between tf.name_scope and tf.variable_scope

Reference: https://blog.csdn.net/daizongxue/article/details/84284007
This part is compiled from answers and blog posts found online.

To explain this clearly, we need to consider the two ways of creating variables in TensorFlow, tf.get_variable() and tf.Variable().

Under tf.name_scope:

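The name of a variable created with tf.get_variable() is not affected by tf.name_scope, i.e. the variable name does not get the name_scope prefix. Also, unless sharing is enabled, a duplicate name raises an error.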

tf.Variable() automatically detects duplicate names and resolves them itself.

To share variables, use tf.variable_scope().

Analysis 1: https://www.zhihu.com/question/54513728/answer/515912730

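For tf.Variable, tf.name_scope and tf.variable_scope do the same thing: both add a prefix to the variable name, which amounts to grouping and modularizing variables.
For tf.get_variable, tf.name_scope has no effect. In other words, when you use tf.get_variable, TensorFlow treats the variable as belonging only to tf.variable_scope, which controls whether it is shared.
Here is an example: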

with tf.name_scope('name_sp1') as scp1:
    with tf.variable_scope('var_scp2') as scp2:
        with tf.name_scope('name_scp3') as scp3:
            a = tf.Variable(0.0, name='a')   # name_sp1/var_scp2/name_scp3/a:0
            b = tf.get_variable('b', [1])    # var_scp2/b:0

which is equivalent to:

with tf.name_scope('name_sp1') as scp1:
    with tf.name_scope('var_scp2') as scp2:
        with tf.name_scope('name_scp3') as scp3:
            a = tf.Variable(0.0, name='a')
 
with tf.variable_scope('var_scp2') as scp2:
    b = tf.get_variable('b', [1])

Note that:

with tf.variable_scope('scp', reuse=True) as scp:
    a = tf.get_variable('a', [1])  # error: 'scp/a' does not exist yet

and

with tf.variable_scope('scp', reuse=False) as scp:
    a = tf.get_variable('a', [1])
    a = tf.get_variable('a', [1])  # error: 'scp/a' already exists

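both raise errors: with reuse=True, get_variable is forced to share and raises an error if the variable does not exist; with reuse=False, it is forced to create and raises an error if the variable already exists.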

To get "share if it exists, create otherwise" behavior, use:

with tf.variable_scope('scp', reuse=tf.AUTO_REUSE) as scp:
    a = tf.get_variable('a', [1])  # does not exist: create
    a = tf.get_variable('a', [1])  # exists: share
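
The three reuse modes can be summarized as a small lookup rule. The sketch below models that rule in plain Python so it can be run without TensorFlow; it is an illustration of the behavior described here, not the library's code. `ToyVariableStore` and the `AUTO_REUSE` constant are hypothetical stand-ins, and variables are represented by bare Python objects.

```python
AUTO_REUSE = "auto"  # hypothetical stand-in for tf.AUTO_REUSE


class ToyVariableStore:
    """Toy model of the lookup rule; variables are plain Python objects."""

    def __init__(self):
        self._vars = {}

    def get_variable(self, full_name, reuse=False):
        exists = full_name in self._vars
        if reuse is True and not exists:
            raise ValueError("%s does not exist, cannot share" % full_name)
        if not reuse and exists:
            raise ValueError("%s already exists, cannot create" % full_name)
        if not exists:
            self._vars[full_name] = object()
        return self._vars[full_name]


store = ToyVariableStore()
a1 = store.get_variable("scp/a", reuse=AUTO_REUSE)  # missing, so created
a2 = store.get_variable("scp/a", reuse=AUTO_REUSE)  # present, so shared
a3 = store.get_variable("scp/a", reuse=True)        # present, so shared

errors = []
for name, reuse in [("scp/a", False), ("scp/b", True)]:
    try:
        store.get_variable(name, reuse=reuse)
    except ValueError as exc:
        errors.append(str(exc))
print(a1 is a2, a1 is a3, errors)
```

Sharing in this model means getting back the very same object, which is what the `v1 is v` style assertions elsewhere in this post check for.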

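When you write a graph-building module (e.g. LSTM_block()), you do not know whether users will share it. So group the module's variables with tf.variable_scope and create them with tf.get_variable so that sharing is possible; do not use tf.Variable.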

For example, suppose we want to build a NewAutoEncoder with one encoder and two decoders, where the two decoders share their variables.

First, define a Dense layer:

def Dense(x, x_dim, y_dim, name, reuse=None):
 
    with tf.variable_scope(name, reuse=reuse):
        w = tf.get_variable('weight', [x_dim, y_dim])
        b = tf.get_variable('bias', [y_dim])
        y = tf.add(tf.matmul(x, w), b)
    return y
 
 
def Encoder(x, name):
 
    with tf.variable_scope(name, reuse=None):
        x = tf.nn.relu(Dense(x, 784, 1000, 'layer1', reuse=False))
        x = tf.nn.relu(Dense(x, 1000, 1000, 'layer2', reuse=False))
        x = Dense(x, 1000, 10, 'layer3', reuse=False)
    return x
 
def Decoder(x, name, reuse=None):
 
    with tf.variable_scope(name, reuse=reuse):
        x = tf.nn.relu(Dense(x, 10, 1000, 'layer1', reuse=False))
        x = tf.nn.relu(Dense(x, 1000, 1000, 'layer2', reuse=False))
        x = tf.nn.sigmoid(Dense(x, 1000, 784, 'layer3', reuse=False))
    return x
 
def build_network(x):
    batchsz = 32
    x_ph = tf.placeholder(tf.float32, [batchsz, 784], name='input')
    z_ph = tf.placeholder(tf.float32, [1, 10], name='z')
 
    x = Encoder(x_ph, 'Encoder')
    x_hat = Decoder(x, 'Decoder1', reuse=None)
    x_hat2 = Decoder(z_ph, 'Decoder1', reuse=True)  # same scope name, so the variables are shared
 
    # ...

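Reading the code: every module should be wrapped in a tf.variable_scope with a name argument, as Dense(), Encoder(), and Decoder() are. For modules that will definitely not be shared, such as the Encoder here, the reuse argument can be omitted.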

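Note: when several tf.variable_scope blocks are nested and some level sets reuse=True, all inner levels share automatically, even if an inner level sets reuse=False. Also, once tf.get_variable_scope().reuse_variables() has turned on sharing for the current scope, it cannot be turned off again!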

Summary:

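tf.variable_scope and tf.get_variable must be used together (except in the global scope) to support sharing.
tf.Variable can be used on its own or together with tf.name_scope to group, name, and modularize variables.
Combining tf.Variable with tf.variable_scope is neither one thing nor the other and is not what the designers intended.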

TensorFlow basics: shared variables https://www.jianshu.com/p/ab0d38725f88

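The reuse flag of a scope defaults to False. Calling reuse_variables() sets it to True; once it is True it cannot be set back to False, and every sub-scope of that scope also has reuse set to True. If you do not want to reuse variables, step back to the enclosing scope, which amounts to exiting the current scope, for example: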

with tf.variable_scope("root"):
    # At start, the scope is not reusing.
    assert tf.get_variable_scope().reuse == False
    with tf.variable_scope("foo"):
        # Opened a sub-scope, still not reusing.
        assert tf.get_variable_scope().reuse == False
    with tf.variable_scope("foo", reuse=True):
        # Explicitly opened a reusing scope.
        assert tf.get_variable_scope().reuse == True
        with tf.variable_scope("bar"):
            # Now sub-scope inherits the reuse flag.
            assert tf.get_variable_scope().reuse == True
    # Exited the reusing scope, back to a non-reusing one.
    assert tf.get_variable_scope().reuse == False

A scope object can be passed as the argument when opening a new scope, for example:

with tf.variable_scope("foo") as foo_scope:
    v = tf.get_variable("v", [1])
with tf.variable_scope(foo_scope):
    w = tf.get_variable("w", [1])
with tf.variable_scope(foo_scope, reuse=True):
    v1 = tf.get_variable("v", [1])
    w1 = tf.get_variable("w", [1])
assert v1 is v
assert w1 is w

No matter how scopes are nested, opening a previously captured scope object with with tf.variable_scope() jumps to that scope.

with tf.variable_scope("foo") as foo_scope:
    assert foo_scope.name == "foo"
with tf.variable_scope("bar"):
    with tf.variable_scope("baz") as other_scope:
        assert other_scope.name == "bar/baz"
        with tf.variable_scope(foo_scope) as foo_scope2:
            assert foo_scope2.name == "foo"  # Not changed.

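Here are two more write-ups that are fairly detailed, though I think their underlying technical details are not quite accurate, namely on variable scoping and on tf.AUTO_REUSE and reuse_variables().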
http://wiki.jikexueyuan.com/project/tensorflow-zh/how_tos/variable_scope.html
https://blog.csdn.net/hancoder/article/details/90053736

After reading many more posts, I found these two to be very good, both detailed and comprehensive:
https://blog.csdn.net/qq_22522663/article/details/78729029
https://www.cnblogs.com/MY0213/p/9208503.html
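The most important statement in them:
When reuse is False or None (the default), variable names under the same tf.variable_scope must not repeat; when reuse is True, tf.variable_scope can only fetch variables that have already been created.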

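OK, after reading the posts above you should understand nearly all of the relationships and differences among tf.Variable, tf.get_variable, tf.variable_scope, and tf.name_scope. What remains unresolved is tf.AUTO_REUSE and tf.get_variable_scope().reuse_variables(), two features that make usage more flexible.
A few points need attention, however: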