[TensorFlow] Variable management: using tf.get_variable / tf.Variable / tf.variable_scope (Part 7)

brucewong0516

Article link: https://blog.csdn.net/brucewong0516/article/details/78788772

Note: adapted from the book TensorFlow实战.

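In TensorFlow, creating or fetching a variable by name is implemented mainly through tf.get_variable and tf.variable_scope. When the goal is simply to create a variable, tf.get_variable() and tf.Variable() are essentially equivalent.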

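The variable scope mechanism in TensorFlow consists of two main parts:

tf.get_variable(name, shape, initializer): creates a variable with the given name, or returns the existing one.
tf.variable_scope(scope_name): manages the namespace for the names passed to tf.get_variable().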

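The biggest difference between tf.get_variable() and tf.Variable() is the parameter that names the variable. For tf.Variable() the name is optional and is given with the name= keyword argument; for tf.get_variable() it is required. To make tf.get_variable() fetch a variable that has already been created, use tf.variable_scope() to generate a context manager and state explicitly (reuse=True) that tf.get_variable() should return the existing variable.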

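import tensorflow as tf  # TensorFlow 1.x API
# There are two main ways to obtain a variable. tf.get_variable raises an error when a
# variable with the same name already exists, so to fetch an existing variable it has to
# be used inside a tf.variable_scope with reuse=True.
v = tf.get_variable('v', shape=[1], initializer=tf.constant_initializer(1.0))
# Defining a variable directly with tf.Variable does not raise this error and can be repeated
vc = tf.Variable(tf.constant(1.0, shape=[1]), name='v')
print(vc)
# Below, tf.get_variable is combined with tf.variable_scope in a with block;
# with reuse=True, v must already have been defined
with tf.variable_scope('', reuse=True):
    v = tf.get_variable('v', shape=[1], initializer=tf.constant_initializer(1.0))
    print(v)
    v1 = tf.get_variable('v', shape=[1], initializer=tf.constant_initializer(1.0))
    print(v1 == v)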

The shape and initializer arguments passed to tf.get_variable are similar to the initialization arguments passed to tf.Variable.

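The initializer functions provided by TensorFlow correspond one-to-one to its random-number and constant generation functions:

tf.constant_initializer: initializes the variable to a given constant
tf.random_normal_initializer: initializes the variable to random values drawn from a normal distribution
tf.truncated_normal_initializer: initializes the variable to random values drawn from a normal distribution, redrawing any value that lies more than two standard deviations from the mean
tf.random_uniform_initializer: initializes the variable to random values drawn from a uniform distribution
tf.zeros_initializer: sets every element of the variable to 0
tf.ones_initializer: sets every element of the variable to 1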
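
To make the "redraw beyond two standard deviations" rule of the truncated normal concrete, here is a minimal pure-Python sketch. It only illustrates the rejection idea and is not TensorFlow's implementation; the function name truncated_normal and its parameters are made up for this example.

```python
import random

def truncated_normal(mean, stddev, rng):
    """Draw one sample, redrawing while it lies more than 2 stddev from the mean."""
    while True:
        x = rng.gauss(mean, stddev)
        if abs(x - mean) <= 2.0 * stddev:
            return x

# Seeded generator so the run is repeatable
rng = random.Random(0)
samples = [truncated_normal(0.0, 0.1, rng) for _ in range(1000)]

# Every sample stays within two standard deviations (0.2) of the mean
print(all(abs(s) <= 0.2 for s in samples))  # True
```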

import tensorflow as tf
# Create a variable named v in the namespace foo
with tf.variable_scope("foo"):
    # Create v, initialized to the constant 1
    v = tf.get_variable('v', [1], initializer=tf.constant_initializer(1.0))
# v has already been created in the namespace foo, so the following code would raise an error
#with tf.variable_scope("foo"):
#    v = tf.get_variable('v', [1])
# Setting reuse=True when creating the context manager makes tf.get_variable fetch the already declared variable.
# The scope name must be the foo namespace where v was defined, not an unnamed scope
# tf.variable_scope("") or any other namespace.
with tf.variable_scope("foo", reuse=True):
    v1 = tf.get_variable('v', [1])
    print(v1 == v)  # Prints True: v1 and v are the same variable

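tf.variable_scope controls the semantics of tf.get_variable. With reuse=True, every tf.get_variable call inside the context manager fetches an already created variable, and raises an error if the variable does not exist. Conversely, with reuse=None or reuse=False, tf.get_variable creates a new variable, and raises an error if a variable with the same name already exists.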
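
The two error cases can be mimicked with a plain dictionary. The sketch below is a toy model of the rule just described, not TensorFlow code; ToyVariableStore and its method are hypothetical names used only for illustration.

```python
class ToyVariableStore:
    """Toy model of the create-or-fetch rule; not TensorFlow's implementation."""

    def __init__(self):
        self._vars = {}

    def get_variable(self, name, value=None, reuse=False):
        if reuse:
            # reuse=True: only fetch, never create
            if name not in self._vars:
                raise ValueError("Variable %s does not exist" % name)
            return self._vars[name]
        # reuse=False: only create, never fetch
        if name in self._vars:
            raise ValueError("Variable %s already exists" % name)
        self._vars[name] = [value]
        return self._vars[name]

store = ToyVariableStore()
v = store.get_variable("foo/v", 1.0)           # creates foo/v
v1 = store.get_variable("foo/v", reuse=True)   # fetches the same object
print(v1 is v)  # True

for name, reuse in [("foo/v", False), ("bar/v", True)]:
    try:
        store.get_variable(name, 1.0, reuse=reuse)
    except ValueError as e:
        print(e)
```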

An example of nested tf.variable_scope:

import tensorflow as tf
with tf.variable_scope('root'):
    # tf.get_variable_scope().reuse returns the current value of the reuse parameter
    print(tf.get_variable_scope().reuse)  # False
    with tf.variable_scope('foo', reuse=True):
        print(tf.get_variable_scope().reuse)  # True
        with tf.variable_scope('bar'):
            print(tf.get_variable_scope().reuse)  # True: when reuse is not specified, it takes the value of the enclosing scope
    print(tf.get_variable_scope().reuse)  # False: after leaving the reuse=True context it is False again

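The context manager generated by tf.variable_scope also creates a TensorFlow namespace. The names of variables created inside it are prefixed with the namespace name, so tf.variable_scope can also be used to manage variable namespaces.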

import tensorflow as tf
with tf.variable_scope("bar"):
    v0 = tf.get_variable('v', [1], initializer=tf.constant_initializer(1.0))
with tf.variable_scope("bar", reuse=True):
    v2 = tf.get_variable('v', [1])
    print(v2.name)  # The name carries the namespace prefix: bar/v:0
# tf.variable_scope can be nested. When a nested context manager does not declare reuse,
# it takes the reuse value of the enclosing one. Namespace names follow the same nesting rule.

Using tf.variable_scope and tf.get_variable, the forward-propagation code can be rewritten.

Original version:

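# INPUT_NODE, LAYER1_NODE and OUTPUT_NODE are the layer sizes, assumed to be defined earlier
# Create the hidden-layer parameters
weights1 = tf.Variable(tf.truncated_normal([INPUT_NODE, LAYER1_NODE], stddev=0.1))
biases1 = tf.Variable(tf.constant(0.1, shape=[LAYER1_NODE]))

# Create the output-layer parameters
weights2 = tf.Variable(tf.truncated_normal([LAYER1_NODE, OUTPUT_NODE], stddev=0.1))
biases2 = tf.Variable(tf.constant(0.1, shape=[OUTPUT_NODE]))

def inference(input_tensor, avg_class, weights1, biases1, weights2, biases2):
    # When no moving-average class is provided, use the current values of the parameters
    if avg_class is None:
        # Compute the hidden layer's forward pass, using the ReLU activation
        layer1 = tf.nn.relu(tf.matmul(input_tensor, weights1) + biases1)
        # Compute the output layer's forward pass. softmax is computed together with the loss,
        # so no activation is needed here, and leaving it out does not affect predictions:
        # prediction only uses the relative sizes of the output values of the nodes for the
        # different classes, so the final classification is the same with or without softmax.
        return tf.matmul(layer1, weights2) + biases2
    # Otherwise, use the moving averages
    else:
        # First use avg_class.average to get the moving average of each variable,
        # then compute the forward pass with those values.
        layer1 = tf.nn.relu(tf.matmul(input_tensor, avg_class.average(weights1)) + avg_class.average(biases1))
        return tf.matmul(layer1, avg_class.average(weights2)) + avg_class.average(biases2)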

Improved version:

'''
with tf.variable_scope('boo'):
    v = tf.get_variable('v',[1],initializer = tf.constant_initializer(1.0))
    print(v.name)
with tf.variable_scope('koo'):
    v = tf.get_variable('v',[1],initializer = tf.constant_initializer(1.0))
    print(v.name)
# Variables created in different namespaces are different variables
'''
def inference(input_tensor, reuse=False):
    # The reuse argument decides whether to create new variables or use the ones already created.
    # The first time the network is built, new variables are created; every later call passes
    # reuse=True, so the variables never need to be passed in as arguments.
    with tf.variable_scope('layer1', reuse=reuse):
        weights = tf.get_variable('weights', [INPUT_NODE, LAYER1_NODE], initializer=tf.truncated_normal_initializer(stddev=0.1))
        biases = tf.get_variable('biases', [LAYER1_NODE], initializer=tf.constant_initializer(0.0))
        layer1 = tf.nn.relu(tf.matmul(input_tensor, weights) + biases)
    with tf.variable_scope('layer2', reuse=reuse):
        weights = tf.get_variable('weights', [LAYER1_NODE, OUTPUT_NODE], initializer=tf.truncated_normal_initializer(stddev=0.1))
        biases = tf.get_variable('biases', [OUTPUT_NODE], initializer=tf.constant_initializer(0.0))
        # As in the original version, the output layer has no activation function
        layer2 = tf.matmul(layer1, weights) + biases
    return layer2
x = tf.placeholder(tf.float32, [None, INPUT_NODE], name='x-input')
y = inference(x)

new_x = ...
new_y = inference(new_x, True)

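With the code above, the variables no longer have to be passed as arguments to every function. As the network becomes more complex and has more parameters, this greatly improves the readability of the program.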

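The snippets above are meant to be run separately: they reuse the same scope names, so running them together in one session raises errors. A complete example follows: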

import tensorflow as tf

# Create a variable named v in the namespace foo
with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1], initializer=tf.constant_initializer(1.0))

# The namespace foo already contains v, so creating it again raises an error;
# the error is caught here so that the rest of the script can run
try:
    with tf.variable_scope("foo"):
        v = tf.get_variable("v", [1])
except ValueError as e:
    print(e)
# ValueError: Variable foo/v already exists, disallowed.
# Did you mean to set reuse=True in VarScope?

# With reuse=True, tf.get_variable fetches the already declared variable
with tf.variable_scope("foo", reuse=True):
    v1 = tf.get_variable("v", [1])
    print(v == v1) # True


# With reuse=True, tf.get_variable can only fetch variables already created in the given namespace
try:
    with tf.variable_scope("bar", reuse=True):
        v2 = tf.get_variable("v", [1])
except ValueError as e:
    print(e)
# ValueError: Variable bar/v does not exist, or was not created with
# tf.get_variable(). Did you mean to set reuse=None in VarScope?


with tf.variable_scope("root"):
    # tf.get_variable_scope().reuse returns the reuse value of the current context manager
    print(tf.get_variable_scope().reuse) # False

    with tf.variable_scope("foo1", reuse=True):
        print(tf.get_variable_scope().reuse) # True

        with tf.variable_scope("bar1"):
            # bar1 is nested inside foo1 and does not specify reuse, so it takes the value of the enclosing scope
            print(tf.get_variable_scope().reuse) # True

    print(tf.get_variable_scope().reuse) # False

# tf.variable_scope provides a way to manage variable namespaces
u1 = tf.get_variable("u", [1])
print(u1.name) # u:0
with tf.variable_scope("foou"):
    u2 = tf.get_variable("u", [1])
    print(u2.name) # foou/u:0

with tf.variable_scope("foou"):
    with tf.variable_scope("baru"):
        u3 = tf.get_variable("u", [1])
        print(u3.name) # foou/baru/u:0

    u4 = tf.get_variable("u1", [1])
    print(u4.name) # foou/u1:0

# A variable can be fetched directly by its name qualified with the namespace
with tf.variable_scope("", reuse=True):
    u5 = tf.get_variable("foou/baru/u", [1])
    print(u5.name)  # foou/baru/u:0
    print(u5 == u3) # True
    u6 = tf.get_variable("foou/u1", [1])
    print(u6.name)  # foou/u1:0
    print(u6 == u4) # True

A final note: everything above was run as a script in Spyder, so running it a second time raises an error:

ValueError: Variable foo/v does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=tf.AUTO_REUSE in VarScope?

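The workaround is to restart the kernel after each run. The cause is that Spyder keeps the same Python kernel alive between runs, so the variables that the previous run registered in TensorFlow's default graph are still there when the script runs again. Calling tf.reset_default_graph() at the top of the script should clear them as well.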
