A Summary of Commonly Used TensorFlow Methods (Continuously Updated)

Latest recommended article published on 2024-08-25 18:49:06

woaipichuli

Category: Tensorflow  Tags: Tensorflow, API

Article link: https://blog.csdn.net/woaipichuli/article/details/78612946

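I recently started using TensorFlow to study and practice deep reinforcement learning, and found that the TensorFlow material available online is rather scattered. In addition, the material I bought is mostly organized around walking through project code, which makes learning very slow. I am therefore keeping this post, updated as I go, to record the commonly used functions and methods I come across while learning.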

1-tf.InteractiveSession()

sess = tf.InteractiveSession()

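Here InteractiveSession creates a session and registers it as the default session. Calls that are not given an explicit session, such as Tensor.eval() and Operation.run(), run in the default session. The operations and data of different sessions are independent of each other.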

2-tf.placeholder()

tf.placeholder(dtype, shape=None, name=None)

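Creates a graph node that must be fed with data. Until a value is fed, neither the node itself nor any operation that depends on it can be evaluated; evaluating it without feeding raises an error. Values are usually supplied as a dictionary through the feed_dict argument.

a. dtype: data type of the values to be fed
b. shape: shape of the data to be fed; the first dimension is usually set to None so that the number of input samples is not fixed.
c. name: name of the node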

3-tf.Variable()

W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))

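Creates variables that store model parameters. They persist throughout training (for example, in GPU memory). If a variable with the same name already exists, tf.Variable handles the clash automatically by making the new name unique (for example, w and w_1).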

4-tf.get_variable(name, shape=None, dtype=tf.float32, initializer=None, trainable=True, collections=None)

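get_variable is another way to create variables in TensorFlow. Unlike tf.Variable, it uses name either to create a new variable or to look up and return an existing variable with that name. However, if the variable to be created already exists and reuse has not been enabled, an error is raised.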

import tensorflow as tf

w_1 = tf.get_variable(name="w_1",initializer=1)
w_2 = tf.get_variable(name="w_1",initializer=2)

Error message: #ValueError: Variable w_1 already exists, disallowed. Did you mean to set reuse=True in VarScope?

import tensorflow as tf

with tf.variable_scope("scope1"):
    w1 = tf.get_variable("w1", shape=[])
    w2 = tf.Variable(0.0, name="w2")
with tf.variable_scope("scope1", reuse=True):
    w1_p = tf.get_variable("w1", shape=[])
    w2_p = tf.Variable(1.0, name="w2")

print(w1 is w1_p, w2 is w2_p)

Output: #True False
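
The create-or-look-up rule can be mimicked in a few lines of plain Python. This is only a sketch of the idea, not TensorFlow's implementation: VariableStore and its get_variable method are hypothetical names, and a one-element list stands in for a variable object.

```python
class VariableStore:
    """Toy registry that mimics the create-or-reuse rule of tf.get_variable."""

    def __init__(self):
        self._vars = {}  # full name -> stored variable object

    def get_variable(self, name, initializer=None, reuse=False):
        if reuse:
            # Reuse mode: the variable must already exist.
            if name not in self._vars:
                raise ValueError("Variable %s does not exist" % name)
            return self._vars[name]
        # Creation mode: creating the same name twice is an error.
        if name in self._vars:
            raise ValueError("Variable %s already exists" % name)
        self._vars[name] = [initializer]  # a list stands in for a mutable variable
        return self._vars[name]


store = VariableStore()
w1 = store.get_variable("scope1/w1", initializer=0.0)
w1_p = store.get_variable("scope1/w1", reuse=True)
print(w1 is w1_p)  # True: both names refer to the same object

try:
    store.get_variable("scope1/w1", initializer=1.0)
    duplicate_error = None
except ValueError as err:
    duplicate_error = str(err)
print(duplicate_error)  # Variable scope1/w1 already exists
```

tf.Variable has no such registry lookup, which is why w2 and w2_p in the TensorFlow example are two different objects.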

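5-tf.variable_scope(name)&tf.name_scope(name)

How are variables told apart in TensorFlow? This is mainly done with tf.variable_scope and tf.name_scope. Much like namespaces in C++, which distinguish functions or variables of the same name, these two functions add a name prefix to the variables created inside them.
Three points summarize their behavior:
1-Variables created with tf.Variable() are named under the influence of both tf.variable_scope and tf.name_scope.
2-Variables created with tf.get_variable() are named under the influence of tf.variable_scope only; tf.name_scope does not affect them.
3-The prefixes are inherited by nested scopes. This inheritance can also be used to set a common initializer or to share and reuse variables.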
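
Points 1 and 2 can be illustrated with a small plain-Python sketch. It does not call TensorFlow: resolved_name is a hypothetical helper that only mimics how the prefixes are assembled, and it leaves out the :0 suffix that TensorFlow appends to tensor names.

```python
def resolved_name(created_with, scopes, name):
    """Return the prefixed name a variable would receive.

    scopes is a list of (kind, prefix) pairs, outermost first, where kind is
    either "variable_scope" or "name_scope".
    """
    if created_with == "tf.Variable":
        # tf.Variable picks up the prefixes of both kinds of scope.
        prefixes = [prefix for _, prefix in scopes]
    elif created_with == "tf.get_variable":
        # tf.get_variable ignores name scopes.
        prefixes = [prefix for kind, prefix in scopes if kind == "variable_scope"]
    else:
        raise ValueError("unknown creation function: %s" % created_with)
    return "/".join(prefixes + [name])


scopes = [("name_scope", "ns"), ("variable_scope", "vs")]
print(resolved_name("tf.Variable", scopes, "w"))      # ns/vs/w
print(resolved_name("tf.get_variable", scopes, "w"))  # vs/w
```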

with tf.variable_scope("root"):
    # At start, the scope is not reusing.
    assert tf.get_variable_scope().reuse == False
    with tf.variable_scope("foo"):
        # Opened a sub-scope, still not reusing.
        assert tf.get_variable_scope().reuse == False
    with tf.variable_scope("foo", reuse=True):
        # Explicitly opened a reusing scope.
        assert tf.get_variable_scope().reuse == True
        with tf.variable_scope("bar"):
            # Now sub-scope inherits the reuse flag.
            assert tf.get_variable_scope().reuse == True
    # Exited the reusing scope, back to a non-reusing one.
    assert tf.get_variable_scope().reuse == False

with tf.variable_scope("foo", initializer=tf.constant_initializer(0.4)):
    v = tf.get_variable("v", [1])
    assert v.eval() == 0.4 # Default initializer as set above.
    w = tf.get_variable("w", [1], initializer=tf.constant_initializer(0.3))
    assert w.eval() == 0.3 # Specific initializer overrides the default.
    with tf.variable_scope("bar"):
        v = tf.get_variable("v", [1])
        assert v.eval() == 0.4 # Inherited default initializer.
    with tf.variable_scope("baz", initializer=tf.constant_initializer(0.2)):
        v = tf.get_variable("v", [1])
        assert v.eval() == 0.2 # Changed default initializer.

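Variables and operations in different sessions are distinct; within the same session, variables or operations with different names are distinct.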

6-tf.reduce_X()
Functions that reduce a tensor along one or more of its dimensions, for example:

#maximum
tf.reduce_max(input_tensor, reduction_indices=None, keep_dims=False, name=None)
#minimum
tf.reduce_min(input_tensor, reduction_indices=None, keep_dims=False, name=None)
#mean
tf.reduce_mean(input_tensor, reduction_indices=None, keep_dims=False, name=None)
#sum
tf.reduce_sum(input_tensor, reduction_indices=None, keep_dims=False, name=None)
#product
tf.reduce_prod(input_tensor, reduction_indices=None, keep_dims=False, name=None)

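a.input_tensor: the tensor to reduce
b.reduction_indices: the dimension(s) to reduce over
c.keep_dims: if True, each reduced dimension is kept with length 1, so the result has the same rank as the input; otherwise the reduced dimensions are removed (reducing over all dimensions then yields a scalar).
d.name: name of the operation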
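
The effect of keep_dims on the shape of the result can be sketched without TensorFlow. The helper reduce_sum_2d below is a hypothetical stand-in that works on nested lists and only handles the 2-D case.

```python
def reduce_sum_2d(rows, axis, keep_dims=False):
    """Sum a 2-D nested list along axis 0 or 1, mimicking the keep_dims flag."""
    if axis == 1:
        sums = [sum(row) for row in rows]          # one value per row
        return [[s] for s in sums] if keep_dims else sums
    if axis == 0:
        sums = [sum(col) for col in zip(*rows)]    # one value per column
        return [sums] if keep_dims else sums
    raise ValueError("axis must be 0 or 1")


a = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]  # shape (3, 4)

print(reduce_sum_2d(a, axis=1))                  # [10, 26, 42], shape (3,)
print(reduce_sum_2d(a, axis=1, keep_dims=True))  # [[10], [26], [42]], shape (3, 1)
print(reduce_sum_2d(a, axis=0, keep_dims=True))  # [[15, 18, 21, 24]], shape (1, 4)
```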

import tensorflow as tf
import numpy as np

a = tf.constant(np.random.rand(3,4))
c1 = tf.reduce_sum(a, 1, keep_dims = False)
c2 = tf.reduce_sum(a, 1, keep_dims = True)
sess = tf.Session()
print(sess.run(c1))
print(sess.run(c2))
sess.close()
#c1=[ 0.75098078  1.70599565  1.8444463 ]
#c2=[[ 0.75098078]
#    [ 1.70599565]
#    [ 1.8444463 ]]

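7-tf.train.GradientDescentOptimizer()

One of the most frequently used optimizers when training deep learning models. TensorFlow provides many other optimizers as well (for example tf.train.AdamOptimizer); switching to one of them only requires changing the class name.

train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(target)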

8-gradient

Some models and algorithms (for example DDPG) need the gradient of a target with respect to other variables. TensorFlow provides a function for computing such gradients.

#tf.gradients(ys, xs, grad_ys=None, name='gradients', colocate_gradients_with_ops=False, gate_gradients=False, aggregation_method=None)

import tensorflow as tf

w1 = tf.Variable([[1,2]])
w2 = tf.Variable([[3,4]])

res = tf.matmul(w1, [[2],[1]])

grads = tf.gradients(res,[w1])

with tf.Session() as sess:
    tf.global_variables_initializer().run()
    re = sess.run(grads)
    print(re)
#  [array([[2, 1]], dtype=int32)]

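a.ys: the target tensor(s) to differentiate
b.xs: the tensor(s) to differentiate with respect to
c.grad_ys: externally supplied gradients, one per tensor in ys; each one acts as a weight that is multiplied element-wise with the gradient coming from the corresponding y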

import tensorflow as tf

w1 = tf.get_variable('w1', shape=[3])
w2 = tf.get_variable('w2', shape=[3])

w3 = tf.get_variable('w3', shape=[3])
w4 = tf.get_variable('w4', shape=[3])

z1 = w1 + w2+ w3
z2 = w3 + w4

grads = tf.gradients([z1, z2], [w1, w2, w3, w4], grad_ys=[tf.convert_to_tensor([2.,2.,3.]),
                                                          tf.convert_to_tensor([3.,2.,4.])])

with tf.Session() as sess:
    tf.global_variables_initializer().run()
    print(sess.run(grads))

#Output:
#[array([ 2.,  2.,  3.], dtype=float32),   w1: [1,0]×[[2.,2.,3.],[3.,2.,4.]]
# array([ 2.,  2.,  3.], dtype=float32),   w2
# array([ 5.,  4.,  7.], dtype=float32),   w3: [1,1]×[[2.,2.,3.],[3.,2.,4.]]
# array([ 3.,  2.,  4.], dtype=float32)]   w4
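
The numbers in this output follow from the chain rule with the grad_ys entries acting as incoming weights. As a short worked derivation, write g_1 = (2, 2, 3) and g_2 = (3, 2, 4) for the two grad_ys tensors (these symbols are introduced here only for illustration). Because z1 and z2 are element-wise sums, each partial derivative is either a vector of ones or zero:

```latex
\begin{aligned}
z_1 &= w_1 + w_2 + w_3, \qquad z_2 = w_3 + w_4,\\
\operatorname{grad}(w_k) &= g_1 \odot \frac{\partial z_1}{\partial w_k} + g_2 \odot \frac{\partial z_2}{\partial w_k},\\
\operatorname{grad}(w_1) &= g_1 = (2,\,2,\,3),\\
\operatorname{grad}(w_2) &= g_1 = (2,\,2,\,3),\\
\operatorname{grad}(w_3) &= g_1 + g_2 = (5,\,4,\,7),\\
\operatorname{grad}(w_4) &= g_2 = (3,\,2,\,4).
\end{aligned}
```

Only w3 appears in both z1 and z2, which is why its gradient is the sum of the two weights.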

9-apply_gradients

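Sometimes we want to accumulate gradients over several steps before updating the model. In that case, apply_gradients can be used to update the trainable variables.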

# --------------- Graph definition in TensorFlow ---------------
#### Model definition
observations = tf.placeholder(tf.float32, [None,D] , name="input_x")
W1 = tf.get_variable("W1", shape=[D, H],
           initializer=tf.contrib.layers.xavier_initializer())
layer1 = tf.nn.relu(tf.matmul(observations,W1))
W2 = tf.get_variable("W2", shape=[H, 1],
           initializer=tf.contrib.layers.xavier_initializer())
score = tf.matmul(layer1,W2)
probability = tf.nn.sigmoid(score)
#### Model definition

#### Get the trainable variables and define the placeholders that receive the buffered gradients
tvars = tf.trainable_variables()
W1Grad = tf.placeholder(tf.float32,name="batch_grad1") # Placeholders to send the final gradients through when we update.
W2Grad = tf.placeholder(tf.float32,name="batch_grad2")
batchGrad = [W1Grad,W2Grad]
#### Get the trainable variables and define the placeholders that receive the buffered gradients

#### Apply the buffered gradients to update the variables
adam = tf.train.AdamOptimizer(learning_rate=learning_rate) # Our optimizer
updateGrads = adam.apply_gradients(zip(batchGrad,tvars))
#### Apply the buffered gradients to update the variables

#### Compute the gradients with tf.gradients; the result is a list with one gradient tensor per variable
newGrads = tf.gradients(loss,tvars)
#### Compute the gradients with tf.gradients; the result is a list with one gradient tensor per variable
# --------------- Graph definition in TensorFlow ---------------



# --------------- Calls made during the actual computation ---------------
#### Compute the gradients, add them to the buffer, update the model parameters once the condition is met, then clear the buffer
tGrad = sess.run(newGrads,feed_dict={observations: epx, input_y: epy, advantages: discounted_epr})
for ix,grad in enumerate(tGrad):
    gradBuffer[ix] += grad
if episode_number % batch_size == 0:
    sess.run(updateGrads,feed_dict={W1Grad: gradBuffer[0],W2Grad:gradBuffer[1]})
    for ix,grad in enumerate(gradBuffer):
        gradBuffer[ix] = grad * 0
#### Compute the gradients, add them to the buffer, update the model parameters once the condition is met, then clear the buffer
# --------------- Calls made during the actual computation ---------------
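
The accumulate-then-apply pattern itself does not depend on TensorFlow. The following is a minimal plain-Python sketch under stated assumptions: a hypothetical one-parameter model w * x fitted with squared error, and plain gradient descent in place of the Adam optimizer used above. The variable grad_buffer plays the role of gradBuffer.

```python
# Hypothetical toy data: (x, y) pairs generated by y = 2 * x.
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]

w = 0.0               # the single trainable parameter
learning_rate = 0.01
batch_size = 2        # apply one update per 2 accumulated gradients
grad_buffer = 0.0     # accumulated gradient, cleared after every update
updates = 0

for step, (x, y) in enumerate(samples, start=1):
    grad = 2.0 * (w * x - y) * x      # d/dw of (w*x - y)**2
    grad_buffer += grad               # accumulate instead of updating right away
    if step % batch_size == 0:
        w -= learning_rate * grad_buffer   # apply the accumulated gradient once
        grad_buffer = 0.0                  # clear the buffer
        updates += 1

print(updates, w)  # two updates; w has moved from 0.0 toward 2.0
```

Note that the gradients inside one batch are all computed with the same value of w, because the parameter only changes when the buffer is applied.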

10-tf.stop_gradient()

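This function stops gradients from flowing back through a node, so no gradient is computed for variables that affect the result only through that node. The gradients of variables unrelated to the node are not affected.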

import tensorflow as tf

w1 = tf.Variable(2.0)
w2 = tf.Variable(2.0)

a = tf.multiply(w1, 3.0)
a_stoped = tf.stop_gradient(a)

# b=w1*3.0*w2
b = tf.multiply(a_stoped, w2)
gradients1 = tf.gradients(b, xs=[w1, w2])
gradients2 = tf.gradients(b, xs=[w2])

with tf.Session() as sess:
    tf.global_variables_initializer().run()
    # g1 = sess.run(gradients1) cannot be run, because the gradient with respect to w1 in gradients1 is None
    g2=sess.run(gradients2) 
    print(g2)
#Output
#[6.0]

Higher-order derivatives: these can be obtained by taking first-order derivatives repeatedly.

import tensorflow as tf

w1 = tf.Variable(2.0)
w2 = tf.Variable(2.0)

a = tf.multiply(w1, 3.0)
a_stoped = tf.stop_gradient(a)

# b = (w1*3.0)*w2*w2, where w1*3.0 is treated as a constant because of stop_gradient
b_ = tf.multiply(a_stoped, w2)
b = tf.multiply(b_, w2)
gradients1 = tf.gradients(b, xs=[w2])
gradients2 = tf.gradients(gradients1, xs=[w2])

with tf.Session() as sess:
    tf.global_variables_initializer().run()
    g1=sess.run(gradients1)
    g2=sess.run(gradients2)
    print(g1)
    print(g2)
 #Output
 #[24.0]    12*w2: TensorFlow substitutes the current value of w2 here, so the variables must be initialized before the model is run
 #[12.0]    12
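
The two values can be checked by hand. As a short worked derivation, let c denote the stopped node (a symbol introduced here only for illustration); with w1 = 2 it has the value 6 and is treated as a constant during differentiation:

```latex
\begin{aligned}
c &= \operatorname{stop\_gradient}(3\,w_1) = 6,\\
b &= c\,w_2^{2},\\
\frac{\partial b}{\partial w_2} &= 2\,c\,w_2 = 12\,w_2 = 24 \quad \text{at } w_2 = 2,\\
\frac{\partial^{2} b}{\partial w_2^{2}} &= 2\,c = 12.
\end{aligned}
```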

11-BasicLSTMCell

TensorFlow ships with many deep learning building blocks, such as the classic RNN and LSTM cells.

 lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(size, forget_bias=0.0, state_is_tuple=True)

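An LSTM cell has two state values, c and h, which correspond to c and h in the figure below. h is the output of the current time step and also part of the input to the next time step.
[Figure: structure of an LSTM cell, showing the states c and h]
When state_is_tuple=True, the state is a tuple, state=(c,h). When it is False, the state is a single tensor formed by concatenating c and h, state=tf.concat([c,h], 1). When called, the cell returns two values: the output h and the new state.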
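
To make the roles of c, h and the returned pair concrete, here is a minimal plain-Python sketch of one LSTM step with a hidden size of 1. It is an illustration under stated assumptions, not the library code: lstm_step and the weights dictionary are hypothetical, each gate has scalar weights (w_x, w_h, bias), and the gate names i, j, f, o follow the usual input, candidate, forget and output gates.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, state, weights, forget_bias=0.0):
    """One step of a scalar LSTM cell. state is the tuple (c, h)."""
    c, h = state

    def gate(name):
        # Every gate sees the current input x and the previous output h.
        w_x, w_h, bias = weights[name]
        return w_x * x + w_h * h + bias

    i, j, f, o = gate("i"), gate("j"), gate("f"), gate("o")
    new_c = c * sigmoid(f + forget_bias) + sigmoid(i) * math.tanh(j)
    new_h = math.tanh(new_c) * sigmoid(o)
    return new_h, (new_c, new_h)  # (output, new state); the output equals h


weights = {name: (0.5, 0.5, 0.0) for name in "ijfo"}  # hypothetical values
state = (0.0, 0.0)
outputs = []
for x in [1.0, -1.0, 0.5]:
    # The state returned by one step is passed into the next step.
    output, state = lstm_step(x, state, weights)
    outputs.append(output)

print(len(outputs), output == state[1])  # 3 True
```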

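11-MultiRNNCell()
In an RNN or LSTM, the data flow forms a loop: the output of one time step becomes part of the input of the next. In TensorFlow we only need to put the basic cells we have defined into MultiRNNCell, which stacks them into a single multi-layer cell. We then feed the data for each time step into this cell in order and pass the returned state on to the next step.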

cell = tf.nn.rnn_cell.MultiRNNCell([lstm_cell] * config.num_layers, state_is_tuple=True) 

# Initialize the state to zeros with rnn_cell.RNNCell.zero_state
self._initial_state = cell.zero_state(batch_size, data_type())

outputs = []
state = self._initial_state # state holds the state of every example in the batch
with tf.variable_scope("RNN"):
    for time_step in range(num_steps):
        if time_step > 0: tf.get_variable_scope().reuse_variables()
        # cell_output: [batch, hidden_size]
        (cell_output, state) = cell(inputs[:, time_step, :], state) # feed the text data to the cell one time step at a time
        outputs.append(cell_output)  # outputs: shape[num_steps][batch,hidden_size]

# Concatenate the list into [batch, hidden_size*num_steps], then reshape it to [batch*num_steps, hidden_size]
output = tf.reshape(tf.concat(outputs, 1), [-1, size])

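12-Common functions for unrolling an RNN over time
The code in the MultiRNNCell section shows that MultiRNNCell() connects the RNN sub-modules, after which the data is fed in one time step at a time. Is there a simpler way to feed the data? TensorFlow provides several wrapper functions for this:
[Figure: list of the RNN wrapper functions]
With them, we only need to pass the assembled graph and the prepared data to the function to complete the feeding. The example below uses tf.nn.dynamic_rnn.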

#coding=utf-8
import tensorflow as tf
import numpy as np
# Create the input data
X = np.random.randn(2, 10, 8)

# The second example has length 6
X[1,6:] = 0
X_lengths = [10, 6]

cell = tf.contrib.rnn.BasicLSTMCell(num_units=64, state_is_tuple=True)

outputs, last_states = tf.nn.dynamic_rnn(
    cell=cell,
    dtype=tf.float64,
    sequence_length=X_lengths,
    inputs=X)

result = tf.contrib.learn.run_n(
    {"outputs": outputs, "last_states": last_states},
    n=1,
    feed_dict=None)

print(result[0])

assert result[0]["outputs"].shape == (2, 10, 64)

# In the second example, the outputs past step 6 (steps 7-10) should be 0
assert (result[0]["outputs"][1,7,:] == np.zeros(cell.output_size)).all()
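
What sequence_length does can be sketched in plain Python: for steps past an example's length, the output is zero and the state is carried through unchanged. The helper run_with_length and the toy cell running_sum_cell are hypothetical and only illustrate this masking rule for a single example.

```python
def run_with_length(step_fn, inputs, length, initial_state, output_size):
    """Unroll step_fn over inputs, but stop updating after `length` steps."""
    outputs, state = [], initial_state
    for t, x in enumerate(inputs):
        if t < length:
            out, state = step_fn(x, state)
        else:
            out = [0.0] * output_size  # padded step: zero output, state unchanged
        outputs.append(out)
    return outputs, state


def running_sum_cell(x, state):
    """Toy cell: the state is a running sum and the output repeats it."""
    new_state = state + x
    return [new_state], new_state


outputs, last_state = run_with_length(
    running_sum_cell, [1.0, 2.0, 3.0, 4.0],
    length=2, initial_state=0.0, output_size=1)

print(outputs)     # [[1.0], [3.0], [0.0], [0.0]]
print(last_state)  # 3.0
```

The returned state is therefore the state at the last valid step of the example, not the state after the padding.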