Understanding graphs (ops) and sessions in TensorFlow

Latest recommended article published on 2024-02-08 20:12:43

weixin_39960700


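This article explains how TensorFlow works: the concept of the computation graph, how a session executes computations, and how variables are used. The computation graph is the core of TensorFlow and describes the flow of a computation; the session executes that computation and allocates resources. The article also covers placeholders, variable initialization, the computation path, and a linear regression model, and stresses that variables behave differently in training and in prediction.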

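Many Python libraries are extensions of Python. When you import one, you get a set of variables, functions, and classes to use in your code, and once the code is written you can largely predict how it will execute. With TF these assumptions do not hold, and they get in the way when you try to understand what TensorFlow is and how it interacts with your code.

TF is a framework for describing abstract computations (computation graphs). When driving TF from Python, there are two things to do:

Use Python to assemble a computation graph
Use tf.Session to interact with the computation graph

1. Computation graph

A computation graph is essentially a global data structure: a directed graph that describes how to carry out a computation.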

import tensorflow as tf
two_node = tf.constant(2, name='two_node')
print(two_node)
[n.name for n in tf.get_default_graph().as_graph_def().node]

# Output
Tensor("two_node:0", shape=(), dtype=int32)
['two_node']

When the code above runs, the Python call tf.constant() adds a node to the graph and returns a Tensor object that refers to it.

two_node = tf.constant(2, name='two_node')
another_two_node = tf.constant(2, name='two_node')
two_node = tf.constant(2, name='two_node')
tf.constant(3, name='two_node')
[n.name for n in tf.get_default_graph().as_graph_def().node]

# Output
['two_node', 'two_node_1', 'two_node_2', 'two_node_3', 'two_node_4']

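The code above adds four more nodes to the graph, even though two of the tf.constant() calls assign their return value to the same variable two_node. Every call to tf.constant() creates a node, and each node is uniquely identified by its name (x.name), regardless of whether the return value is assigned to different variables, to the same variable, or to no variable at all.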

copy_two_node = two_node
copy_another_two_node = another_two_node
[n.name for n in tf.get_default_graph().as_graph_def().node]

# Output
['two_node', 'two_node_1', 'two_node_2', 'two_node_3', 'two_node_4']

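The code above only manipulates the return values of tf.constant() (think of them as pointers to nodes). Copying a node pointer into another variable does not change the computation graph.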
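The naming behaviour described above can be imitated in a few lines of plain Python. This is only a toy sketch, not TensorFlow's implementation: ToyGraph and add_node are hypothetical names invented for illustration, and the sketch starts from an empty graph, so it lists four names where the example above lists five.

```python
# Toy model of graph naming; NOT TensorFlow's real implementation.
class ToyGraph:
    def __init__(self):
        self.node_names = []    # every node ever created, in creation order
        self._name_counts = {}  # how many times each base name was requested

    def add_node(self, name):
        """Register a node and return its unique name."""
        count = self._name_counts.get(name, 0)
        self._name_counts[name] = count + 1
        unique = name if count == 0 else f"{name}_{count}"
        self.node_names.append(unique)
        return unique


graph = ToyGraph()
two_node = graph.add_node("two_node")
another_two_node = graph.add_node("two_node")
two_node = graph.add_node("two_node")  # rebinding the Python name still adds a node
graph.add_node("two_node")             # so does discarding the return value

copy_two_node = two_node               # plain Python assignment: no new node
print(graph.node_names)
# ['two_node', 'two_node_1', 'two_node_2', 'two_node_3']
```

The point of the sketch is that nodes are created by calls into the graph, never by Python assignment.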

# requires a runtime reset
import tensorflow as tf
two_node = tf.constant(2)
three_node = tf.constant(3)
sum_node = two_node + three_node
print(two_node)
print(three_node)
print(sum_node)
[n.name for n in tf.get_default_graph().as_graph_def().node]

# Output
Tensor("Const:0", shape=(), dtype=int32)
Tensor("Const_1:0", shape=(), dtype=int32)
Tensor("add:0", shape=(), dtype=int32)
['Const', 'Const_1', 'add']

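The code above adds three nodes to the TF computation graph: two constant nodes and one arithmetic (add) node.

The "+" here looks like the ordinary Python operator, but TF overloads "+" for tensors. Also note that this only builds the graph; nothing has been computed yet.

2. Session

A session allocates memory and applies optimizations so that the operations defined in the graph can be executed. Think of the graph as a "template for a computation" that lists the steps; the session executes the graph and produces actual results. To compute anything with TF, we need both a graph and a session.

Example: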

# reset all runtimes
import tensorflow as tf
two_node = tf.constant(2)
three_node = tf.constant(3)
sum_node = two_node + three_node
sess = tf.Session()
print(sess.run(sum_node))

# Output
5

Example:

# reset all runtimes
import tensorflow as tf
two_node = tf.constant(1)
three_node = tf.constant(100)
print(two_node.name)
sum_node = two_node + three_node

two_node = tf.constant(2)
print(two_node.name)
three_node = tf.constant(3)
sess = tf.Session()
print(sess.run(sum_node))

[n.name for n in tf.get_default_graph().as_graph_def().node]

# Output
Const:0
Const_2:0
101
['Const', 'Const_1', 'add', 'Const_2', 'Const_3']

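This example is interesting. sum_node = two_node + three_node is wired to the tf.constant(1) and tf.constant(100) nodes, not to tf.constant(2) and tf.constant(3): rebinding the Python names afterwards does not change the graph.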
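To see why rebinding the names has no effect, it may help to look at a toy model of deferred execution. The sketch below is plain Python, not TensorFlow: ToyTensor, constant and evaluate are hypothetical names, with evaluate standing in for sess.run().

```python
# Toy model of deferred execution; NOT TensorFlow's real implementation.
class ToyTensor:
    """A symbolic handle: it records how to compute a value but holds no result."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op          # "const" or "add"
        self.inputs = inputs  # the ToyTensor objects this node depends on
        self.value = value    # only meaningful for "const" nodes

    def __add__(self, other):
        # Overloaded "+": builds a new "add" node, performs no arithmetic.
        return ToyTensor("add", (self, other))


def constant(value):
    return ToyTensor("const", value=value)


def evaluate(tensor):
    """Plays the role of sess.run(): walks the recorded graph and computes."""
    if tensor.op == "const":
        return tensor.value
    if tensor.op == "add":
        left, right = tensor.inputs
        return evaluate(left) + evaluate(right)
    raise ValueError(f"unknown op: {tensor.op}")


two_node = constant(1)
three_node = constant(100)
sum_node = two_node + three_node  # captures the two node objects, not the names

two_node = constant(2)            # rebinding the names creates new nodes...
three_node = constant(3)          # ...but sum_node still points at the old ones

print(evaluate(sum_node))         # 101
```

The add node stores references to the objects that existed when "+" ran, which is why the later assignments cannot reach it.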

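3. Placeholders and feed_dict

The computations so far are not very interesting, because there is no way to pass in a value from outside. A more typical TF program takes external values as inputs to the graph.

A placeholder is the most direct way to receive outside data. tf.placeholder(dtype, shape) creates a data-receiving node in the graph. A graph with such input nodes receives its data through sess.run(op_node, feed_dict={placeholder_node: value}), which then executes the operation defined by op_node.

Example: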

# reset all runtimes
import tensorflow as tf
two_node = tf.placeholder(tf.float32)
three_node = tf.constant(2.0)
sum_node = two_node + three_node

sess = tf.Session()
print(sess.run(sum_node, feed_dict={two_node:1.5}))

# Output
3.5

Example:

# reset all runtimes
import tensorflow as tf
two_node = tf.placeholder(tf.float32)
three_node = tf.placeholder(tf.float32)
sum_node = two_node + three_node

sess = tf.Session()
print(sess.run(sum_node, feed_dict={two_node:1.5,three_node:6.6}))

# Output
8.1

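4. Computation path

When sess.run() is asked to evaluate a node, TF works out everything that node depends on, directly or indirectly, and computes only those nodes. The run succeeds only if every one of those dependencies can be evaluated. This set of dependencies is the computation path.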
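The idea of a computation path can be sketched with a small evaluator in plain Python. This is a toy model under simplifying assumptions, not TensorFlow's scheduler: the dictionary layout and the run helper are hypothetical, and only placeholder, const and add nodes are supported.

```python
# Toy model of a computation path; NOT TensorFlow's real implementation.
def run(graph, target, feed_dict=None):
    """Evaluate `target`, visiting only the nodes it depends on."""
    feed_dict = feed_dict or {}
    visited = []  # records the computation path in visiting order

    def visit(name):
        visited.append(name)
        kind, payload = graph[name]
        if kind == "placeholder":
            if name not in feed_dict:
                raise KeyError(f"placeholder {name!r} must be fed")
            return feed_dict[name]
        if kind == "const":
            return payload
        if kind == "add":
            return sum(visit(dep) for dep in payload)
        raise ValueError(f"unknown node kind: {kind}")

    return visit(target), visited


graph = {
    "x": ("placeholder", None),
    "y": ("placeholder", None),
    "two": ("const", 2.0),
    "x_plus_two": ("add", ("x", "two")),
    "x_plus_y": ("add", ("x", "y")),
}

# "y" is not on the path of "x_plus_two", so it does not need to be fed.
result, path = run(graph, "x_plus_two", {"x": 1.5})
print(result, path)  # 3.5 ['x_plus_two', 'x', 'two']
```

Asking for "x_plus_y" with the same feed_dict would raise the KeyError, because "y" then lies on the path and has no value.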

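5. Variables

Understanding variables is essential for deep learning with TF. Variables behave differently depending on the situation: during training we want every batch to update the parameters, while during prediction we want the parameters to stay fixed.

Variables in TF (tf.get_variable() or tf.Variable()), like tf.constant() and tf.placeholder(), are nodes with no dependencies. For tf.get_variable(), name is the only required argument; shape must also be given unless the initializer already determines it, and the remaining arguments are optional.

# Create a [3, 8] matrix
m = tf.get_variable('m', shape=[3, 8])
# Create a scalar; the name must differ from 'm', because calling
# tf.get_variable() twice with the same name raises a ValueError unless reuse is enabled
s = tf.get_variable('s', shape=[])

Example: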

import tensorflow as tf
v = tf.get_variable('v', shape=[])
sess = tf.Session()
sess.run(v)

# Output
FailedPreconditionError: Attempting to use uninitialized value v
	 [[{{node _retval_v_0_0}}]]

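The code above fails because a variable created with tf.get_variable() has no value yet: it is uninitialized. Any attempt to read an uninitialized variable raises the error: Attempting to use uninitialized value v.

There are two ways to initialize a variable:

tf.assign(dst_node, src_node)
tf.global_variables_initializer(), used together with tf.get_variable(initializer=...).

Example: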

# reset all runtimes
# using tf.assign(dst_node, src_node)
import tensorflow as tf
v         = tf.get_variable('v', shape=[])
zero_node = tf.constant(2.0)
init_v    = tf.assign(v, zero_node)
sess      = tf.Session()
sess.run(init_v)

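# Output
2.0

Note that although node v and the op node init_v are connected in the graph, init_v does not depend on the value of v. tf.assign only needs a reference to v's storage plus the value to write; it never reads v's current value, which is why running init_v on an uninitialized v does not raise the error above.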

# reset all runtimes
# using tf.global_variables_initializer()
import tensorflow as tf
# first define an initializer
const_init_node = tf.constant_initializer( 100.100)
c_v  = tf.get_variable('cc', [], initializer=const_init_node)
init = tf.global_variables_initializer()
sess = tf.Session()
# run the initializer first, then print the variable's value
sess.run(init)
print(sess.run(c_v))

# Output
100.1

tf.global_variables_initializer() returns a single op that, when run, executes the initializer of every global variable in the graph.
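The difference between reading a variable, assigning to it, and running all initializers can be mimicked with a toy session that owns the variable values. This is a plain-Python sketch, not TensorFlow's implementation: ToySession, UninitializedError and their methods are hypothetical names chosen for illustration.

```python
# Toy model of variable state; NOT TensorFlow's real implementation.
class UninitializedError(Exception):
    pass


class ToySession:
    def __init__(self):
        self._values = {}  # variable name -> current value

    def read(self, name):
        if name not in self._values:
            raise UninitializedError(
                f"Attempting to use uninitialized value {name}")
        return self._values[name]

    def assign(self, name, value):
        # Writes the slot without reading it first, so it is safe to call
        # on a variable that has never been initialized.
        self._values[name] = value
        return value

    def run_initializers(self, initializers):
        # Rough analogue of running tf.global_variables_initializer().
        for name, make_value in initializers.items():
            self._values[name] = make_value()


sess = ToySession()
try:
    sess.read("v")
except UninitializedError as err:
    print(err)

print(sess.assign("v", 2.0))                   # way 1: explicit assignment
sess.run_initializers({"cc": lambda: 100.1})   # way 2: run every initializer
print(sess.read("cc"))
```

Because assign never calls read, it works on a fresh variable, which mirrors why an assign op can serve as an initializer.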

6. Variable sharing

Variable sharing is not recommended.

7. Wrapping up with a linear regression example

import tensorflow as tf
import random

# set up variables
true_w = 3.
true_b = 1.
with tf.name_scope('r') as scope:
  w = tf.get_variable('weights', [], initializer=tf.constant_initializer(0.))
  b = tf.get_variable('biases', [], initializer=tf.constant_initializer(0.))
  # set up input and output placeholders
  input_  = tf.placeholder(tf.float32)
  output_ = tf.placeholder(tf.float32)
init_v = tf.global_variables_initializer()

# define the computation ops
guess_o = w * input_ + b 
loss = tf.square(output_ - guess_o)
optimizer = tf.train.GradientDescentOptimizer(1e-3)
train_op = optimizer.minimize(loss)

# run the graph
# initialize the variables first
sess = tf.Session()
sess.run(init_v)

for i in range(100000):
  input_d = random.random()
  output_d = true_w * input_d + true_b
  # Debug
  #sess.run(tf.print([w, b]))
  loss_, _ = sess.run([loss, train_op], feed_dict={input_:input_d,output_:output_d})
  #print(input_d, output_d)
  #print(loss_)
print(sess.run([w, b]))


# Output
[2.999884, 1.0000602]
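For readers who want to see what each train_op step amounts to, the training loop above can be written out by hand in plain Python. This is a sketch under stated assumptions rather than a description of TF internals: the gradients are derived manually from loss = (output - (w * input + b)) ** 2, the update is ordinary gradient descent with the same learning rate, and the fixed random seed is an addition for repeatability.

```python
import random

random.seed(0)  # fixed seed for repeatability (not in the original example)

true_w, true_b = 3.0, 1.0
w, b = 0.0, 0.0           # same starting point as the constant initializers
learning_rate = 1e-3

for _ in range(100000):
    x = random.random()
    y = true_w * x + true_b
    error = y - (w * x + b)      # loss = error ** 2
    grad_w = -2.0 * error * x    # d(loss)/dw
    grad_b = -2.0 * error        # d(loss)/db
    w -= learning_rate * grad_w  # gradient descent update
    b -= learning_rate * grad_b

print(w, b)  # both should land close to true_w and true_b
```

In the TF version, optimizer.minimize(loss) builds the equivalent gradient and update nodes in the graph, and each sess.run of train_op executes one such step.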

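Debugging with tf.print([x,y,z,q])

tf.print([x,y,z,q]) creates a TF node. Nothing is printed until that node is run, e.g. sess.run(tf.print([...])).

Note that intermediate values do not exist before sess.run() is called on the training op, and they are discarded once it returns. They therefore cannot be printed before or after a training step; they can only be printed while the graph is executing.

Note that loss_ is not an intermediate value. It is the return value of sess.run().

References: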

https://jacobbuckman.com/2018-06-25-tensorflow-the-confusing-parts-1/