[Translation] Stanford CS 20SI: TensorFlow for Deep Learning Research, Lecture note 2: TensorFlow Ops

“CS 20SI: TensorFlow for Deep Learning Research” Prepared by Chip Huyen
Reviewed by Danijar Hafner

Lecture note 2: TensorFlow Ops
A personal translation; some parts are abridged, so it is best read alongside the original note.

1. Using TensorBoard

Example code:

import tensorflow as tf

a = tf.constant(2)
b = tf.constant(3)
x = tf.add(a, b)
with tf.Session() as sess:
    print(sess.run(x))

After building the graph and before training, run the following code to activate TensorBoard:

writer = tf.summary.FileWriter(logs_dir, sess.graph)

This stores your event files in logs_dir. Then run tensorboard --logdir=logs_dir in a terminal to open TensorBoard.

[Figure: the graph as rendered in TensorBoard]

Here a, b, and x are just the Python variable names used in the code. To name the nodes themselves in TensorBoard, pass name=... when creating the ops, for example:

a = tf.constant([2, 2], name="a")
b = tf.constant([3, 6], name="b")
x = tf.add(a, b, name="add")
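
Putting the pieces together, a minimal end-to-end sketch (the './graphs' log directory is an arbitrary choice):

import tensorflow as tf

a = tf.constant([2, 2], name="a")
b = tf.constant([3, 6], name="b")
x = tf.add(a, b, name="add")

with tf.Session() as sess:
    # write the graph definition to './graphs' for TensorBoard
    writer = tf.summary.FileWriter('./graphs', sess.graph)
    print(sess.run(x))  # >> [5 8]
    writer.close()  # flush and close the event file

Then run tensorboard --logdir=./graphs and open the URL it prints (http://localhost:6006 by default).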

2. Constant types

Examples:

tf.constant(value, dtype=None, shape=None, name='Const', verify_shape=False)

# constants filled with specific values:
tf.zeros(shape, dtype=tf.float32, name=None)
tf.zeros_like(input_tensor, dtype=None, name=None, optimize=True)
# create a tensor of the same shape and type (unless dtype is specified)
# as input_tensor, but with all elements set to zero
tf.ones(shape, dtype=tf.float32, name=None)
tf.ones_like(input_tensor, dtype=None, name=None, optimize=True)
tf.fill(dims, value, name=None)
# create a tensor filled with a scalar value

tf.linspace(start, stop, num, name=None)
# create a sequence of num evenly-spaced values beginning at start;
# if num > 1, consecutive values differ by (stop - start) / (num - 1),
# so that the last one is exactly stop
# start, stop, num must be scalars
# comparable to, but slightly different from, numpy.linspace, which is
# more flexible:
# numpy.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None)
tf.linspace(10.0, 13.0, 4, name="linspace")  ==>  [10.0 11.0 12.0 13.0]
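
For comparison, a quick sketch of the NumPy counterpart with the same arguments, including its extra endpoint option:

import numpy as np

np.linspace(10.0, 13.0, num=4)                  # ==> [10. 11. 12. 13.]
np.linspace(10.0, 13.0, num=4, endpoint=False)  # ==> [10. 10.75 11.5 12.25]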

tf.range(start, limit=None, delta=1, dtype=None, name='range')
# create a sequence of numbers that begins at start and extends by
# increments of delta up to but not including limit
# slightly different from range in Python

# 'start' is 3, 'limit' is 18, 'delta' is 3
tf.range(start, limit, delta)  ==>  [3, 6, 9, 12, 15]
# 'start' is 3, 'limit' is 1, 'delta' is -0.5
tf.range(start, limit, delta)  ==>  [3, 2.5, 2, 1.5]
# 'limit' is 5
tf.range(limit)  ==>  [0, 1, 2, 3, 4]

Unlike their NumPy counterparts, TensorFlow sequences are not iterable:

for _ in np.linspace(0, 10, 4):  # OK
for _ in tf.linspace(0.0, 10.0, 4):  # TypeError: 'Tensor' object is not iterable
for _ in range(4):  # OK
for _ in tf.range(4):  # TypeError: 'Tensor' object is not iterable
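
If you do need to loop over the values, one option is to evaluate the tensor first and iterate over the resulting NumPy array (a minimal sketch):

import tensorflow as tf

with tf.Session() as sess:
    # sess.run returns a NumPy array, which is iterable
    for value in sess.run(tf.linspace(0.0, 10.0, 4)):
        print(value)  # prints 0.0, 3.33..., 6.67..., 10.0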

Generating random values from specific distributions:

tf.random_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)

tf.truncated_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)

tf.random_uniform(shape, minval=0, maxval=None, dtype=tf.float32, seed=None, name=None)

tf.random_shuffle(value, seed=None, name=None)

tf.random_crop(value, size, seed=None, name=None)

tf.multinomial(logits, num_samples, seed=None, name=None) 

tf.random_gamma(shape, alpha, beta=None, dtype=tf.float32, seed=None, name=None)
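
These ops all take a seed parameter; combined with the graph-level tf.set_random_seed, this makes runs reproducible. A minimal sketch (the shapes and seed values here are arbitrary):

import tensorflow as tf

tf.set_random_seed(42)  # graph-level seed, for reproducibility across runs

norm = tf.random_normal([2, 2], mean=0.0, stddev=1.0, seed=7)  # op-level seed
unif = tf.random_uniform([3], minval=0, maxval=10)

with tf.Session() as sess:
    print(sess.run(norm))
    print(sess.run(unif))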

3. Math operations

See the official TensorFlow API documentation for more.

a = tf.constant([3, 6])
b = tf.constant([2, 2])
tf.add(a, b)  # >> [5 8]
tf.add_n([a, b, b])  # >> [7 10]. Equivalent to a + b + b
tf.multiply(a, b)  # >> [6 12] because multiply is element-wise
tf.matmul(a, b)  # >> ValueError
tf.matmul(tf.reshape(a, shape=[1, 2]), tf.reshape(b, shape=[2, 1]))  # >> [[18]]
tf.div(a, b)  # >> [1 3]
tf.mod(a, b)  # >> [1 0]

A table of TensorFlow operation categories, from the authors of Fundamentals of Deep Learning:

Category                           Examples
Element-wise math operations       add, sub, div, ...
Array operations                   concat, slice, split, ...
Matrix operations                  matMul, matrixInverse, matrixDeterminant, ...
Stateful operations                Variable, assign, assignAdd, ...
Neural network building blocks     softMax, sigmoid, ...
Checkpointing operations           save, restore
Queue and synchronization ops      enqueue, dequeue, mutexAcquire, mutexRelease, ...
Control flow operations            merge, switch, enter, leave, nextIteration

4. Data types

t_0 = 19  # treated as a 0-d tensor, or "scalar"
tf.zeros_like(t_0)  # ==> 0
tf.ones_like(t_0)   # ==> 1

t_1 = [b"apple", b"peach", b"grape"]  # treated as a 1-d tensor, or "vector"
tf.zeros_like(t_1)  # ==> ['' '' '']
tf.ones_like(t_1)   # ==> TypeError: Expected string, got 1 of type 'int' instead.

t_2 = [[True, False, False],
       [False, False, True],
       [False, True, False]]  # treated as a 2-d tensor, or "matrix"
tf.zeros_like(t_2)  # ==> 3x3 tensor, all elements are False
tf.ones_like(t_2)   # ==> 3x3 tensor, all elements are True

TensorFlow's own data types:
[Figure: table of TensorFlow data types]

See the official documentation.

In fact, TensorFlow's data types are based on NumPy's; for example, np.int32 == tf.int32 returns True. You can even pass NumPy dtypes directly:
tf.ones([2, 2], np.float32) ==> [[1.0 1.0], [1.0 1.0]]
tf.Session.run(fetches) takes tensors and returns NumPy arrays; most of the time, tensors and np.array can be used interchangeably (see the sketch after the list below).

  • There is one exception: strings. Numeric and boolean types map cleanly between TF and NumPy, but tf.string has no exact NumPy equivalent because of how NumPy handles arrays. You can still feed NumPy arrays of strings into tf.string tensors, as long as you don't specify an explicit dtype on the NumPy side.
  • NumPy and TensorFlow are both n-dimensional array libraries. NumPy provides ndarray, but it has no functions for creating tensor ops, no automatic differentiation, and no GPU support, which is where TensorFlow wins out.
  • Declaring TF objects with native Python types is quick and easy, which is fine for toy code, but it has an important pitfall: Python types lack the ability to specify data types explicitly, while TF types are more specific. In Python all integers are one type, but TF has 8-bit, 16-bit, and other integer widths, so if you use Python types, TF has to infer the concrete dtype of your data.

In short, use TensorFlow dtypes whenever possible.
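
A small sketch of the interchangeability mentioned above (the shape here is arbitrary):

import numpy as np
import tensorflow as tf

ones = tf.ones([2, 2], np.float32)  # a NumPy dtype where a TF dtype is expected
with tf.Session() as sess:
    out = sess.run(ones)  # fetching a tensor returns a NumPy array
    print(type(out))      # numpy.ndarray
    print(out)            # [[1. 1.] [1. 1.]]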

5. Variables

  • A constant is constant; a variable can be assigned to and changed.
  • A constant's value is stored in the graph definition, while a variable is stored separately and may live on a parameter server. This makes constants memory-expensive: their values are loaded along with the graph definition every time the graph is loaded. To see this, simply print the graph definition:
import tensorflow as tf

my_const = tf.constant([1.0, 2.0], name="my_const")
print(tf.get_default_graph().as_graph_def())

Output:

node {
  name: "my_const"
  op: "Const"
  attr {
    key: "dtype"
    value {
      type: DT_FLOAT
    }
  }
  attr {
    key: "value"
    value {
      tensor {
        dtype: DT_FLOAT
        tensor_shape {
          dim {
            size: 2
          }
        }
        tensor_content: "\000\000\200?\000\000\000@"
      }
    }
  }
}
versions {
  producer: 17
}

tf.constant is an op (a node in the graph); tf.Variable is a class.

# create variable c as a 2x2 matrix
c = tf.Variable([[0, 1], [2, 3]], name="matrix")
# create variable W as a 784 x 10 tensor, filled with zeros
W = tf.Variable(tf.zeros([784, 10]))

A tf.Variable holds several ops:

x = tf.Variable(...)

x.initializer  # init op
x.value()      # read op
x.assign(...)  # write op
x.assign_add(...)
# and more
  • Variables must be initialized before use.

The easiest way is to initialize all variables at once:

init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)

Running the initializer does not feed any values to the variables. To initialize only a subset of variables, use tf.variables_initializer(), which takes a list of the variables to initialize:

init_ab = tf.variables_initializer([a, b], name="init_ab")
with tf.Session() as sess:
    sess.run(init_ab)

You can also initialize each variable separately with tf.Variable.initializer:

# create variable W as a 784 x 10 tensor, filled with zeros
W = tf.Variable(tf.zeros([784, 10]))
with tf.Session() as sess:
    sess.run(W.initializer)

Printing a variable:

# W is a random 700 x 10 variable object
W = tf.Variable(tf.truncated_normal([700, 10]))
with tf.Session() as sess:
    sess.run(W.initializer)
    print(W)
>> Tensor("Variable/read:0", shape=(700, 10), dtype=float32)

To see its value, use the variable's eval() method:

# W is a random 700 x 10 variable object
W = tf.Variable(tf.truncated_normal([700, 10]))
with tf.Session() as sess:
    sess.run(W.initializer)
    print(W.eval())
>> [[-0.76781619 -0.67020458  1.15333688 ..., -0.98434633 -1.25692499 -0.90904623]
   [-0.36763489 -0.65037876 -1.52936983 ...,  0.19320194 -0.38379928  0.44387451]
   [ 0.12510735 -0.82649058  0.4321366  ..., -0.3816964   0.70466036  1.33211911]
   ...,
   [ 0.9203397  -0.99590844  0.76853162 ..., -0.74290705  0.37568584  0.64072722]
   [-0.12753558  0.52571583  1.03265858 ...,  0.59978199 -0.91293705 -0.02646019]
   [ 0.19076447 -0.62968266 -1.97970271 ..., -1.48389161  0.68170643  1.46369624]]

Assigning a value to a variable:

W = tf.Variable(10)
W.assign(100)
with tf.Session() as sess:
    sess.run(W.initializer)
    print(W.eval())  # >> 10

Why 10 and not 100? W.assign(100) does not assign 100 to W; it creates an assign op, which changes W only when it is run:

W = tf.Variable(10)
assign_op = W.assign(100)
with tf.Session() as sess:
    sess.run(assign_op)
    print(W.eval())  # >> 100

Note that W is not initialized here; the assign op does it for us. In fact, the initializer op is itself an assign op that assigns the variable's initial value to the variable:

# in the source code
self._initializer_op = state_ops.assign(self._variable,
                                        self._initial_value,
                                        validate_shape=validate_shape).op

An interesting example:

# create a variable whose original value is 2
a = tf.Variable(2, name="scalar")
# assign a * 2 to a and call that op a_times_two
a_times_two = a.assign(a * 2)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    # have to initialize a, because a_times_two depends on the value of a
    sess.run(a_times_two)  # >> 4
    sess.run(a_times_two)  # >> 8
    sess.run(a_times_two)  # >> 16

There are also assign_add() and assign_sub() methods. Unlike assign(), they cannot initialize the variable for you, because they depend on its initial value:

W = tf.Variable(10)
with tf.Session() as sess:
    sess.run(W.initializer)
    print(sess.run(W.assign_add(10)))  # >> 20
    print(sess.run(W.assign_sub(2)))   # >> 18

Because variable values are stored separately from the graph, each session keeps its own copy of a variable's value:

W = tf.Variable(10)
sess1 = tf.Session()
sess2 = tf.Session()
sess1.run(W.initializer)
sess2.run(W.initializer)
print(sess1.run(W.assign_add(10)))   # >> 20
print(sess2.run(W.assign_sub(2)))    # >> 8
print(sess1.run(W.assign_add(100)))  # >> 120
print(sess2.run(W.assign_sub(50)))   # >> -42
sess1.close()
sess2.close()

You can also declare a variable whose initial value depends on another variable:

# W is a random 700 x 10 tensor
W = tf.Variable(tf.truncated_normal([700, 10]))
U = tf.Variable(W * 2)

In this case, use initialized_value() to make sure W is initialized before its value is used to initialize U:
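
# ensure that W is initialized before its value is used to set U
U = tf.Variable(W.initialized_value() * 2)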

6. InteractiveSession

The difference between InteractiveSession and Session is that an InteractiveSession installs itself as the default session, so methods like run() and eval() can be called without specifying a session explicitly. This is convenient in interactive shells and IPython notebooks, but confusing when you need multiple sessions.

sess = tf.InteractiveSession()
a = tf.constant(5.0)
b = tf.constant(6.0)
c = a * b
# we can just use 'c.eval()' without passing 'sess'
print(c.eval())
sess.close()
# tf.InteractiveSession.close() closes an InteractiveSession

7. Control dependencies

Use tf.Graph.control_dependencies(control_inputs) to control the execution order of ops:

# your graph g has 5 ops: a, b, c, d, e
with g.control_dependencies([a, b, c]):
    # `d` and `e` will only run after `a`, `b`, and `c` have executed
    d = ...
    e = ...
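
As a concrete sketch on the default graph (tf.control_dependencies is the module-level convenience for the current graph; tf.identity here just stands in for d):

import tensorflow as tf

w = tf.Variable(1)
assign_op = w.assign(10)

# `out` will only be computed after `assign_op` has run
with tf.control_dependencies([assign_op]):
    out = tf.identity(w)

with tf.Session() as sess:
    sess.run(w.initializer)
    print(sess.run(out))  # >> 10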

8. Placeholders and feed_dict

Build the graph with placeholders, then feed in data at run time through feed_dict.

Defining a placeholder:

tf.placeholder(dtype, shape=None, name=None)

Avoid building graphs with shape=None: it is convenient, but it makes debugging exceptionally hard. Also try to give your ops names.
See the documentation for details.

# create a placeholder of type float 32-bit, shape is a vector of 3 elements
a = tf.placeholder(tf.float32, shape=[3])
# create a constant of type float 32-bit, shape is a vector of 3 elements
b = tf.constant([5, 5, 5], tf.float32)
# use the placeholder as you would a constant or a variable
c = a + b  # short for tf.add(a, b)

# if we try to fetch c without feeding a, we run into an error
with tf.Session() as sess:
    print(sess.run(c))
>> InvalidArgumentError

Because a is just a placeholder, it must be fed a value before c can be fetched:

with tf.Session() as sess:
    # feed [1, 2, 3] to placeholder a via the dict {a: [1, 2, 3]}
    # fetch value of c
    print(sess.run(c, {a: [1, 2, 3]}))
>> [6. 7. 8.]

To feed the placeholder multiple values, one at a time, use a loop:

with tf.Session() as sess:
    for a_value in list_of_a_values:
        print(sess.run(c, {a: a_value}))
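
For instance, with the a, b, c defined above and a hypothetical list_of_a_values:

list_of_a_values = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # hypothetical inputs

with tf.Session() as sess:
    for a_value in list_of_a_values:
        print(sess.run(c, {a: a_value}))
# >> [6. 7. 8.]
# >> [9. 10. 11.]
# >> [12. 13. 14.]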

You can also feed values to tensors that aren't placeholders. To check whether a tensor is feedable:

tf.Graph.is_feedable(tensor)

# create operations, tensors, etc. (using the default graph)
a = tf.add(2, 5)
b = tf.multiply(a, 3)
# start up a `Session` using the default graph
sess = tf.Session()
# define a dictionary that says to replace the value of `a` with 15
replace_dict = {a: 15}
# run the session, passing in `replace_dict` as the value to `feed_dict`
sess.run(b, feed_dict=replace_dict)  # returns 45

9. The trap of lazy loading

Lazy loading means deferring the creation of an op until you need to compute it.
First, normal loading, where the z node is built up front:

x = tf.Variable(10, name='x')
y = tf.Variable(20, name='y')
z = tf.add(x, y)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    writer = tf.summary.FileWriter('./graphs', sess.graph)  # log dir is arbitrary
    for _ in range(10):
        sess.run(z)
    writer.close()

And the more "flexible" lazy-loading version:

x = tf.Variable(10, name='x')
y = tf.Variable(20, name='y')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    writer = tf.summary.FileWriter('./graphs', sess.graph)
    for _ in range(10):
        # create the add op only when you need to compute it
        sess.run(tf.add(x, y))
    writer.close()

The TensorBoard graph with normal loading:
[Figure: graph with nodes x, y, and a single Add node]

The TensorBoard graph with lazy loading:
[Figure: graph with only x and y; no Add node]

The Add node has disappeared, because we created it only at run time, after the graph had been written to FileWriter. This makes the graph harder to read, but it isn't a bug.

Print the graph definition:

print(tf.get_default_graph().as_graph_def())

The protobuf of the normally loaded graph has just one Add node:

node {
  name: "Add"
  op: "Add"
  input: "x/read"
  input: "y/read"
  attr {
    key: "T"
    value {
      type: DT_INT32
    }
  }
}

The lazily loaded graph, however, has ten Add nodes: a new one is created every time z is computed:

node {
  name: "Add"
  op: "Add"
  ...
}
...
node {
  name: "Add_9"
  op: "Add"
  ...
}

When training a neural network you might iterate thousands of times, creating thousands of nodes and leaving your graph bloated and slow to load. There are two ways to avoid this. The first is simply to avoid lazy loading by separating the definition of ops from their execution. When you group related ops into methods, that separation isn't always possible, so the second way is to use Python's property mechanism to ensure each function creates its ops only the first time it is called; see Danijar Hafner's blog post and the sketch below.
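
A sketch of that idea, adapted from Hafner's post (the names lazy_property and Model are illustrative, not TensorFlow API):

import functools
import tensorflow as tf

def lazy_property(function):
    # cache the wrapped method's result so its ops are created only once
    attribute = '_cache_' + function.__name__

    @property
    @functools.wraps(function)
    def decorator(self):
        if not hasattr(self, attribute):
            setattr(self, attribute, function(self))
        return getattr(self, attribute)

    return decorator

class Model(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
        self.z  # touch the property once so the add op is built here, up front

    @lazy_property
    def z(self):
        return tf.add(self.x, self.y)

# the add op is created exactly once, however many times z is evaluated
model = Model(tf.Variable(10, name='x'), tf.Variable(20, name='y'))
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(10):
        sess.run(model.z)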