Environment
- TensorFlow
- Ubuntu 18.04
- Python 3
```python
# Set up TensorFlow
import tensorflow as tf
```
1 Initializer Functions
[Parameter List]
No. | Function | Parameters | Description |
---|---|---|---|
1 | tf.constant_initializer | value: a scalar, list, tuple, or n-dimensional NumPy array | Initializes the variable to the given value |
2 | tf.random_normal_initializer | mean, stddev (standard deviation) | Initializes the variable with random values drawn from a normal distribution |
3 | tf.truncated_normal_initializer | mean, stddev (standard deviation) | Initializes the variable with random values from a normal distribution; any value more than 2 standard deviations from the mean is redrawn |
4 | tf.random_uniform_initializer | minval, maxval | Initializes the variable with random values from a uniform distribution |
5 | tf.uniform_unit_scaling_initializer | factor: coefficient multiplied into the random values | Initializes with uniformly distributed random values scaled so the output's order of magnitude is preserved |
6 | tf.zeros_initializer | shape: variable dimensions | Sets the variable to all zeros |
7 | tf.ones_initializer | shape: variable dimensions | Sets the variable to all ones |
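As a rough illustration, the constant and truncated-normal initializers above can be sketched in NumPy (the helper names are hypothetical; TensorFlow does this work internally):

```python
import numpy as np

def constant_init(shape, value):
    # Mirrors tf.constant_initializer: fill every element with value
    return np.full(shape, value, dtype=np.float32)

def truncated_normal_init(shape, mean=0.0, stddev=1.0, seed=0):
    # Mirrors tf.truncated_normal_initializer: redraw any sample more
    # than 2 standard deviations away from the mean
    rng = np.random.default_rng(seed)
    samples = rng.normal(mean, stddev, size=shape)
    mask = np.abs(samples - mean) > 2 * stddev
    while mask.any():
        samples[mask] = rng.normal(mean, stddev, size=mask.sum())
        mask = np.abs(samples - mean) > 2 * stddev
    return samples.astype(np.float32)

print(constant_init([2, 2], 1.0))
print(truncated_normal_init([3, 3], stddev=0.1))
```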
2 Defining and Getting Variables
2.1 get_variable
```python
with tf.variable_scope("Input_3"):
    v1 = tf.get_variable("v1", [1], initializer=tf.constant_initializer(1))
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    print("v1 name is: {}".format(v1.name))
    print("v1 value is: {}".format(sess.run(v1)))
```
Results:
- get_variable requires a variable name ("v1")
- The shape of the data can be specified
- The initial value is set via initializer
```
v1 name is: Input_3/v1:0
v1 value is: [1.]
```
2.2 Variable
```python
with tf.variable_scope("Input_3"):
    v2 = tf.Variable([1], name='v2')
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    print("v2 name is: {}".format(v2.name))
    print("v2 value is: {}".format(sess.run(v2)))
```
Results:
- The variable name is optional and set via name
- The initial value and shape are given directly
```
v2 name is: Input_3/v2:0
v2 value is: [1]
```
3 Variable Management
3.1 variable_scope("name")
```python
with tf.Session() as sess:
    with tf.variable_scope("Input_1"):
        v1 = tf.constant([1.0], shape=[1], name='v1')
        print("V1 name is: {}".format(v1.name))
        print("V1 value is: {}".format(sess.run(v1)))
```
Results:
- Variable v1 lives in the Input_1 name scope
```
V1 name is: Input_1/v1:0
V1 value is: [1.]
```
3.2 name_scope("name")
```python
import tensorflow as tf
with tf.Session() as sess:
    with tf.variable_scope("Input_1"):
        v1 = tf.constant([1.0], shape=[1], name='v1')
        print("V1 name is: {}".format(v1.name))
        print("V1 value is: {}".format(sess.run(v1)))
    with tf.name_scope("Input_2"):
        v1 = tf.constant([1.0], shape=[1], name='v1')
        print("v1 name is: {}".format(v1.name))
        print("v1 value is: {}".format(sess.run(v1)))
```
Results:
- The second v1 lives in the Input_2 name scope
- This shows that name scopes organize and isolate variables
```
V1 name is: Input_1/v1:0
V1 value is: [1.]
v1 name is: Input_2/v1:0
v1 value is: [1.]
```
4 Regularization
Purpose: prevents overfitting.
4.1 contrib.layers.l1_regularizer(lambda)(w)
[Parameter List]
No. | Parameter | Description |
---|---|---|
1 | lambda | coefficient |
2 | w | weights |
- Formula:
$$L1=\lambda*\sum_{i}|w_i|$$
4.2 contrib.layers.l2_regularizer(lambda)(w)
[Parameter List]
No. | Parameter | Description |
---|---|---|
1 | lambda | coefficient |
2 | w | weights |
- Formula:
$$L2=\lambda*(\sum_{i}|w_i|^2)/2$$
4.3 Usage
```python
w = tf.constant([[1.0, -1.0], [-2.0, 2.0]])
with tf.Session() as sess:
    L1 = tf.contrib.layers.l1_regularizer(0.2)(w)
    L2 = tf.contrib.layers.l2_regularizer(0.2)(w)
    print("L1 regularizer result: {}".format(sess.run(L1)))
    print("L2 regularizer result: {}".format(sess.run(L2)))
```
L1: (|1|+|-1|+|-2|+|2|)*0.2=1.2
```
L1 regularizer result: 1.2000000476837158
```
TensorFlow divides the L2 regularization loss by 2 so that its derivative comes out cleaner.
L2: [(|1|^2+|-1|^2+|-2|^2+|2|^2)/2]*0.2=1.0
```
L2 regularizer result: 1.0
```
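Both results can be checked directly with NumPy (a sketch of the same arithmetic, not the TensorFlow implementation):

```python
import numpy as np

w = np.array([[1.0, -1.0], [-2.0, 2.0]])
lam = 0.2
l1 = lam * np.sum(np.abs(w))   # 0.2 * (1 + 1 + 2 + 2) = 1.2
l2 = lam * np.sum(w ** 2) / 2  # 0.2 * (1 + 1 + 4 + 4) / 2 = 1.0
print(l1, l2)
```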
5 Convolution
```python
'''3x3 kernel; input image depth: 3; output depth after convolution: 16'''
weights = tf.get_variable('weights', [3, 3, 3, 16], initializer=tf.truncated_normal_initializer(stddev=0.1))
'''biases: 16'''
biases = tf.get_variable('biases', [16], initializer=tf.constant_initializer(0.1))
'''convolution with stride 1'''
conv = tf.nn.conv2d(input, weights, strides=[1, 1, 1, 1], padding="SAME")
'''add the biases to the result'''
bias = tf.nn.bias_add(conv, biases)
'''nonlinearity'''
relu = tf.nn.relu(bias)
```
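For intuition about the shapes, the bias-add and ReLU steps (not the convolution itself) can be mirrored in NumPy, assuming a hypothetical 28x28 input:

```python
import numpy as np

batch, height, width, out_depth = 1, 28, 28, 16
stride = 1
# With padding="SAME", the output spatial size is ceil(input_size / stride)
out_h = -(-height // stride)
out_w = -(-width // stride)

rng = np.random.default_rng(0)
conv = rng.normal(size=(batch, out_h, out_w, out_depth))
biases = np.full(out_depth, 0.1)  # broadcast over the last axis, like bias_add
relu = np.maximum(conv + biases, 0.0)  # ReLU clips negatives to zero
print(relu.shape)  # (1, 28, 28, 16)
```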
6 Moving Average Model
Purpose: improves robustness, i.e. the model's performance on test data.
6.1 Overview
When initializing an ExponentialMovingAverage, you provide a decay rate, which controls how quickly the model updates. ExponentialMovingAverage maintains a shadow variable for each variable; the shadow variable starts at the variable's initial value, and each time the update runs, the shadow variable becomes:
$$shadowVariable = decay * shadowVariable + (1 - decay) * variable$$
To speed up updates early in training, ExponentialMovingAverage provides a num_updates parameter that sets decay dynamically:
$$decay = \min\left\lbrace decay, \frac{1 + numUpdates}{10 + numUpdates}\right\rbrace$$
[Parameter List]
No. | Parameter | Description |
---|---|---|
1 | shadowVariable | the shadow variable |
2 | variable | the variable being tracked |
3 | decay | decay rate, usually a number close to 1, e.g. 0.999 or 0.9999 |
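The update rule and the dynamic decay can be reproduced in plain Python (a sketch with illustrative numbers; the function names are my own, not TensorFlow's):

```python
def dynamic_decay(decay, num_updates):
    # decay = min{decay, (1 + num_updates) / (10 + num_updates)}
    return min(decay, (1 + num_updates) / (10 + num_updates))

def ema_update(shadow, variable, decay):
    # shadow = decay * shadow + (1 - decay) * variable
    return decay * shadow + (1 - decay) * variable

# Early in training num_updates=0 keeps decay small, so the shadow tracks quickly:
d = dynamic_decay(0.99, 0)        # min(0.99, 1/10) = 0.1
shadow = ema_update(0.0, 5.0, d)  # 0.1*0 + 0.9*5 = 4.5
print(d, shadow)

# Later, num_updates=100 pushes decay close to its cap:
d = dynamic_decay(0.99, 100)      # min(0.99, 101/110) ≈ 0.9182
shadow = ema_update(shadow, 8.0, d)
print(round(shadow, 4))           # 4.7864
```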
6.2 train.ExponentialMovingAverage
```python
v1 = tf.Variable(0, dtype=tf.float32, name='v')
step = tf.Variable(0, trainable=False)
# Initial decay rate 0.99; step drives the dynamic decay
ema = tf.train.ExponentialMovingAverage(0.99, step)
# Op that updates the moving average
maintain_averages_op = ema.apply([v1])
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    # v1=0, so the shadow is initialized to 0
    v1_1, shadow_1 = sess.run([v1, ema.average(v1)])
    print("v1 value is: {}, shadow variable value is: {}".format(v1_1, shadow_1))
    # Set v1=5
    sess.run(tf.assign(v1, 5))
    # Update the shadow variable
    sess.run(maintain_averages_op)
    # v1_2=5, shadow_2=decay*shadow + (1-decay)*variable
    # shadow=0, step=0, decay=min{0.99, (1+0)/(10+0)}=0.1
    # shadow_2=0.1*0 + (1-0.1)*5=4.5
    v1_2, shadow_2 = sess.run([v1, ema.average(v1)])
    print("v1 value is: {}, shadow variable is: {}".format(v1_2, shadow_2))
    # Set numUpdates=100
    sess.run(tf.assign(step, 100))
    # Set v1=8
    sess.run(tf.assign(v1, 8))
    # Update the shadow variable (currently 4.5)
    sess.run(maintain_averages_op)
    # v1_3=8, step=100, decay=min{0.99, (1+100)/(10+100)}=0.9182
    # shadow_3=decay*shadow_2+(1-decay)*variable
    # shadow_3=0.9182*4.5 + (1-0.9182)*8=4.7864
    v1_3, step_val, shadow_3 = sess.run([v1, step, ema.average(v1)])
    print("v1 value is: {}, step value is: {}, shadow variable value is: {}".format(v1_3, step_val, shadow_3))
```
```
v1 value is: 0.0, shadow variable value is: 0.0
v1 value is: 5.0, shadow variable is: 4.5
v1 value is: 8.0, step value is: 100, shadow variable value is: 4.78636360168
```
7 Summary
- Variable name scopes: isolate variables and group them for visualization;
- Regularization: prevents overfitting;
- Moving average model: improves the trained model's predictive ability on test data;