Topics: activation functions, mean squared error (MSE), learning-rate scheduling, exponential moving averages, regularization, and modular programming.
- Activation functions: introduce non-linearity and increase the model's expressive power. sigmoid: use tf.nn.sigmoid() in TensorFlow; its output is always positive (in (0, 1)), which motivates the next activation. tanh: use tf.nn.tanh(); it is zero-centered, but its gradient vanishes when the input is large in magnitude, which motivates relu. relu: use tf.nn.relu().
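To illustrate why each activation motivates the next, here is a minimal pure-Python sketch (not the TensorFlow ops themselves): sigmoid's output is always positive, sigmoid and tanh both saturate for large-magnitude inputs, and relu does not saturate for positive inputs.

```python
import math

def sigmoid(z):
    # Output in (0, 1): always positive, saturates for large |z|.
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    # Output in (-1, 1): zero-centered, but still saturates for large |z|.
    return math.tanh(z)

def relu(z):
    # max(0, z): does not saturate for positive inputs.
    return max(0.0, z)

print(sigmoid(-5.0))  # small, but still positive
print(tanh(10.0))     # nearly 1, so the gradient there is nearly 0
print(relu(10.0))     # 10.0: the gradient stays 1 for positive inputs
```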
- Mean squared error (MSE): the average over n samples of the squared difference between the prediction y and the known answer y_. In TensorFlow: loss_mse = tf.reduce_mean(tf.square(y_ - y))
#0 Import modules and generate the dataset.
import tensorflow as tf
import numpy as np
BATCH_SIZE = 8
X = np.random.rand(32, 2)
Y_ = [[x1 + x2 + np.random.rand()/10. - 0.05] for (x1, x2) in X]
#1 Define the network's inputs, parameters, and output; define the forward pass.
x = tf.placeholder(tf.float32, shape=[None, 2])
y_ = tf.placeholder(tf.float32, shape=[None, 1])
w1 = tf.Variable(tf.random_normal([2, 1], stddev=1))
y = tf.matmul(x, w1)
#2 Define the loss function and the back-propagation method.
# Loss is MSE; back-propagation uses gradient descent.
loss_mse = tf.reduce_mean(tf.square(y - y_))
train_step = tf.train.GradientDescentOptimizer(0.001).minimize(loss_mse)
#3 Create a session and train for 20000 steps.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(20000):
        start = (i * BATCH_SIZE) % 32
        end = start + BATCH_SIZE
        loss_mse_dis, _ = sess.run([loss_mse, train_step],
                                   feed_dict={x: X[start:end], y_: Y_[start:end]})
        if i % 2000 == 0:
            print('After {}steps loss:{}'.format(i, loss_mse_dis))
    print('w1:', w1)            # prints the Variable object, not its value
    print('w1:', sess.run(w1))  # prints the trained weight values
out:
After 0steps loss:0.22664415836334229
After 2000steps loss:0.03838531672954559
After 4000steps loss:0.01794697903096676
After 6000steps loss:0.009312132373452187
After 8000steps loss:0.005021141842007637
After 10000steps loss:0.0028489811811596155
After 12000steps loss:0.001746544730849564
After 14000steps loss:0.0011868053115904331
After 16000steps loss:0.0009025642066262662
After 18000steps loss:0.0007582578109577298
w1: <tf.Variable 'Variable:0' shape=(2, 1) dtype=float32_ref>
w1: [[0.9663933]
[1.0228527]]
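The index arithmetic in the training loop above cycles through the 32-sample dataset in batches of 8, wrapping around with the modulo. A standalone sketch of just that slicing logic:

```python
BATCH_SIZE = 8
DATASET_SIZE = 32

batches = []
for i in range(6):
    # Modulo wraps start back to 0 after the last full batch.
    start = (i * BATCH_SIZE) % DATASET_SIZE
    end = start + BATCH_SIZE
    batches.append((start, end))

print(batches)  # [(0, 8), (8, 16), (16, 24), (24, 32), (0, 8), (8, 16)]
```

Note this wrap-around only yields full batches because 32 is a multiple of 8; with a dataset size that is not a multiple of BATCH_SIZE, the final slice of each pass would run past the data.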
- Learning rate (learning_rate): the step size of each parameter update. If the learning rate is too large, the parameters oscillate around the minimum and fail to converge; if it is too small, convergence is slow. A common strategy is an exponentially decaying learning rate, which is updated dynamically as training progresses:
learning_rate = LEARNING_RATE_BASE * LEARNING_RATE_DECAY ^ (global_step / LEARNING_RATE_STEP)
where LEARNING_RATE_STEP is how often the learning rate is updated (i.e., every how many batches).
global_step = tf.Variable(0, trainable=False)
learning_rate = tf.train.exponential_decay(
    LEARNING_RATE_BASE,
    global_step,
    LEARNING_RATE_STEP,
    LEARNING_RATE_DECAY,
    staircase=True/False)
Here LEARNING_RATE_BASE is the initial learning rate, LEARNING_RATE_DECAY is the decay rate, and global_step records the current training step (marked non-trainable). LEARNING_RATE_STEP is typically set to the total number of samples divided by the batch size. With staircase=True, global_step / LEARNING_RATE_STEP is truncated to an integer, so the learning rate decays in a staircase pattern; with staircase=False, the learning rate follows a smooth decay curve.
import tensorflow as tf
LEARNING_RATE_BASE = 0.1
LEARNING_RATE_DECAY = 0.99
LEARNING_RATE_STEP = 1
global_step = tf.Variable(0, trainable=False)
learning_rate = tf.train.exponential_decay(LEARNING_RATE_BASE,
                                           global_step,
                                           LEARNING_RATE_STEP,
                                           LEARNING_RATE_DECAY,
                                           staircase=True)
w = tf.Variable(tf.constant(5., tf.float32))
loss = tf.square(w + 1)
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(40):
        w_dis, learning_rate_dis, loss_dis, global_step_dis, _ = sess.run(
            [w, learning_rate, loss, global_step, train_step])
        print("After %s steps: global_step is %f, w is %f, learning rate is %f, loss is %f"
              % (i, global_step_dis, w_dis, learning_rate_dis, loss_dis))
out:
After 0 steps: global_step is 1.000000, w is 5.000000, learning rate is 0.100000, loss is 36.000000
After 1 steps: global_step is 2.000000, w is 3.800000, learning rate is 0.099000, loss is 23.040001
After 2 steps: global_step is 3.000000, w is 2.849600, learning rate is 0.098010, loss is 14.819419
After 3 steps: global_step is 4.000000, w is 2.095001, learning rate is 0.097030, loss is 9.579033
After 4 steps: global_step is 5.000000, w is 1.494386, learning rate is 0.096060, loss is 6.221960
After 5 steps: global_step is 6.000000, w is 1.015166, learning rate is 0.095099, loss is 4.060895
After 6 steps: global_step is 7.000000, w is 0.631886, learning rate is 0.094148, loss is 2.663051
After 7 steps: global_step is 8.000000, w is 0.324608, learning rate is 0.093207, loss is 1.754587
After 8 steps: global_step is 9.000000, w is 0.077684, learning rate is 0.092274, loss is 1.161402
After 9 steps: global_step is 10.000000, w is -0.121202, learning rate is 0.091352, loss is 0.772287
After 10 steps: global_step is 11.000000, w is -0.281761, learning rate is 0.090438, loss is 0.515867
After 11 steps: global_step is 12.000000, w is -0.411674, learning rate is 0.089534, loss is 0.346128
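The decay schedule can be checked by hand. A minimal pure-Python recomputation of the formula with this example's constants (LEARNING_RATE_STEP = 1) reproduces the learning rates printed above:

```python
LEARNING_RATE_BASE = 0.1
LEARNING_RATE_DECAY = 0.99
LEARNING_RATE_STEP = 1

def decayed_lr(global_step, staircase=True):
    exponent = global_step / LEARNING_RATE_STEP
    if staircase:
        # Truncating the exponent to an integer gives the staircase shape.
        exponent = int(exponent)
    return LEARNING_RATE_BASE * LEARNING_RATE_DECAY ** exponent

for step in range(3):
    print(decayed_lr(step))  # 0.1, then 0.099, then 0.09801
```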
- Exponential moving average: tracks, over time, an average of every parameter w and b in the model. Using these averaged values can improve the model's generalization.
ema = tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY,global_step)
where MOVING_AVERAGE_DECAY is the moving-average decay rate (usually set close to 1) and global_step is the current training step.
ema_op = ema.apply(tf.trainable_variables())
where ema.apply() computes the moving average of each parameter passed to it, and tf.trainable_variables() gathers all trainable parameters into a list.
with tf.control_dependencies([train_step, ema_op]):
    train_op = tf.no_op(name='train')
This construct runs the moving-average update together with each training step.
To read a parameter's moving average, use the ema.average() function.
shadow = decay * shadow + (1 - decay) * parameter
With MOVING_AVERAGE_DECAY set to 0.99, parameter w1 and its moving average both initialized to 0, and global_step starting at 0: after w1 is updated to 1, the moving average of w1 becomes
moving average of w1 = min(0.99, 1/10) * 0 + (1 - min(0.99, 1/10)) * 1 = 0.9
(the effective decay is min(MOVING_AVERAGE_DECAY, (1 + global_step) / (10 + global_step)), which is 1/10 at global_step = 0).
import tensorflow as tf
w1 = tf.Variable(0, dtype=tf.float32)
global_step = tf.Variable(0, trainable=False)
MOVING_AVERAGE_DECAY = 0.99
ema = tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY, global_step)
ema_op = ema.apply(tf.trainable_variables())
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print("current global_step:", sess.run(global_step))
    print("current w1", sess.run([w1, ema.average(w1)]))
    sess.run(tf.assign(w1, 1))
    sess.run(ema_op)
    print("current global_step:", sess.run(global_step))
    print("current w1", sess.run([w1, ema.average(w1)]))
    sess.run(tf.assign(w1, 10))
    sess.run(ema_op)
    print("current global_step:", sess.run(global_step))
    print("current w1", sess.run([w1, ema.average(w1)]))
out:
current global_step: 0
current w1 [0.0, 0.0]
current global_step: 0
current w1 [1.0, 0.9]
current global_step: 0
current w1 [10.0, 9.09]
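The two printed averages (0.9, then 9.09) can be reproduced by hand. Since global_step stays at 0 in this example, the effective decay is min(0.99, 1/10) = 0.1 for both updates. A pure-Python sketch of the shadow update:

```python
MOVING_AVERAGE_DECAY = 0.99

def update_shadow(shadow, param, num_updates):
    # Effective decay when ExponentialMovingAverage is given a
    # num_updates (global_step) argument.
    decay = min(MOVING_AVERAGE_DECAY, (1.0 + num_updates) / (10.0 + num_updates))
    # shadow = decay * shadow + (1 - decay) * parameter
    return decay * shadow + (1.0 - decay) * param

shadow = 0.0
shadow = update_shadow(shadow, 1.0, 0)
print(shadow)  # ≈ 0.9
shadow = update_shadow(shadow, 10.0, 0)
print(shadow)  # ≈ 9.09
```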
- Regularization: adds a penalty on each weight w to the loss function, introducing a model-complexity term that suppresses fitting to noise and reduces overfitting.
loss = loss(y, y_) + REGULARIZER * loss(w)
L1 regularization: loss_L1 = Σᵢ |wᵢ|. In TensorFlow: loss(w) = tf.contrib.layers.l1_regularizer(REGULARIZER)(w)
L2 regularization: loss_L2 = Σᵢ wᵢ². In TensorFlow: loss(w) = tf.contrib.layers.l2_regularizer(REGULARIZER)(w)
tf.add_to_collection('losses', tf.contrib.layers.l2_regularizer(regularizer)(w))
loss = cem + tf.add_n(tf.get_collection('losses'))
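To make the penalty terms concrete, here is a pure-Python sketch of the two formulas above, computed directly rather than via TensorFlow. (Note that TensorFlow's l2_regularizer is built on tf.nn.l2_loss, which additionally halves the sum of squares, so the framework's L2 penalty is half of this sketch's value.)

```python
def l1_penalty(weights, regularizer):
    # loss_L1 = REGULARIZER * sum(|w_i|)
    return regularizer * sum(abs(w) for w in weights)

def l2_penalty(weights, regularizer):
    # loss_L2 = REGULARIZER * sum(w_i^2)
    return regularizer * sum(w * w for w in weights)

weights = [1.0, -2.0, 3.0]
print(l1_penalty(weights, 0.01))  # 0.01 * (1 + 2 + 3) = 0.06
print(l2_penalty(weights, 0.01))  # 0.01 * (1 + 4 + 9) = 0.14
```

The L1 penalty tends to push small weights all the way to zero (sparse models), while the L2 penalty shrinks all weights smoothly without zeroing them.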
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
BATCH_SIZE = 30
X = np.random.randn(300, 2)
Y_ = [int(x0**2 + x1**2 < 2) for (x0, x1) in X]
Y_c = [['red' if y else 'blue'] for y in Y_]
X = np.vstack(X).reshape(-1, 2)
Y_ = np.vstack(Y_).reshape(-1, 1)
plt.scatter(X[:, 0], X[:, 1], c=np.squeeze(Y_c))
plt.show()

def get_weight(shape, regularizer):
    w = tf.Variable(tf.random_normal(shape), dtype=tf.float32)
    tf.add_to_collection('losses', tf.contrib.layers.l2_regularizer(regularizer)(w))
    return w

def get_bias(shape):
    b = tf.Variable(tf.constant(0.01, shape=shape))
    return b

x = tf.placeholder(tf.float32, shape=[None, 2])
y_ = tf.placeholder(tf.float32, shape=[None, 1])
w1 = get_weight([2, 11], 0.01)
b1 = get_bias([11])
y1 = tf.nn.relu(tf.matmul(x, w1) + b1)
w2 = get_weight([11, 1], 0.01)
b2 = get_bias([1])
y = tf.matmul(y1, w2) + b2
loss_mse = tf.reduce_mean(tf.square(y - y_))
loss_total = loss_mse + tf.add_n(tf.get_collection('losses'))

# First train on loss_mse alone (no regularization).
train_step = tf.train.AdamOptimizer(0.0001).minimize(loss_mse)
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    STEPS = 20000
    for i in range(STEPS):
        start = (i * BATCH_SIZE) % 300
        end = start + BATCH_SIZE
        sess.run(train_step, feed_dict={x: X[start:end], y_: Y_[start:end]})
        if i % 2000 == 0:
            loss_mse_v = sess.run(loss_mse, feed_dict={x: X, y_: Y_})
            print("After %d steps, loss is: %f" % (i, loss_mse_v))
    xx, yy = np.mgrid[-3:3:.01, -3:3:.01]
    grid = np.c_[xx.ravel(), yy.ravel()]
    probs = sess.run(y, feed_dict={x: grid})
    probs = probs.reshape(xx.shape)
plt.scatter(X[:, 0], X[:, 1], c=np.squeeze(Y_c))
plt.contour(xx, yy, probs, levels=[.5])
plt.show()

# Train again, this time on loss_total (with L2 regularization).
train_step = tf.train.AdamOptimizer(0.0001).minimize(loss_total)
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    STEPS = 20000
    for i in range(STEPS):
        start = (i * BATCH_SIZE) % 300
        end = start + BATCH_SIZE
        sess.run(train_step, feed_dict={x: X[start:end], y_: Y_[start:end]})
        if i % 2000 == 0:
            loss_v = sess.run(loss_total, feed_dict={x: X, y_: Y_})
            print("After %d steps, loss is: %f" % (i, loss_v))
    xx, yy = np.mgrid[-3:3:.01, -3:3:.01]
    grid = np.c_[xx.ravel(), yy.ravel()]
    probs = sess.run(y, feed_dict={x: grid})
    probs = probs.reshape(xx.shape)
plt.scatter(X[:, 0], X[:, 1], c=np.squeeze(Y_c))
plt.contour(xx, yy, probs, levels=[.5])
plt.show()
- Modular programming: split the program into three modules: 1. generateds (data generation), 2. forward (forward propagation), 3. backward (training).
# generateds.py
import numpy as np
seed = 2
def generateds():
    # Seeded random generator for reproducibility.
    rdm = np.random.RandomState(seed)
    # 300 rows x 2 columns: 300 input points (x0, x1).
    X = rdm.randn(300, 2)
    # Label each row: 1 if x0^2 + x1^2 < 2, else 0 (the "correct answers").
    Y_ = [int(x0 * x0 + x1 * x1 < 2) for (x0, x1) in X]
    # Reshape: -1 infers the row count from the column count; X has 2 columns, Y_ has 1.
    X = np.vstack(X).reshape(-1, 2)
    Y_ = np.vstack(Y_).reshape(-1, 1)
    return X, Y_
# forward.py
import tensorflow as tf
# Define the network's parameters and the forward-propagation computation.
def get_weight(shape, regularizer):
    w = tf.Variable(tf.random_normal(shape), dtype=tf.float32)
    tf.add_to_collection('losses', tf.contrib.layers.l2_regularizer(regularizer)(w))
    return w
def get_bias(shape):
    b = tf.Variable(tf.constant(0.01, shape=shape))
    return b
def forward(x, regularizer):
    w1 = get_weight([2, 11], regularizer)
    b1 = get_bias([11])
    y1 = tf.nn.relu(tf.matmul(x, w1) + b1)
    w2 = get_weight([11, 1], regularizer)
    b2 = get_bias([1])
    y = tf.matmul(y1, w2) + b2
    return y
# backward.py
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import forward
import generateds
STEPS = 40000
BATCH_SIZE = 30
LEARNING_RATE_BASE = 0.001
LEARNING_RATE_DECAY = 0.999
REGULARIZER = 0.01
def backward():
    x = tf.placeholder(tf.float32, shape=(None, 2))
    y_ = tf.placeholder(tf.float32, shape=(None, 1))
    X, Y_ = generateds.generateds()
    y = forward.forward(x, REGULARIZER)
    global_step = tf.Variable(0, trainable=False)
    learning_rate = tf.train.exponential_decay(
        LEARNING_RATE_BASE,
        global_step,
        300 / BATCH_SIZE,
        LEARNING_RATE_DECAY,
        staircase=True)
    # Loss: MSE plus the collected L2 regularization terms.
    loss_mse = tf.reduce_mean(tf.square(y - y_))
    loss_total = loss_mse + tf.add_n(tf.get_collection('losses'))
    # Back-propagation method, with regularization.
    train_step = tf.train.AdamOptimizer(learning_rate).minimize(loss_total)
    with tf.Session() as sess:
        init_op = tf.global_variables_initializer()
        sess.run(init_op)
        for i in range(STEPS):
            start = (i * BATCH_SIZE) % 300
            end = start + BATCH_SIZE
            sess.run(train_step, feed_dict={x: X[start:end], y_: Y_[start:end]})
            if i % 2000 == 0:
                loss_v = sess.run(loss_total, feed_dict={x: X, y_: Y_})
                print("After %d steps, loss is: %f" % (i, loss_v))
if __name__ == '__main__':
    backward()
out:
After 30000 steps, loss is: 0.090862
After 32000 steps, loss is: 0.090853
After 34000 steps, loss is: 0.090845
After 36000 steps, loss is: 0.090838
After 38000 steps, loss is: 0.090823