TensorFlow 1.x, Part 2

Topics: activation functions, mean squared error (MSE), learning-rate (learning_rate) scheduling strategies, exponential moving average, regularization, and modular programming.

  • Activation functions: introduce non-linearity and increase the expressive power of the model.
    Sigmoid (tf.nn.sigmoid() in TensorFlow) squashes its input into (0, 1), so its output is always
    positive, which motivates the next function. Tanh (tf.nn.tanh()) is zero-centered but still
    saturates, so gradients vanish when inputs are large in magnitude, which in turn motivates
    ReLU (tf.nn.relu()). A minimal sketch of the three follows this list.
  • Mean squared error (MSE): the mean over n samples of the squared difference between the
    prediction y and the known answer y_. In TensorFlow: loss_mse = tf.reduce_mean(tf.square(y_ - y)).
    A full training example using this loss follows the activation sketch below.
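A minimal sketch of the three activation functions (the tensor z is only illustrative pre-activation values, not part of any model above):

import tensorflow as tf

# Illustrative pre-activation values, e.g. the result of x*w + b
z = tf.constant([[-2.0, -0.5, 0.0, 0.5, 2.0]])

a_sigmoid = tf.nn.sigmoid(z)  # squashes into (0, 1), so always positive
a_tanh = tf.nn.tanh(z)        # squashes into (-1, 1), zero-centered but saturates
a_relu = tf.nn.relu(z)        # max(0, z), does not saturate for positive inputs

with tf.Session() as sess:
    print('sigmoid:', sess.run(a_sigmoid))
    print('tanh:', sess.run(a_tanh))
    print('relu:', sess.run(a_relu))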
#0 Import modules and generate the dataset
import tensorflow as tf
import numpy as np
BATCH_SIZE = 8

# X: 32 samples with 2 features; the label is x1 + x2 plus noise in [-0.05, 0.05),
# so the ideal weights are close to [1, 1]
X = np.random.rand(32,2)
Y_ = [[x1+x2+np.random.rand()/10.-0.05] for (x1,x2) in X]
#1 Define the network inputs, parameters and output, and the forward propagation
x = tf.placeholder(tf.float32,shape=[None,2])
y_ = tf.placeholder(tf.float32,shape=[None,1])
w1 = tf.Variable(tf.random_normal([2,1],stddev=1))
y = tf.matmul(x,w1)
#2 Define the loss function and the backpropagation method.
#The loss is MSE and backpropagation uses gradient descent.
loss_mse = tf.reduce_mean(tf.square(y-y_))
train_step = tf.train.GradientDescentOptimizer(0.001).minimize(loss_mse)
#3 Create a session and train for 20000 steps
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(20000):
        start = (i*BATCH_SIZE)%32
        end = start+BATCH_SIZE
        loss_mse_dis,_ = sess.run([loss_mse,train_step],feed_dict={x:X[start:end],y_:Y_[start:end]})
        if i %2000 == 0:
            print('After {}steps loss:{}'.format(i,loss_mse_dis))
    print('w1:', w1)            # prints the Variable object itself
    print('w1:', sess.run(w1))  # prints the Variable's current value
out:
After 0steps loss:0.22664415836334229
After 2000steps loss:0.03838531672954559
After 4000steps loss:0.01794697903096676
After 6000steps loss:0.009312132373452187
After 8000steps loss:0.005021141842007637
After 10000steps loss:0.0028489811811596155
After 12000steps loss:0.001746544730849564
After 14000steps loss:0.0011868053115904331
After 16000steps loss:0.0009025642066262662
After 18000steps loss:0.0007582578109577298
w1: <tf.Variable 'Variable:0' shape=(2, 1) dtype=float32_ref>
w1: [[0.9663933]
 [1.0228527]]

Process finished with exit code 0

  • Learning rate learning_rate: how large each parameter update is. A learning rate that is too
    large makes the parameters oscillate around the minimum without converging; one that is too
    small makes convergence slow. A common strategy is an exponentially decaying learning rate
    that is updated as training proceeds:
    learning_rate = LEARNING_RATE_BASE * LEARNING_RATE_DECAY ^ (global_step / LEARNING_RATE_STEP)
    where LEARNING_RATE_STEP is how often the learning rate is updated (once every that many batches). In TensorFlow:
    global_step = tf.Variable(0, trainable=False)
    learning_rate = tf.train.exponential_decay(
    LEARNING_RATE_BASE,
    global_step,
    LEARNING_RATE_STEP, LEARNING_RATE_DECAY,
    staircase=True/False)
    Here LEARNING_RATE_BASE is the initial learning rate, LEARNING_RATE_DECAY is the decay rate,
    and global_step records the number of training steps so far and is declared non-trainable.
    LEARNING_RATE_STEP is usually set to the total number of samples divided by the batch size.
    With staircase=True, global_step / LEARNING_RATE_STEP is truncated to an integer, so the
    learning rate decays in a staircase pattern; with staircase=False it decays as a smooth curve.
import tensorflow as tf
LEARNING_RATE_BASE = 0.1
LEARNING_RATE_DECAY = 0.99
LEARNING_RATE_STEP = 1

global_step = tf.Variable(0,trainable=False)
learning_rate = tf.train.exponential_decay(LEARNING_RATE_BASE,
                                           global_step,
                                           LEARNING_RATE_STEP,
                                           LEARNING_RATE_DECAY,
                                           staircase=True)
# Minimize loss = (w + 1)^2 starting from w = 5; the optimum is w = -1
w = tf.Variable(tf.constant(5.,tf.float32))
loss = tf.square(w+1)
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss,global_step=global_step)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(40):
        w_dis,learning_rate_dis,loss_dis,global_step_dis,_ = sess.run([w,learning_rate,loss,global_step,train_step])
        print ("After %s steps: global_step is %f, w is %f, learning rate is %f, loss is %f" % (i, global_step_dis, w_dis, learning_rate_dis, loss_dis))
out:
After 0 steps: global_step is 1.000000, w is 5.000000, learning rate is 0.100000, loss is 36.000000
After 1 steps: global_step is 2.000000, w is 3.800000, learning rate is 0.099000, loss is 23.040001
After 2 steps: global_step is 3.000000, w is 2.849600, learning rate is 0.098010, loss is 14.819419
After 3 steps: global_step is 4.000000, w is 2.095001, learning rate is 0.097030, loss is 9.579033
After 4 steps: global_step is 5.000000, w is 1.494386, learning rate is 0.096060, loss is 6.221960
After 5 steps: global_step is 6.000000, w is 1.015166, learning rate is 0.095099, loss is 4.060895
After 6 steps: global_step is 7.000000, w is 0.631886, learning rate is 0.094148, loss is 2.663051
After 7 steps: global_step is 8.000000, w is 0.324608, learning rate is 0.093207, loss is 1.754587
After 8 steps: global_step is 9.000000, w is 0.077684, learning rate is 0.092274, loss is 1.161402
After 9 steps: global_step is 10.000000, w is -0.121202, learning rate is 0.091352, loss is 0.772287
After 10 steps: global_step is 11.000000, w is -0.281761, learning rate is 0.090438, loss is 0.515867
After 11 steps: global_step is 12.000000, w is -0.411674, learning rate is 0.089534, loss is 0.346128
  • Exponential moving average (EMA): keeps a running average of every parameter w and b in the
    model over time. Using these averaged values can improve the model's ability to generalize.
    ema = tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY, global_step)
    where MOVING_AVERAGE_DECAY is the moving-average decay rate, usually set close to 1, and
    global_step is the number of training steps so far.
    ema_op = ema.apply(tf.trainable_variables())
    ema.apply() computes the moving average of the parameters passed to it;
    tf.trainable_variables() gathers all trainable parameters into a list.
    with tf.control_dependencies([train_step, ema_op]): train_op = tf.no_op(name='train')
    ties the moving-average update to the training step so that the two run together.
    The averaged value of a parameter can be read back with ema.average().
    Update rule: shadow = decay * shadow + (1 - decay) * parameter,
    where decay = min(MOVING_AVERAGE_DECAY, (1 + global_step) / (10 + global_step)).
    Example: with MOVING_AVERAGE_DECAY = 0.99, w1 = 0, its shadow initialized to 0 and
    global_step = 0, updating w1 to 1 gives
    shadow(w1) = min(0.99, 1/10) * 0 + (1 - min(0.99, 1/10)) * 1 = 0.9.
    Updating w1 to 10 (global_step still 0) then gives 0.1 * 0.9 + 0.9 * 10 = 9.09, which matches
    the output below.
import tensorflow as tf

w1 = tf.Variable(0,dtype=tf.float32)
global_step = tf.Variable(0,trainable=False)
MOVING_AVERAGE_DECAY = 0.99
ema = tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY,global_step)
ema_op = ema.apply(tf.trainable_variables())

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    print("current global_step:", sess.run(global_step))
    print("current w1", sess.run([w1, ema.average(w1)]) )

    sess.run(tf.assign(w1, 1))
    sess.run(ema_op)
    print("current global_step:", sess.run(global_step))
    print("current w1", sess.run([w1, ema.average(w1)]))

    sess.run(tf.assign(w1, 10))
    sess.run(ema_op)
    print("current global_step:", sess.run(global_step))
    print("current w1", sess.run([w1, ema.average(w1)]))
out:
current global_step: 0
current w1 [0.0, 0.0]
current global_step: 0
current w1 [1.0, 0.9]
current global_step: 0
current w1 [10.0, 9.09]

Process finished with exit code 0
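The example above runs ema_op by hand with sess.run. As a minimal sketch, the with tf.control_dependencies pattern quoted in the bullet above can bind the moving-average update to the training step, so that one sess.run(train_op) performs both; the toy loss (w + 1)^2 and the fixed learning rate 0.1 here are only illustrative:

import tensorflow as tf

MOVING_AVERAGE_DECAY = 0.99

w = tf.Variable(tf.constant(5., tf.float32))
global_step = tf.Variable(0, trainable=False)
loss = tf.square(w + 1)
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss, global_step=global_step)

ema = tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY, global_step)
ema_op = ema.apply(tf.trainable_variables())

# Each run of train_op triggers both the optimizer step and the moving-average update
with tf.control_dependencies([train_step, ema_op]):
    train_op = tf.no_op(name='train')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(5):
        sess.run(train_op)
        print(sess.run([w, ema.average(w)]))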

  • Regularization: add a weight penalty for every parameter w to the loss function, introducing a
    model-complexity term that suppresses noise in the model and reduces overfitting.
    loss = loss(y, y_) + REGULARIZER * loss(w)
    L1 regularization: loss_L1 = sum_i |w_i|; in TensorFlow: loss(w) = tf.contrib.layers.l1_regularizer(REGULARIZER)(w)
    L2 regularization: loss_L2 = sum_i |w_i|^2; in TensorFlow: loss(w) = tf.contrib.layers.l2_regularizer(REGULARIZER)(w)
    tf.add_to_collection('losses', tf.contrib.layers.l2_regularizer(regularizer)(w))
    loss = cem + tf.add_n(tf.get_collection('losses'))
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
BATCH_SIZE = 30

# 300 random points; label 1 (red) if the point lies inside the circle x0^2 + x1^2 < 2, else 0 (blue)
X = np.random.randn(300,2)
Y_ = [int(x0**2+x1**2<2) for (x0,x1) in X]
Y_c = [['red' if y else 'blue'] for y in Y_]
X = np.vstack(X).reshape(-1,2)
Y_ = np.vstack(Y_).reshape(-1,1)
plt.scatter(X[:,0], X[:,1], c=np.squeeze(Y_c))
plt.show()

def get_weight(shape,regularizer):
    w = tf.Variable(tf.random_normal(shape),dtype = tf.float32)
    tf.add_to_collection('losses',tf.contrib.layers.l2_regularizer(regularizer)(w))
    return w
def get_bias(shape):
    b = tf.Variable(tf.constant(0.01,shape=shape))
    return b

x = tf.placeholder(tf.float32,shape=[None,2])
y_ = tf.placeholder(tf.float32,shape=[None,1])
w1 = get_weight([2,11],0.01)
b1 = get_bias([11])
y1 = tf.nn.relu(tf.matmul(x,w1)+b1)

w2 = get_weight([11,1],0.01)
b2 = get_bias([1])
y = tf.matmul(y1,w2)+b2

loss_mse = tf.reduce_mean(tf.square(y-y_))
loss_total = loss_mse + tf.add_n(tf.get_collection('losses'))

# First train without regularization: minimize loss_mse only
train_step = tf.train.AdamOptimizer(0.0001).minimize(loss_mse)
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    STEPS = 20000
    for i in range(STEPS):
        start = (i*BATCH_SIZE) % 300
        end = start + BATCH_SIZE
        sess.run(train_step, feed_dict={x:X[start:end], y_:Y_[start:end]})
        if i % 2000 == 0:
            loss_mse_v = sess.run(loss_mse, feed_dict={x:X, y_:Y_})
            print("After %d steps, loss is: %f" %(i, loss_mse_v))


    xx, yy = np.mgrid[-3:3:.01, -3:3:.01]
    grid = np.c_[xx.ravel(), yy.ravel()]
    probs = sess.run(y, feed_dict={x: grid})
    probs = probs.reshape(xx.shape)
plt.scatter(X[:,0], X[:,1], c=np.squeeze(Y_c))
plt.contour(xx, yy, probs, levels=[.5])
plt.show()
# Now train with regularization: minimize loss_total
train_step = tf.train.AdamOptimizer(0.0001).minimize(loss_total)

with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    STEPS = 20000
    for i in range(STEPS):
        start = (i*BATCH_SIZE) % 300
        end = start + BATCH_SIZE
        sess.run(train_step, feed_dict={x: X[start:end], y_:Y_[start:end]})
        if i % 2000 == 0:
            loss_v = sess.run(loss_total, feed_dict={x:X,y_:Y_})
            print("After %d steps, loss is: %f" %(i, loss_v))

    xx, yy = np.mgrid[-3:3:.01, -3:3:.01]
    grid = np.c_[xx.ravel(), yy.ravel()]
    probs = sess.run(y, feed_dict={x:grid})
    probs = probs.reshape(xx.shape)
plt.scatter(X[:,0], X[:,1], c=np.squeeze(Y_c))
plt.contour(xx, yy, probs, levels=[.5])
plt.show()


(Figures from the plt.show() calls above: the data scatter plot, and the fitted decision boundary without and with regularization.)

  • Modular programming: split the model into three modules: 1. generateds (data generation), 2. forward (forward propagation), 3. backward (training)
# generateds.py
import numpy as np
seed = 2
def generateds():
    # Random number generator seeded with `seed`
    rdm = np.random.RandomState(seed)
    # Draw a 300x2 matrix: 300 coordinate points (x0, x1) used as the input data set
    X = rdm.randn(300, 2)
    # For each row of X, assign label 1 if the sum of squares of the two coordinates is less than 2, otherwise 0
    # These labels are the "correct answers" for the input data set
    Y_ = [int(x0 * x0 + x1 * x1 < 2) for (x0, x1) in X]
    # Reshape X and Y_: -1 lets the number of rows be inferred; X has 2 columns, Y_ has 1
    X = np.vstack(X).reshape(-1, 2)
    Y_ = np.vstack(Y_).reshape(-1, 1)

    return X, Y_
# forward.py
#0 Import modules
import tensorflow as tf


# Define the network parameters and the forward propagation
def get_weight(shape, regularizer):
    w = tf.Variable(tf.random_normal(shape), dtype=tf.float32)
    tf.add_to_collection('losses', tf.contrib.layers.l2_regularizer(regularizer)(w))
    return w


def get_bias(shape):
    b = tf.Variable(tf.constant(0.01, shape=shape))
    return b


def forward(x, regularizer):
    w1 = get_weight([2, 11], regularizer)
    b1 = get_bias([11])
    y1 = tf.nn.relu(tf.matmul(x, w1) + b1)

    w2 = get_weight([11, 1], regularizer)
    b2 = get_bias([1])
    y = tf.matmul(y1, w2) + b2

    return y
# backward.py
#0 Import modules
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import forward
import generateds

STEPS = 40000
BATCH_SIZE = 30
LEARNING_RATE_BASE = 0.001
LEARNING_RATE_DECAY = 0.999
REGULARIZER = 0.01


def backward():
    x = tf.placeholder(tf.float32, shape=(None, 2))
    y_ = tf.placeholder(tf.float32, shape=(None, 1))

    X, Y_= generateds.generateds()

    y = forward.forward(x, REGULARIZER)

    global_step = tf.Variable(0, trainable=False)

    learning_rate = tf.train.exponential_decay(
        LEARNING_RATE_BASE,
        global_step,
        300 / BATCH_SIZE,
        LEARNING_RATE_DECAY,
        staircase=True)

    # Define the loss function: MSE plus the collected regularization losses
    loss_mse = tf.reduce_mean(tf.square(y - y_))
    loss_total = loss_mse + tf.add_n(tf.get_collection('losses'))

    # Define the backpropagation method, including regularization
    train_step = tf.train.AdamOptimizer(learning_rate).minimize(loss_total)

    with tf.Session() as sess:
        init_op = tf.global_variables_initializer()
        sess.run(init_op)
        for i in range(STEPS):
            start = (i * BATCH_SIZE) % 300
            end = start + BATCH_SIZE
            sess.run(train_step, feed_dict={x: X[start:end], y_: Y_[start:end]})
            if i % 2000 == 0:
                loss_v = sess.run(loss_total, feed_dict={x: X, y_: Y_})
                print("After %d steps, loss is: %f" % (i, loss_v))
if __name__ == '__main__':
    backward()
out:
After 30000 steps, loss is: 0.090862
After 32000 steps, loss is: 0.090853
After 34000 steps, loss is: 0.090845
After 36000 steps, loss is: 0.090838
After 38000 steps, loss is: 0.090823
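backward.py imports numpy and matplotlib but never plots. As a sketch (not part of the original modular code), the decision-boundary plot from the earlier regularization script could be reproduced by appending the following at the end of backward():

        # Still inside the `with tf.Session() as sess:` block: evaluate the network on a grid
        xx, yy = np.mgrid[-3:3:.01, -3:3:.01]
        grid = np.c_[xx.ravel(), yy.ravel()]
        probs = sess.run(y, feed_dict={x: grid})
        probs = probs.reshape(xx.shape)

    # Back at function level: draw the data points and the y = 0.5 contour
    Y_c = ['red' if v else 'blue' for v in Y_.ravel()]
    plt.scatter(X[:, 0], X[:, 1], c=Y_c)
    plt.contour(xx, yy, probs, levels=[.5])
    plt.show()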