TensorFlow Study Notes, Part 3

弘彰

Article link: https://blog.csdn.net/weixin_42216171/article/details/88560284

Advanced TensorFlow: Optimizing Neural Networks

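1. Overfitting and regularization
Overfitting: the neural network achieves high accuracy on the training set but low accuracy when predicting or classifying new data, which indicates that the model generalizes poorly.
Regularization: add a weighted penalty on the parameters w to the loss function. This introduces a measure of model complexity into the loss, which discourages the model from fitting noise and reduces overfitting. With regularization, the loss becomes the sum of two terms:
loss = loss(y, y_) + REGULARIZER * loss(w)
The first term is the gap between the predictions and the ground-truth labels, for example the cross-entropy or mean squared error covered earlier; the second term is the regularization penalty, and REGULARIZER is the weight given to it.
Ways to compute the penalty:
① L1 regularization: $loss_{L1}(w) = \sum_i |w_i|$
Expressed with a TensorFlow function: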

 loss(w) = tf.contrib.layers.l1_regularizer(REGULARIZER)(w)

② L2 regularization: $loss_{L2}(w) = \sum_i w_i^2$
Expressed with a TensorFlow function:

 loss(w) = tf.contrib.layers.l2_regularizer(REGULARIZER)(w)
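As a minimal sketch in plain Python (no TensorFlow required), the two penalties can be computed by hand. The helper names l1_penalty and l2_penalty and the sample weights are hypothetical. One detail worth knowing about the TF 1.x implementation: tf.contrib.layers.l2_regularizer(scale) is built on tf.nn.l2_loss, which computes sum(w^2) / 2, so the value it returns is scale * ∑w^2 / 2. The sketch follows that convention.

```python
def l1_penalty(weights, scale):
    """scale * sum(|w|), the value l1_regularizer(scale)(w) evaluates to."""
    return scale * sum(abs(w) for row in weights for w in row)


def l2_penalty(weights, scale):
    """scale * sum(w^2) / 2, matching l2_regularizer built on tf.nn.l2_loss."""
    return scale * sum(w * w for row in weights for w in row) / 2.0


# A hypothetical 2x2 weight matrix used only for illustration.
w = [[1.0, -2.0], [3.0, -4.0]]
REGULARIZER = 0.01

l1_value = l1_penalty(w, REGULARIZER)  # 0.01 * (1 + 2 + 3 + 4)
l2_value = l2_penalty(w, REGULARIZER)  # 0.01 * (1 + 4 + 9 + 16) / 2
print(l1_value, l2_value)
```

Because the scale is already applied inside the regularizer, the returned value can be added to the data loss directly.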

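Implementing L2 regularization with TensorFlow functions (cem stands for the data-loss term, for example the mean cross-entropy):

tf.add_to_collection('losses', tf.contrib.layers.l2_regularizer(regularizer)(w))
loss = cem + tf.add_n(tf.get_collection('losses'))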
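The collection pattern can be illustrated without TensorFlow. The sketch below is a hypothetical plain-Python stand-in for a named collection, not TensorFlow's implementation: each layer registers its own penalty under the name 'losses', and the total loss is the data loss plus the sum of everything registered. The weight values and the value of cem are made up for illustration.

```python
from collections import defaultdict

# Hypothetical stand-in for TF 1.x graph collections: name -> list of values.
_collections = defaultdict(list)


def add_to_collection(name, value):
    """Append a value to the named collection."""
    _collections[name].append(value)


def get_collection(name):
    """Return a copy of everything stored under the name."""
    return list(_collections[name])


def l2_penalty(weights, scale):
    """scale * sum(w^2) / 2 for a flat list of weights."""
    return scale * sum(w * w for w in weights) / 2.0


# Each layer registers its penalty at the moment its weights are created.
add_to_collection('losses', l2_penalty([1.0, 2.0], 0.01))  # layer 1
add_to_collection('losses', l2_penalty([3.0], 0.01))       # layer 2

cem = 0.5  # hypothetical data-loss value
loss = cem + sum(get_collection('losses'))
print(loss)
```

The benefit of this design is that the code computing the total loss does not need to know how many layers exist.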

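Example:
Use 300 points X[x0, x1] drawn from a normal distribution as the dataset, and generate the label Y_ for each point from its coordinates, marking each point red or blue. The labeling rule: if x0^2 + x1^2 < 2, then y_ = 1 and the point is red; if x0^2 + x1^2 ≥ 2, then y_ = 0 and the point is blue. We then fit a curve that separates the red points from the blue points twice, once without regularization and once with it.
The code is as follows: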

# coding:utf-8
# Step 0: import modules and generate the simulated dataset (TensorFlow 1.x API)
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

BATCH_SIZE = 30
seed = 2
rdm = np.random.RandomState(seed)
X = rdm.randn(300,2)
Y_ = [int(x0*x0 +x1*x1 < 2) for (x0,x1) in X]
Y_c = [['red' if y else 'blue'] for y in Y_]
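# Reshape the dataset X and the labels Y_. A first dimension of -1 means it is inferred
# from the other dimension; the second value is the number of columns, so X becomes n rows by 2 columns.
# np.vstack stacks the arrays vertically into one array, which is then reshaped to the target dimensions.
X = np.vstack(X).reshape(-1,2)
# Reshape Y_ into n rows by 1 column
Y_ = np.vstack(Y_).reshape(-1,1)
# print(X,'\n',Y_,'\n',Y_c)

# Draw the scatter plot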
plt.scatter(X[:,0],X[:,1],c = np.squeeze(Y_c))
plt.show()

# Define the network's inputs, parameters, and outputs and the forward pass, adding the regularization term
def get_weight(shape,regularizer):
    w = tf.Variable(tf.random_normal(shape),dtype=tf.float32)
    tf.add_to_collection('losses',tf.contrib.layers.l2_regularizer(regularizer)(w))
    return w

def get_bias(shape):
    b = tf.Variable(tf.constant(0.01,shape=shape))
    return b

x = tf.placeholder(tf.float32,shape=(None,2))
y_ = tf.placeholder(tf.float32,shape=(None,1))

w1 = get_weight([2,11],0.01)
b1 = get_bias([11])
y1 = tf.nn.relu(tf.matmul(x,w1)+b1)

w2 = get_weight([11,1],0.01)
b2 = get_bias([1])
y = tf.matmul(y1,w2)+b2

# Define the loss functions
loss_mse = tf.reduce_mean(tf.square(y-y_))
loss_total = loss_mse + tf.add_n(tf.get_collection('losses'))

# Define the backpropagation method: without regularization
train_step = tf.train.AdamOptimizer(0.0001).minimize(loss_mse)

with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    STEPS = 40000
    for i in range(STEPS):
        # Step through the 300 samples one batch at a time
        start = (i*BATCH_SIZE)%300
        end = start +BATCH_SIZE
        sess.run(train_step,feed_dict={x:X[start:end],y_:Y_[start:end]})
        if i %2000 == 0:
            loss_mse_v = sess.run(loss_mse,feed_dict={x:X,y_:Y_})
            print("After %d steps,loss is : %f"%(i,loss_mse_v))
    # After training, draw the model's decision boundary, using xx and yy as the test set
    # xx and yy each run from -3 to 3 in steps of 0.01, forming a 2-D grid of coordinates
    xx,yy = np.mgrid[-3:3:0.01,-3:3:0.01]
    # Flatten xx and yy and join them into a two-column matrix, giving the set of grid points
    grid = np.c_[xx.ravel(),yy.ravel()]
    # Feed the grid points to the network; probs is the output
    probs = sess.run(y,feed_dict={x:grid})
    # Reshape probs to match xx
    probs = probs.reshape(xx.shape)
    # print("w1:\n",sess.run(w1))
    # print("b1:\n", sess.run(b1))
    # print("w2:\n", sess.run(w2))
plt.scatter(X[:,0],X[:,1],c = np.squeeze(Y_c))
plt.contour(xx,yy,probs,levels = [.5])
plt.show()


# Define the backpropagation method: with regularization
train_step = tf.train.AdamOptimizer(0.0001).minimize(loss_total)

with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    STEPS = 40000
    for i in range(STEPS):
        # Step through the 300 samples one batch at a time
        start = (i*BATCH_SIZE)%300
        end = start +BATCH_SIZE
        sess.run(train_step,feed_dict={x:X[start:end],y_:Y_[start:end]})
        if i %2000 == 0:
            loss_total_v = sess.run(loss_total,feed_dict={x:X,y_:Y_})
            print("After %d steps,loss is : %f"%(i,loss_total_v))
    # After training, draw the model's decision boundary
    # xx and yy each run from -3 to 3 in steps of 0.01, forming a 2-D grid of coordinates
    xx,yy = np.mgrid[-3:3:0.01,-3:3:0.01]
    # Flatten xx and yy and join them into a two-column matrix, giving the set of grid points
    grid = np.c_[xx.ravel(),yy.ravel()]
    # Feed the grid points to the network; probs is the output
    probs = sess.run(y,feed_dict={x:grid})
    # Reshape probs to match xx
    probs = probs.reshape(xx.shape)
    # print("w1:\n",sess.run(w1))
    # print("b1:\n", sess.run(b1))
    # print("w2:\n", sess.run(w2))
plt.scatter(X[:,0],X[:,1],c = np.squeeze(Y_c))
plt.contour(xx,yy,probs,levels = [.5])
plt.show()

Figure: decision boundary without regularization
Figure: decision boundary with regularization
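The training loops above feed the network BATCH_SIZE samples per step. As a minimal sketch in plain Python, a batch window computed as i * BATCH_SIZE modulo the dataset size visits each of the 300 samples exactly once every 10 steps and then wraps around. The helper name batch_bounds and the constant DATASET_SIZE are hypothetical.

```python
BATCH_SIZE = 30
DATASET_SIZE = 300


def batch_bounds(step, batch_size=BATCH_SIZE, dataset_size=DATASET_SIZE):
    """Return the [start, end) slice bounds of the mini-batch for a step."""
    start = (step * batch_size) % dataset_size
    end = start + batch_size
    return start, end


# Ten consecutive steps cover the whole dataset, then the window wraps around.
windows = [batch_bounds(i) for i in range(11)]
print(windows[0], windows[9], windows[10])

covered = set()
for start, end in windows[:10]:
    covered.update(range(start, end))
print(len(covered))
```

This works cleanly here because 300 is a multiple of 30; for a dataset size that is not a multiple of the batch size, the last window before the wrap would be shorter.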
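2. A modular template for building neural networks
Forward propagation: build the complete network structure from input to output. Describing the forward pass requires three functions: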

def forward(x, regularizer):
	w=
	b=
	y=
	return y

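forward() defines the network structure, building the complete network from input to output and implementing the forward pass. Its argument x is the input and regularizer is the regularization weight; it returns the prediction or classification result y.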

def get_weight(shape, regularizer):   
	w = tf.Variable(    )  
	tf.add_to_collection('losses', tf.contrib.layers.l2_regularizer(regularizer)(w))  
	return w

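get_weight() sets up the parameter w. Its argument shape is the shape of w and regularizer is the regularization weight; it returns w. Inside, tf.Variable() gives w its initial value, and tf.add_to_collection() adds the regularization loss of w to the collection named losses, which is later summed into the total loss.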

def get_bias(shape):       
	b = tf.Variable(    )      
	return b

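get_bias() sets up the parameter b. Its argument shape is the shape of b; it returns b. Inside, tf.Variable() gives b its initial value.
Backward propagation: train the network and optimize its parameters to improve the model's accuracy.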

def backward( ):
	x = tf.placeholder( )
	y_ = tf.placeholder( )
	y = forward.forward(x, REGULARIZER)
	global_step = tf.Variable(0, trainable=False)
	loss =

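In backward(), placeholder() reserves slots for the dataset x and the ground-truth labels y_, forward.forward() builds the forward-propagation network, and global_step counts the training steps and is marked as non-trainable.
When training a network, three techniques are commonly used together to optimize the model: regularization, an exponentially decaying learning rate, and a moving average of the parameters.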
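Of these three techniques, the moving average is not used in the code in this post. As a minimal sketch in plain Python, assuming the update rule documented for tf.train.ExponentialMovingAverage in TF 1.x when num_updates is supplied: each parameter keeps a shadow value updated as shadow = decay * shadow + (1 - decay) * value, with decay = min(base decay, (1 + num_updates) / (10 + num_updates)). The helper name ema_update and the constant MOVING_AVERAGE_DECAY are hypothetical.

```python
def ema_update(shadow, value, base_decay, num_updates):
    """One moving-average step with the warm-up decay rule."""
    decay = min(base_decay, (1.0 + num_updates) / (10.0 + num_updates))
    return decay * shadow + (1.0 - decay) * value


MOVING_AVERAGE_DECAY = 0.99  # hypothetical setting for illustration

shadow = 0.0  # the shadow value starts equal to the parameter's initial value
w = 1.0       # the parameter then moves to 1.0, as if updated by training
shadow = ema_update(shadow, w, MOVING_AVERAGE_DECAY, num_updates=0)
print(shadow)
```

Early in training the effective decay is small, so the shadow value tracks the parameter closely; as num_updates grows the decay approaches the base value and the shadow value changes more smoothly.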
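For example, the code from the previous example can be restructured into three modules: dataset generation (generateds.py), forward propagation (forward.py), and backward propagation (backward.py).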

① Dataset generation module (generateds.py):

# coding:utf-8
import tensorflow as tf
import numpy as np

seed = 2
def generateds():
    rdm = np.random.RandomState(seed)
    X = rdm.randn(300, 2)
    Y_ = [int(x0 * x0 + x1 * x1 < 2) for (x0, x1) in X]
    Y_c = [['red' if y else 'blue'] for y in Y_]
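    # Reshape the dataset X and the labels Y_. A first dimension of -1 means it is inferred
    # from the other dimension; the second value is the number of columns, so X becomes n rows by 2 columns.
    # np.vstack stacks the arrays vertically into one array, which is then reshaped to the target dimensions.
    X = np.vstack(X).reshape(-1, 2)
    # Reshape Y_ into n rows by 1 column
    Y_ = np.vstack(Y_).reshape(-1, 1)
    # print(X,'\n',Y_,'\n',Y_c)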
    return (X, Y_, Y_c)

② Forward propagation module (forward.py):

#coding:utf-8
import tensorflow as tf

# Define the network's inputs, parameters, and outputs and the forward pass, adding the regularization term
def get_weight(shape,regularizer):
    w = tf.Variable(tf.random_normal(shape),dtype=tf.float32)
    tf.add_to_collection('losses',tf.contrib.layers.l2_regularizer(regularizer)(w))
    return w

def get_bias(shape):
    b = tf.Variable(tf.constant(0.01,shape=shape))
    return b

def forward(x, regularizer):
	w1 = get_weight([2,11],regularizer)
	b1 = get_bias([11])
	y1 = tf.nn.relu(tf.matmul(x,w1) +b1)
	
	w2 = get_weight([11,1],regularizer)
	b2 = get_bias([1])
	y = tf.matmul(y1,w2) +b2    # no activation on the output layer
	
	return y
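To make the shapes in forward() concrete, here is a minimal plain-Python sketch of the same 2 → 11 → 1 structure for a single sample: a ReLU hidden layer followed by a linear output layer. The helper names relu and dense and the fixed weight values are hypothetical; the real module initializes w randomly with tf.random_normal.

```python
def relu(v):
    """Element-wise max(0, a)."""
    return [max(0.0, a) for a in v]


def dense(x, w, b):
    """Compute x @ w + b for one sample x; w has one row per input."""
    return [sum(xi * w[i][j] for i, xi in enumerate(x)) + b[j]
            for j in range(len(b))]


def forward(x, w1, b1, w2, b2):
    """Two-layer network: ReLU hidden layer, linear output layer."""
    y1 = relu(dense(x, w1, b1))
    return dense(y1, w2, b2)


# Hypothetical fixed weights: 2 inputs -> 11 hidden units -> 1 output.
w1 = [[0.1] * 11, [-0.2] * 11]
b1 = [0.01] * 11
w2 = [[0.5] for _ in range(11)]
b2 = [0.01]

y = forward([1.0, 0.25], w1, b1, w2, b2)
print(len(y), y[0])
```

With an input whose hidden pre-activations are all negative, such as [0.0, 1.0] under these weights, every hidden unit is clipped to zero and the output reduces to the output bias.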

③ Backward propagation module (backward.py):

# coding:utf-8
# Step 0: import modules and generate the simulated dataset
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import generateds
import forward

BATCH_SIZE = 30
STEPS = 40000
LEARNING_RATE_BASE = 0.001
LEARNING_RATE_DECAY = 0.999
REGULARIZER = 0.01

def backward():
    x = tf.placeholder(tf.float32, shape=(None, 2))
    y_ = tf.placeholder(tf.float32, shape=(None, 1))

    X, Y_, Y_c = generateds.generateds()
    y = forward.forward(x, REGULARIZER)
    global_step = tf.Variable(0, trainable=False)
    learning_rate = tf.train.exponential_decay(LEARNING_RATE_BASE, global_step, 300 / BATCH_SIZE,
                                               LEARNING_RATE_DECAY,staircase=True)
    # Define the loss functions
    loss_mse = tf.reduce_mean(tf.square(y-y_))
    loss_total = loss_mse + tf.add_n(tf.get_collection('losses'))

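    # Define the backpropagation method; passing global_step lets the optimizer
    # increment it on every step, which is what makes the learning rate decay
    train_step =tf.train.AdamOptimizer(learning_rate).minimize(loss_total, global_step=global_step)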
    with tf.Session() as sess:
        init_op = tf.global_variables_initializer()
        sess.run(init_op)
        for i in range(STEPS):
            # Step through the 300 samples one batch at a time
            start = (i * BATCH_SIZE) % 300
            end = start + BATCH_SIZE
            sess.run(train_step, feed_dict={x: X[start:end], y_: Y_[start:end]})
            if i % 2000 == 0:
                loss_total_v = sess.run(loss_total, feed_dict={x: X, y_: Y_})
                print("After %d steps,loss is : %f" % (i, loss_total_v))
        # After training, evaluate the model on a grid to draw the decision boundary
        xx, yy = np.mgrid[-3:3:0.01, -3:3:0.01]
        grid = np.c_[xx.ravel(), yy.ravel()]
        probs = sess.run(y, feed_dict={x: grid})
        probs = probs.reshape(xx.shape)

    plt.scatter(X[:, 0], X[:, 1], c=np.squeeze(Y_c))
    plt.contour(xx, yy, probs, levels=[.5])
    plt.show()

if __name__ =='__main__':
    backward()
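backward.py passes LEARNING_RATE_BASE, LEARNING_RATE_DECAY, and a decay interval of 300 / BATCH_SIZE steps to tf.train.exponential_decay. As a minimal sketch in plain Python of the schedule that function documents, base_rate * decay_rate ^ (global_step / decay_steps), with the exponent floored when staircase=True: the standalone function below is a hypothetical re-implementation for illustration, and it assumes global_step is incremented once per training step.

```python
import math


def exponential_decay(base_rate, global_step, decay_steps, decay_rate,
                      staircase=False):
    """base_rate * decay_rate ** (global_step / decay_steps)."""
    exponent = global_step / decay_steps
    if staircase:
        exponent = math.floor(exponent)  # the rate drops in discrete steps
    return base_rate * decay_rate ** exponent


LEARNING_RATE_BASE = 0.001
LEARNING_RATE_DECAY = 0.999
DECAY_STEPS = 300 / 30  # dataset size / BATCH_SIZE: one pass over the data

for step in (0, 9, 10, 40000):
    lr = exponential_decay(LEARNING_RATE_BASE, step, DECAY_STEPS,
                           LEARNING_RATE_DECAY, staircase=True)
    print(step, lr)
```

With staircase=True the rate stays at the base value for the first 10 steps and is multiplied by 0.999 after each full pass over the 300 samples.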

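Note: this post summarizes what I learned from the course 《人工智能实践：Tensorflow笔记》 (AI Practice: TensorFlow Notes):
https://www.icourse163.org/learn/PKU-1002536002#/learn/content
Many thanks to Professor Cao Jian (曹健) for teaching it. Please credit the source when reposting.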
