TensorFlow Study Notes (8): Avoiding Overfitting, Regularization, and Other Methods

柠檬巧克力、

Tags: tensorflow, deep learning, machine learning

Article link: https://blog.csdn.net/qq_35535616/article/details/107207091

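I. Regularization

This section assumes some background in numerical methods, in particular norms.
1. The 2-norm. For a matrix A, the 2-norm (spectral norm) is the square root of the largest eigenvalue of the product of the conjugate transpose of A with A. For a vector, the 2-norm is the square root of the sum of the squared elements, i.e. its straight-line (Euclidean) length. Applied to the difference of two vectors, it gives the straight-line distance between two points, like the straight-line distance between two squares on a chessboard.
2. The 1-norm of a vector is the sum of the absolute values of its elements.
3. The infinity norm of a vector is the largest absolute value among its elements.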
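The three norms can be checked numerically. The following is a minimal sketch using NumPy's `np.linalg.norm`; the vector `v` and matrix `A` are arbitrary values chosen for illustration and do not come from the notes.

```python
import numpy as np

# Example vector (arbitrary values chosen for illustration).
v = np.array([3.0, -4.0])

l1 = np.linalg.norm(v, 1)         # sum of absolute values: 7.0
l2 = np.linalg.norm(v, 2)         # square root of the sum of squares: 5.0
linf = np.linalg.norm(v, np.inf)  # largest absolute value: 4.0

# Matrix 2-norm (spectral norm): square root of the largest eigenvalue of A^H A.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
spectral = np.linalg.norm(A, 2)
from_eigs = np.sqrt(np.max(np.linalg.eigvalsh(A.conj().T @ A)))

print(l1, l2, linf)
print(np.isclose(spectral, from_eigs))
```

Note that the infinity norm of `v` is 4.0 even though its largest signed element is 3.0, because the norm is taken over absolute values.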

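1. Regularization

Let the loss function and the regularization term be, respectively:

$$J(\theta), \qquad R(w)$$

The objective optimized after introducing regularization is then:

$$J(\theta) + \lambda R(w)$$

Here λ is a value chosen in advance that controls how strongly we prefer weights with a small norm (the larger λ is, the smaller the preferred norm).
L1 regularization is:

$$R(w) = \lVert w \rVert_1 = \sum_i \lvert w_i \rvert$$

L2 regularization is:

$$R(w) = \lVert w \rVert_2^2 = \sum_i w_i^2$$

Properties of the two:
① L1 regularization makes the parameters sparser, i.e. more of them become exactly 0.
② The L1 formula is not differentiable (at w_i = 0), whereas the L2 formula is differentiable everywhere.
As a result, computing the partial derivatives of a loss function with L2 regularization is simpler during optimization. The two can also be used together.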
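One way to see why L1 yields sparser parameters is to compare a single shrinkage (proximal) step under each penalty. This is only an illustrative sketch and not part of the original notes: the helper names `prox_l1` and `prox_l2`, the threshold `t`, and the sample weights are all hypothetical.

```python
def prox_l1(w, t):
    """Soft-thresholding: per-element minimizer of 0.5*(x - w)**2 + t*|x|."""
    out = []
    for x in w:
        shrunk = max(abs(x) - t, 0.0)
        out.append(shrunk if x > 0 else -shrunk)
    return out


def prox_l2(w, t):
    """Per-element minimizer of 0.5*(x - w)**2 + t*x**2: a uniform rescaling."""
    return [x / (1.0 + 2.0 * t) for x in w]


weights = [0.05, -0.3, 1.2, -0.02]  # hypothetical weights
t = 0.1                             # hypothetical penalty strength

after_l1 = prox_l1(weights, t)
after_l2 = prox_l2(weights, t)

print("L1 zeros:", sum(1 for x in after_l1 if x == 0))
print("L2 zeros:", sum(1 for x in after_l2 if x == 0))
```

With the L1 penalty, every weight whose magnitude is below `t` is set exactly to zero. The L2 penalty only scales all weights by the same factor, so none of them reaches zero.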

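Using L2 regularization (TensorFlow 1.x):

tf.contrib.layers.l2_regularizer(scale)

where scale is the regularization coefficient λ.

In TensorFlow 2.0, tf.contrib no longer exists. The regularizers are available under tf.keras.regularizers, and model-building functionality is organized around the following classes:

tf.keras.layers.Layer
tf.keras.Model
tf.Module

Regularization example: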

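import tensorflow as tf

tf.compat.v1.disable_eager_execution()  # use TF1-style graphs and sessions

weights = tf.constant([[1.0, 2.0], [3.0, 4.0]])
regularizer_l2 = tf.keras.regularizers.l2(0.5)
regularizer_l1 = tf.keras.regularizers.l1(0.5)

with tf.compat.v1.Session() as sess:
    # Keras l2 returns 0.5 * sum(w^2); dividing by 2 matches the halved form
    # used by the old tf.contrib.layers.l2_regularizer.
    print(sess.run(regularizer_l2(weights)) / 2)
    # Keras l1 returns 0.5 * sum(|w|).
    print(sess.run(regularizer_l1(weights)))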

The output is as follows (TensorFlow startup log messages omitted):
......
7.5
5.0
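The two printed values can be reproduced by hand. The sketch below repeats the arithmetic in plain Python, assuming that the Keras L2 regularizer returns λ times the sum of squares and the L1 regularizer returns λ times the sum of absolute values; the variable names are illustrative only.

```python
weights = [[1.0, 2.0], [3.0, 4.0]]
lam = 0.5  # same coefficient as in the example above

flat = [w for row in weights for w in row]
sum_sq = sum(w * w for w in flat)    # 1 + 4 + 9 + 16 = 30
sum_abs = sum(abs(w) for w in flat)  # 1 + 2 + 3 + 4 = 10

l2_value = lam * sum_sq / 2  # 0.5 * 30 / 2 = 7.5
l1_value = lam * sum_abs     # 0.5 * 10 = 5.0
print(l2_value, l1_value)
```

The division by 2 in the L2 case is the same halving that the example applies to the Keras result.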

2. A regularized loss function (mean squared error in this example)

import numpy as np
import tensorflow as tf
import tensorflow.compat.v1 as tf1

tf1.disable_eager_execution()  # use TF1-style graphs and sessions

training_steps = 30000  # number of training steps

# Build 200 samples: draw a point (x1, x2) from uniform distributions, label it 0
# if it falls inside the circle of radius 1 and 1 otherwise, then add Gaussian noise.
data = []
label = []
for i in range(200):
    x1 = np.random.uniform(-1, 1)
    x2 = np.random.uniform(0, 2)
    label.append(0 if x1**2 + x2**2 <= 1 else 1)
    data.append([np.random.normal(x1, 0.1), np.random.normal(x2, 0.1)])

# hstack accepts a tuple, list, or NumPy array and returns a NumPy array;
# reshape then turns the 1-D result into the required 2-D shape.
data = np.hstack(data).reshape(-1, 2)
label = np.hstack(label).reshape(-1, 1)


# Forward pass: two ReLU hidden layers followed by a linear output layer.
def hidden_layer(input_tensor, weight1, bias1, weight2, bias2, weight3, bias3):
    layer1 = tf.nn.relu(tf.matmul(input_tensor, weight1) + bias1)
    layer2 = tf.nn.relu(tf.matmul(layer1, weight2) + bias2)
    return tf.matmul(layer2, weight3) + bias3


x = tf1.placeholder(tf.float32, shape=(None, 2), name="x-input")
y_ = tf1.placeholder(tf.float32, shape=(None, 1), name="y-output")

# Weights and biases.
weight1 = tf.Variable(tf1.truncated_normal([2, 10], stddev=0.1))
bias1 = tf.Variable(tf.constant(0.1, shape=[10]))
weight2 = tf.Variable(tf1.truncated_normal([10, 10], stddev=0.1))
bias2 = tf.Variable(tf.constant(0.1, shape=[10]))
weight3 = tf.Variable(tf1.truncated_normal([10, 1], stddev=0.1))
bias3 = tf.Variable(tf.constant(0.1, shape=[1]))  # one output unit, so one bias

sample_size = len(data)  # number of samples

y = hidden_layer(x, weight1, bias1, weight2, bias2, weight3, bias3)  # network output

# Custom loss: mean squared error between the prediction and the label.
error_loss = tf.reduce_sum(tf.pow(y_ - y, 2)) / sample_size
tf1.add_to_collection("losses", error_loss)

# L2 regularization of the weights, halved as in the earlier example.
# add_to_collection(name, element) appends element to the collection `name`.
regularizer = tf.keras.regularizers.l2(0.01)
regularization = (regularizer(weight1) + regularizer(weight2) + regularizer(weight3)) / 2
tf1.add_to_collection("losses", regularization)

# get_collection(name) returns the collection `name` as a list; add_n sums its items.
loss = tf.add_n(tf1.get_collection("losses"))

# Adam optimizer with a learning rate of 0.01.
train_op = tf1.train.AdamOptimizer(0.01).minimize(loss)

with tf1.Session() as sess:
    tf1.global_variables_initializer().run()
    for i in range(training_steps):
        sess.run(train_op, feed_dict={x: data, y_: label})
        if i % 2000 == 0:
            loss_value = sess.run(loss, feed_dict={x: data, y_: label})
            print("After %d steps, loss: %f" % (i, loss_value))
            print(weight1.eval(), weight2.eval(), weight3.eval(),
                  bias1.eval(), bias2.eval(), bias3.eval())
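
To make the composition of the total loss explicit, here is a rough NumPy restatement of the forward pass and of the sum stored in the "losses" collection. It is a sketch under assumptions, not the TensorFlow graph itself: the names `relu`, `forward`, and `total_loss`, the fixed seed, and the random inputs are hypothetical, and the parameter shapes follow the network above with a single output bias.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility


def relu(z):
    return np.maximum(z, 0.0)


def forward(x, params):
    """Two ReLU hidden layers followed by a linear output layer."""
    w1, b1, w2, b2, w3, b3 = params
    h1 = relu(x @ w1 + b1)
    h2 = relu(h1 @ w2 + b2)
    return h2 @ w3 + b3


def total_loss(x, y_true, params, lam=0.01):
    """Mean squared error plus (lam / 2) * sum of squared weights (biases excluded)."""
    w1, _, w2, _, w3, _ = params
    mse = np.sum((y_true - forward(x, params)) ** 2) / len(x)
    l2_penalty = lam * sum(np.sum(w ** 2) for w in (w1, w2, w3)) / 2
    return mse + l2_penalty, mse, l2_penalty


# Hypothetical parameters: weights of shape 2x10, 10x10, 10x1 and matching biases.
params = [
    rng.normal(0, 0.1, (2, 10)), np.full(10, 0.1),
    rng.normal(0, 0.1, (10, 10)), np.full(10, 0.1),
    rng.normal(0, 0.1, (10, 1)), np.full(1, 0.1),
]
x = rng.uniform(-1, 1, (5, 2))
y_true = rng.integers(0, 2, (5, 1)).astype(float)

loss, mse, penalty = total_loss(x, y_true, params)
print(loss, mse, penalty)
```

Only the weight matrices enter the penalty, mirroring the code above, where the regularizer is applied to weight1, weight2, and weight3 but not to the biases.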
