# Deep Learning Regularization Series (1): Understanding Parameter Norm Penalties (L1 and L2 Regularization) and Their TensorFlow Implementation

#### 2. Common Regularization Methods

• Parameter norm penalties
• Norm penalties as constrained optimization
• Dataset augmentation
• Early stopping
• Dropout

The unregularized training objective (here, a squared-error loss):

$J\left(\theta; X, y\right)=\frac{1}{2}\sum_{i=1}^{n}\left(y_i-y_i^{\prime}\right)^2$

Adding a parameter norm penalty $\Omega\left(\theta\right)$, weighted by a hyperparameter $\alpha \ge 0$, gives the regularized objective:

$J\left(\theta; X, y\right)=\frac{1}{2}\sum_{i=1}^{n}\left(y_i-y_i^{\prime}\right)^2+\alpha\,\Omega\left(\theta\right)$

#### 3. L2 Regularization

$\Omega\left(\theta\right)=\frac{1}{2}\|w\|_2^2=\frac{1}{2}\sum_{i=1}^{n}w_i^2$

$\hat{J}\left(w;X,y\right)=\frac{\alpha}{2}w^{T}w+J\left(w;X,y\right)$

$\nabla_{w}\hat{J}\left(w;X,y\right)=\alpha w+\nabla_{w}J\left(w;X,y\right)$

$w \leftarrow w-\beta\left(\alpha w+\nabla_{w}J\left(w;X,y\right)\right)$

where $\beta$ is the gradient-descent step size. Expanding the update, we obtain:

$w \leftarrow \left(1-\beta\alpha\right)w-\beta\,\nabla_{w}J\left(w;X,y\right)$
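The update above shows that L2 regularization simply decays each weight by a constant factor $1-\beta\alpha$ before the usual gradient step, which is why it is also called weight decay. A minimal pure-Python sketch on a hypothetical one-dimensional least-squares problem (all names and values here are illustrative, not from the original):

```python
# Hypothetical 1-D least-squares problem: J(w) = 0.5 * (y - x * w)**2
x, y = 2.0, 3.0
alpha = 0.1   # L2 penalty strength
beta = 0.05   # gradient-descent step size

w = 0.0
for _ in range(200):
    grad_J = -(y - x * w) * x                    # gradient of the unregularized loss
    w = (1 - beta * alpha) * w - beta * grad_J   # weight-decay form of the update

# w converges to x*y / (x**2 + alpha) ~= 1.4634,
# pulled below the unregularized optimum y/x = 1.5
```

The penalized solution shrinks toward zero relative to the unregularized one, with the shrinkage controlled by $\alpha$.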

L2 regularization causes the learning algorithm to ''perceive'' the input $x$ as having higher variance, so the weights on features whose covariance with the output target is small (relative to this added variance) will shrink.

```python
import tensorflow as tf

tf.contrib.layers.l2_regularizer(
    scale,
    scope=None
)
"""
Returns a function that can be used to apply L2 regularization to weights.
Small values of L2 can help prevent overfitting the training data.
Args:
  scale: A scalar multiplier Tensor. 0.0 disables the regularizer.
  scope: An optional scope name.
"""

# Concrete usage
def get_weight(shape):
    return tf.Variable(tf.random_normal(shape), dtype=tf.float32)

def get_loss(shape, lambda_):  # `lambda` is a Python keyword, so use `lambda_`
    # y_ (targets) and cur_layer (network output) are assumed to be defined elsewhere
    var = get_weight(shape)
    loss = tf.reduce_mean(tf.square(y_ - cur_layer)) + tf.contrib.layers.l2_regularizer(lambda_)(var)
    return loss
```
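To make explicit what the loss above computes, here is a NumPy sketch of the same quantity (names and values are mine, not from the original; `tf.contrib.layers.l2_regularizer(scale)` returns a function computing `scale * sum(w**2) / 2`, the same halved convention as `tf.nn.l2_loss`):

```python
import numpy as np

def l2_regularizer(scale):
    # mirrors tf.contrib.layers.l2_regularizer: scale * sum(w**2) / 2
    def penalty(w):
        return scale * 0.5 * np.sum(w ** 2)
    return penalty

def get_loss(y_true, y_pred, w, lambda_):
    mse = np.mean((y_true - y_pred) ** 2)    # data term
    return mse + l2_regularizer(lambda_)(w)  # data term + L2 penalty

w = np.array([1.0, -2.0, 0.5])
loss = get_loss(np.array([1.0, 0.0]), np.array([0.5, 0.5]), w, 0.1)
# mse = 0.25, penalty = 0.1 * 0.5 * 5.25 = 0.2625, loss = 0.5125
```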

#### 4. L1 Regularization

$\Omega\left(\theta\right)=\|w\|_1=\sum_{i=1}^{n}|w_i|$

$\hat{J}\left(w;X,y\right)=\alpha\|w\|_1+J\left(w;X,y\right)$

$\nabla_{w}\hat{J}\left(w;X,y\right)=\alpha\,\mathrm{sign}\left(w\right)+\nabla_{w}J\left(w;X,y\right)$
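Unlike L2 decay, the $\alpha\,\mathrm{sign}(w)$ term pushes each weight toward zero by a constant amount regardless of its magnitude, which is why L1 regularization tends to produce sparse solutions. A small NumPy sketch of the subgradient update on a hypothetical quadratic loss (all values are illustrative):

```python
import numpy as np

alpha = 0.5   # L1 penalty strength
beta = 0.1    # step size

# Hypothetical data loss J(w) = 0.5 * ||w - t||**2, so grad J = w - t
t = np.array([2.0, 0.3, -1.5])
w = np.zeros(3)
for _ in range(500):
    grad = alpha * np.sign(w) + (w - t)   # L1 subgradient + data gradient
    w = w - beta * grad

# Weights with |t| > alpha settle at t - alpha * sign(t): w[0] ~= 1.5, w[2] ~= -1.0;
# w[1], whose target satisfies |t| < alpha, hovers near zero (the plain
# subgradient step oscillates around 0 rather than landing exactly on it)
```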

```python
tf.contrib.layers.l1_regularizer(
    scale,
    scope=None
)
# Usage mirrors the L2 code above, so it is not repeated here.
```