# Machine Learning Notes: Regularization Terms

Some of these concepts have broader definitions in mathematics; interested readers can consult the relevant literature. This note covers only the definitions used in machine learning, so the treatment may not be fully rigorous.

# The L-p Norm

The L-p norm is defined as
$||x||_{p} = (\sum_{i}{|x_{i}|^{p}})^{\frac{1}{p}}$
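A minimal NumPy sketch of this definition (the function name `lp_norm` is ours, chosen for illustration):

```python
import numpy as np

def lp_norm(x, p):
    """L-p norm straight from the definition: (sum_i |x_i|^p)^(1/p)."""
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

x = np.array([3.0, -4.0])
print(lp_norm(x, 2))  # 5.0, the familiar Euclidean length of (3, -4)
print(lp_norm(x, 1))  # 7.0
```

For integer `p` this agrees with `np.linalg.norm(x, ord=p)`.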

## The L0 Norm

The L0 "norm" is the number of nonzero entries in the parameter vector (strictly speaking it is not a true norm, since it is not homogeneous):
$||x||_{0} = \sum_{i}{I(x_{i} \neq 0)}$

## The L1 Norm

The L1 norm is the sum of the absolute values of the entries of the parameter vector:
$||x||_{1} = \sum_{i}{|x_{i}|}$

## The L2 Norm

The L2 norm is the Euclidean length of the parameter vector:
$||x||_{2} = \sqrt{\sum_{i}{x_{i}^{2}}}$
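The three special cases above, computed side by side on a small example vector of our choosing:

```python
import numpy as np

w = np.array([0.0, -2.0, 0.0, 1.5])

l0 = np.count_nonzero(w)      # number of nonzero entries: 2
l1 = np.sum(np.abs(w))        # 2.0 + 1.5 = 3.5
l2 = np.sqrt(np.sum(w ** 2))  # sqrt(4 + 2.25) = 2.5
```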

# Differences Between Regularization Terms

## L1 Regularization

$Loss = C_{0} + \frac{\lambda}{n} ||w||_{1}$

$\frac{\partial Loss}{\partial w_{i}} = \frac{\partial C_{0}}{\partial w_{i}} + \frac{\lambda}{n} \cdot \frac{\partial \sum_{j}{|w_{j}|}}{\partial w_{i}} \\ = \frac{\partial C_{0}}{\partial w_{i}} + \frac{\lambda}{n} \mathrm{sign}(w_{i})$

$w_{i} := w_{i} - \eta (\frac{\partial C_{0}}{\partial w_{i}} + \frac{\lambda}{n} \mathrm{sign}(w_{i}))$

Because the absolute value function is not differentiable at zero, the L1 penalty is not differentiable everywhere; this has motivated a considerable amount of additional theory (e.g., subgradient methods).
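The L1 update rule above can be sketched as a single NumPy step. This is our illustration, assuming `grad_C0` is the gradient of the unregularized loss $C_0$ and `eta`, `lam`, `n` stand for $\eta$, $\lambda$, $n$ in the formulas:

```python
import numpy as np

def l1_step(w, grad_C0, eta, lam, n):
    """One (sub)gradient step with L1 regularization:
    w := w - eta * (dC0/dw + (lam/n) * sign(w)).
    Note np.sign(0) == 0, one valid subgradient choice at the kink."""
    return w - eta * (grad_C0 + (lam / n) * np.sign(w))
```

With `grad_C0 = 0`, each step shrinks every nonzero weight by the constant `eta * lam / n` toward zero, which is why L1 tends to produce sparse weights.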

## L2 Regularization

$Loss = C_{0} + \frac{\lambda}{2 n} ||w||_{2}^{2}$

$\frac{\partial Loss}{\partial w_{i}} = \frac{\partial C_{0}}{\partial w_{i}} + \frac{\lambda}{2n} \cdot \frac{\partial \sum_{j}{w_{j}^{2}}}{\partial w_{i}} \\ = \frac{\partial C_{0}}{\partial w_{i}} + \frac{\lambda}{n} w_{i}$

$w_{i} := w_{i} - \eta (\frac{\partial C_{0}}{\partial w_{i}} + \frac{\lambda}{n} w_{i}) \\ = (1 - \frac{\eta \lambda}{n}) w_{i} - \eta \frac{\partial C_{0}}{\partial w_{i}}$
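The rewritten L2 update is exactly the "weight decay" form: each step first scales the weights by a factor slightly below 1, then applies the ordinary gradient step. A minimal sketch with the same assumed names as before (`grad_C0` for $\partial C_0 / \partial w$, `eta`, `lam`, `n` for $\eta$, $\lambda$, $n$):

```python
import numpy as np

def l2_step(w, grad_C0, eta, lam, n):
    """One gradient step with L2 regularization, in weight-decay form:
    w := (1 - eta*lam/n) * w - eta * dC0/dw."""
    return (1 - eta * lam / n) * w - eta * grad_C0
```

Unlike the L1 step, the shrinkage here is multiplicative, so weights decay toward zero but are never snapped exactly to zero by the penalty alone.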