L1 Regularization of the Loss Function and Sparsity

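To reduce overfitting, machine learning algorithms often add a penalty term to the loss function; this is L1/L2 regularization. The objective that is finally optimized is therefore: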
f(x) = L(x) + C*Reg(x) , C > 0
This post covers only L1 regularization, so Reg(x) = |x|
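L(x) and Reg(x) are both continuous, so f(x) is continuous. Reg(x) is convex, and assuming the loss L(x) is convex as well, f(x) is convex too. A convex f(x) has no local minimum that is not global, and a minimizer exists whenever L(x) is bounded below, because C|x| then grows without bound. Assuming L(x) is also differentiable, f(x) fails to be differentiable only at x=0, where |x| has a kink.
At x=0 the left derivative of f(x) is L'(0) - C and the right derivative is L'(0) + C. If the two have opposite signs, x=0 is the minimizer of f(x), that is
(L'(0) + C)(L'(0) - C) < 0, which gives C > |L'(0)|
The boundary case C = |L'(0)| also puts the minimizer at 0, so for convex f(x) the condition C >= |L'(0)| is both necessary and sufficient.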
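The condition above can be checked mechanically. The following is a minimal sketch, not taken from the original post: the function names are made up for illustration, and it assumes L(x) is convex and differentiable at 0 so that the one-sided derivatives of f(x) are L'(0) - C and L'(0) + C.

```python
def one_sided_derivatives_at_zero(dL0, C):
    """Left and right derivatives of f(x) = L(x) + C*|x| at x = 0.

    dL0 stands for L'(0); L is assumed differentiable at 0.
    """
    return dL0 - C, dL0 + C


def zero_is_minimizer(dL0, C):
    """For convex f, x = 0 minimizes f iff left derivative <= 0 <= right derivative."""
    left, right = one_sided_derivatives_at_zero(dL0, C)
    return left <= 0 <= right


if __name__ == "__main__":
    # Illustrative values only.
    print(zero_is_minimizer(-2.0, 3.0))  # True:  |L'(0)| = 2 <= C = 3
    print(zero_is_minimizer(4.0, 3.0))   # False: |L'(0)| = 4 >  C = 3
```

The check is equivalent to C >= |L'(0)|: the larger the penalty weight C, the more loss functions end up with their minimizer pinned exactly at 0, which is where the sparsity comes from.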

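Here is an example:
The objective function is f(x,y) = (x-1)^2+(y+2)^2 + 3.0(|x|+|y|)
We look for the minimizer with gradient descent (strictly, subgradient descent with a fixed step size, since |x| has no derivative at 0; there is nothing stochastic in it). The code is as follows: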

#!/usr/bin/env python3
# -*- coding: utf-8 -*-


def sign(x):
    """Sign of x; returning 0 at x == 0 is a valid subgradient of |x| there."""
    if x > 0:
        return 1
    elif x < 0:
        return -1
    else:
        return 0


learning_rate = 0.001
C = 3.0


def solution(derive, init_pnt):
    """Run 10000 fixed-step updates; derive[d] is the derivative for coordinate d."""
    dim_num = len(init_pnt)
    pnt = list(init_pnt)
    print(pnt)
    for i in range(10000):
        deri_arr = []
        for d in range(dim_num):
            # The derivative is evaluated at the current point, before the update.
            deri = derive[d](pnt[d])
            deri_arr.append(deri)
            pnt[d] = pnt[d] - learning_rate * deri
        # Updated point, followed by the derivatives that produced this update.
        print(pnt, deri_arr)
    return pnt


if __name__ == "__main__":
    # f(x,y) = (x - 1)**2 + (y + 2)**2 + C*(|x| + |y|)
    derive = [lambda x: 2*(x-1)+C*sign(x), lambda y: 2*(y+2)+C*sign(y)]
    init_pnt = [5, 5]
    point = solution(derive, init_pnt)

    # %.12g prints 12 significant digits, as Python 2's "%s" did for floats.
    print("minimum point of f(x,y) = (x-1)^2+(y+2)^2 + %s(|x|+|y|) : (%.12g,%.12g)" % (C, point[0], point[1]))

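After 10000 iterations, the last ten lines of output are as follows (each line shows the updated point [x, y], then the derivatives [df/dx, df/dy] used for that update):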
[0.004361948510801818, -0.4999999975259809] [-5.001278660298994, 4.957954047313251e-09]
[0.0033532246137802147, -0.49999999753092894] [1.0087238970216037, 4.948038423435719e-09]
[0.002346518164552654, -0.4999999975358671] [1.0067064492275604, 4.93814233948342e-09]
[0.0013418251282235488, -0.4999999975407954] [1.0046930363291053, 4.928265795456355e-09]
[0.0003391414779671017, -0.4999999975457138] [1.0026836502564471, 4.9184092354437325e-09]
[-0.0006615368049888324, -0.49999999755062235] [1.0006782829559342, 4.908572215356344e-09]
[0.004339786268621145, -0.4999999975555211] [-5.001323073609978, 4.898755179283398e-09]
[0.0033311066960839027, -0.4999999975604101] [1.0086795725372424, 4.888957683135686e-09]
[0.002324444482691735, -0.49999999756528923] [1.0066622133921679, 4.8791797269132076e-09]
[0.0013197955937263512, -0.49999999757015867] [1.0046488889653835, 4.869421310615962e-09]
minimum point of f(x,y) = (x-1)^2+(y+2)^2 + 3.0(|x|+|y|) : (0.00131979559373,-0.49999999757)

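In this example, at (x,y)=(0,0), the products of the left and right partial derivatives are:
f'_-(x) f'_+(x) = (-2-3)(-2+3) < 0
f'_-(y) f'_+(y) = (4-3)(4+3) > 0
So the minimizer of f(x,y) is 0 in the x dimension and nonzero in the y dimension. The iterations above reflect this: x stays close to 0, while y settles near -0.5, the point where 2(y+2) - 3 = 0, well away from 0. Looking at the x derivative (the first entry of the second list on each output line), it goes from positive to negative and back to positive. Its sign keeps flipping as x crosses 0, which shows the minimizer is right there. With a fixed step size, x hovers around 0 without landing on it exactly.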
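As a cross-check on the iterative result, here is a minimal sketch that is not part of the original code. It relies on the closed-form minimizer of a one-dimensional quadratic plus an L1 term, and assumes each coordinate has the form (x - a)^2 + C|x|, which holds in this example with a = 1 for x and a = -2 for y. The helper names `soft_threshold` and `objective` are made up for illustration.

```python
def soft_threshold(a, C):
    """Minimizer of (x - a)**2 + C*|x| over real x.

    Setting 2*(x - a) + C*sign(x) = 0 gives x = a - C/2 for a > C/2 and
    x = a + C/2 for a < -C/2; otherwise the minimizer is exactly 0.
    """
    if abs(a) <= C / 2.0:
        return 0.0
    return a - C / 2.0 if a > 0 else a + C / 2.0


def objective(x, y, C=3.0):
    """The example objective f(x,y) = (x-1)^2 + (y+2)^2 + C(|x|+|y|)."""
    return (x - 1) ** 2 + (y + 2) ** 2 + C * (abs(x) + abs(y))


if __name__ == "__main__":
    C = 3.0
    x_star = soft_threshold(1.0, C)   # x coordinate: a = 1
    y_star = soft_threshold(-2.0, C)  # y coordinate: a = -2
    print(x_star, y_star)             # 0.0 -0.5
    print(objective(x_star, y_star))  # 4.75
    # Final iterate reported by the descent run above, for comparison.
    print(objective(0.0013197955937263512, -0.49999999757015867))
```

The closed form returns exactly 0 for x, whereas the fixed-step iteration only gets within a few thousandths of it. In practice this is why L1 solvers tend to use a thresholding step of this kind rather than plain subgradient updates when exact zeros are wanted.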
