Large Margin for Face Recognition

The recently proposed angular margin achieves quite good results on the LFW database.

I re-read the large-margin paper, and the two approaches feel cut from the same cloth.

The idea is the same: take two overlapping class regions and separate them as much as possible.

GitHub implementation of large margin:

https://github.com/wy1iu/LargeMargin_Softmax_Loss

Based on this, the ICCV 2017 angular margin can be implemented.

Paper: http://jmlr.org/proceedings/papers/v48/liud16.pdf
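
For quick reference, here is my own transcription of the loss (treat it as a sketch and check it against the paper): for a sample $x$ with label $y$, L-Softmax keeps the usual logits $\|W_j\|\|x\|\cos(\theta_j)$ for the other classes but replaces the target-class logit with $\|W_y\|\|x\|\,\psi(\theta_y)$, where

$$
\psi(\theta) = (-1)^k \cos(m\theta) - 2k, \qquad \theta \in \left[\tfrac{k\pi}{m}, \tfrac{(k+1)\pi}{m}\right], \quad k = 0, 1, \dots, m-1 .
$$

With $m > 1$, the feature has to form a much smaller angle to its own class weight than plain softmax would require, which is exactly the "separate the overlapping regions" idea above. As far as I understand, the ICCV 2017 angular margin (A-Softmax / SphereFace) uses the same $\psi$ but additionally normalizes the class weights to $\|W_j\| = 1$ and drops the bias, which is why this code base is a natural starting point for implementing it.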

Now let's look at what the parameters in the author's large-margin implementation actually mean:

layer {
  name: "ip2"
  type: "LargeMarginInnerProduct"
  bottom: "bn_ip"
  bottom: "label"
  top: "ip2"
  top: "lambda"
  param {
    name: "ip2"
    lr_mult: 1
  }
  largemargin_inner_product_param {
    num_output: 10            # MNIST has ten classes
    type: QUADRUPLE       
    base: 1000
    gamma: 0.000025
    power: 35
    iteration: 0
    lambda_min: 0  
    weight_filler {
      type: "msra"
    }
  }
  include {
    phase: TRAIN
  }
}
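
Before going through the author's notes, here is a minimal NumPy sketch (my own illustration, not code from the repository) of the margin function $\psi$ selected by the type field; QUADRUPLE in the config above corresponds to m = 4:

import numpy as np

def psi(theta, m):
    """Margin function applied to the target class: (-1)^k * cos(m*theta) - 2k
    on the segment [k*pi/m, (k+1)*pi/m]; plain softmax would use cos(theta)."""
    k = np.minimum(np.floor(theta * m / np.pi), m - 1)  # segment index, clipped at the last piece
    return ((-1.0) ** k) * np.cos(m * theta) - 2.0 * k

theta = np.linspace(0.0, np.pi, 7)
print(np.round(np.cos(theta), 3))   # ordinary softmax logit factor, from 1 down to -1
print(np.round(psi(theta, 4), 3))   # m = 4 (QUADRUPLE): drops much faster, so the target class must be matched far more tightly
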
The author's explanation of these parameters:

  • L-Softmax loss is the combination of "LargeMarginInnerProduct" layer and "SoftmaxWithLoss" layer. 
  • If the type of the layer is SINGLE/DOUBLE/TRIPLE/QUADRUPLE, then m is set as 1/2/3/4 respectively.
  • The mnist example can be run directly after compilation. cifar10 and cifar10+ require the datasets to be downloaded first.
  • base, gamma, power and lambda_min are parameters for exponential lambda descent. 
  • lambda represents the approximation level to the proposed L-Softmax loss (refer to the experimental details in the ICML'16 paper). lambda will be decreased by the equation: lambda = max(lambda_min,base*(1+gamma*iteration)^(-power)). 
  • It is strongly recommended that the user visualize the lambda descent function before using the loss (see the small sketch after this list). The parameter selection is very flexible. Typically, when the optimization is finished, lambda should be a sufficiently small value. Also note that lambda is not always necessary. For the MNIST dataset, the L-Softmax loss can work perfectly without lambda. Setting base to 0 removes lambda.
  • lambda_min can vary according to the difficulty of the dataset. For easy datasets such as mnist and cifar10, lambda_min can be zero. For large and difficult datasets, you should first try setting lambda_min to 5 or 10. There is no specific rule for choosing lambda_min, but generally it should be as small as possible.
  • Both ReLU and PReLU work well with L-Softmax loss. Empirically, PReLU helps L-Softmax converge easier.
  • Batch normalization can help the L-Softmax network converge much more easily. It is strongly recommended to use it.
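
Following the recommendation above to visualize the lambda schedule before training, here is a small sketch (my own, plugging in the base / gamma / power / lambda_min values from the prototxt above). As I understand it, a large lambda keeps the layer close to ordinary softmax, and the full margin only takes effect once lambda has decayed:

def lam(iteration, base=1000.0, gamma=0.000025, power=35.0, lambda_min=0.0):
    """Exponential lambda descent quoted above:
    lambda = max(lambda_min, base * (1 + gamma * iteration) ** (-power))."""
    return max(lambda_min, base * (1.0 + gamma * iteration) ** (-power))

for it in (0, 2000, 10000, 30000, 60000, 100000):
    print(it, lam(it))  # lambda falls from 1000 toward lambda_min as training progresses
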


