Face Recognition: Lightened CNN

NineDays66

Article link: https://blog.csdn.net/u011808673/article/details/80556545

Paper: https://arxiv.org/abs/1511.02683

Code: https://github.com/AlfredXiangWu/face_verification_experiment

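As discussed in the earlier post "Face Verification: DeepID", face verification comes down to two problems: extracting face features, and deciding whether two faces belong to the same person. Features can be extracted with traditional methods such as LBP or with deep learning methods such as DeepID. The same-person decision can be as simple as cosine similarity or as involved as Joint Bayesian. This post focuses on feature extraction, specifically on extracting features with the Lightened CNN.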

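Overview

To reach higher accuracy, deep learning methods tend toward deeper networks and ensembles of multiple models, which makes the models large and slow to run. The paper proposes a lightweight CNN that achieves fairly good accuracy with a simplified network structure, reducing both time and memory cost so that it can run on embedded and mobile devices.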

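Its strengths are a very small model and a very respectable recognition rate. The main reasons are:

(1) The authors use maxout as the activation function, which filters out noise while keeping the useful signal and so produces better feature maps; this is the MFM (Max-Feature-Map). It is a neat idea: I applied it to center_loss and got roughly a 0.5% improvement. This maxout is simply slice + eltwise. These two layers have two advantages: first, they add no trainable parameters; second, they cost almost no time. It feels like a free gain, since accuracy improves at essentially no cost.

(2) The authors use NIN (Network in Network) to reduce parameters and improve accuracy. Of the models they released, model A has no NIN layers and model B does; both were trained on CASIA, and model B is about 0.5% more accurate, at the cost of the extra parameters that the NIN layers add. Compared with other network architectures, though, using NIN still makes the model considerably smaller. The network structure in the paper corresponds to models B and C.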
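To make the parameter argument in (1) and (2) concrete, here is a small sketch that counts the parameters of one NIN pair from the prototxt in the appendix (conv2a, a 1×1 convolution, followed by conv2, a 3×3 convolution). The helper name `conv_params` is made up for illustration, and the input channel counts assume that each preceding slice + eltwise pair halves the channels.

```python
def conv_params(in_channels, out_channels, kernel_size):
    """Weights plus biases of a square-kernel convolution layer."""
    weights = in_channels * out_channels * kernel_size * kernel_size
    biases = out_channels
    return weights + biases


# conv1 has 96 outputs, so after slice + eltwise (MAX) 48 channels remain.
# The slice and eltwise layers themselves contribute no parameters.
conv2a = conv_params(in_channels=48, out_channels=96, kernel_size=1)   # NIN 1x1 layer
conv2 = conv_params(in_channels=48, out_channels=192, kernel_size=3)   # 3x3 layer

print("conv2a (1x1):", conv2a)
print("conv2  (3x3):", conv2)
```

The 1×1 layer adds only a small fraction of the parameters of the 3×3 layer that follows it, which matches the remark that NIN costs some extra parameters but not many.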

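Network structure

MFM compares two feature maps position by position and keeps the larger value at each position. In Caffe this is done with the Eltwise layer. The main advantage of MFM over ReLU is that it learns compact features rather than the sparse, high-dimensional features that ReLU produces.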
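The slice + eltwise combination can be sketched in NumPy as follows. This is only an illustration of the idea, not the Caffe implementation; the function name `mfm` and the (N, C, H, W) layout are assumptions chosen to mirror Caffe's blob layout.

```python
import numpy as np


def mfm(x):
    """Max-Feature-Map over the channel axis of an (N, 2n, H, W) array."""
    if x.shape[1] % 2 != 0:
        raise ValueError("MFM needs an even number of channels")
    # Slice: split the channels into two equal halves.
    first, second = np.split(x, 2, axis=1)
    # Eltwise MAX: keep the larger value at every position.
    return np.maximum(first, second)


x = np.arange(16, dtype=np.float32).reshape(1, 4, 2, 2)
x[0, 0] = 100.0          # make channel 0 win against channel 2
y = mfm(x)
print(y.shape)           # the channel count is halved
```

Channel k is compared with channel k + n, so the output has half as many channels as the input and no parameters are involved.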

[Figure: network architecture]

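The network structure is shown in the figure above. As with DeepID, the network is trained on a face classification task and finally yields a 256-dimensional face feature. The paper proposes two structures that share the same main design, and it concentrates mostly on the first one.

The last layer of the network is a Softmax layer, which performs the classification; the output of fc1 is the face feature.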
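As a rough check of where the 256-dimensional feature comes from, the sketch below traces the blob shapes through the prototxt in the appendix. The layer table is transcribed from that prototxt; the helper names are made up, and the code assumes every convolution is followed by an MFM that halves its channels, which is how the appendix wires the layers.

```python
import math


def conv_out(size, kernel, pad, stride=1):
    """Spatial output size of a convolution."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Spatial output size of Caffe-style pooling (rounds up)."""
    return math.ceil((size - kernel) / stride) + 1


# (name, num_output, kernel, pad) for convolutions, (name,) for 2x2 max pooling.
LAYERS = [
    ("conv1", 96, 5, 2), ("pool1",),
    ("conv2a", 96, 1, 0), ("conv2", 192, 3, 1), ("pool2",),
    ("conv3a", 192, 1, 0), ("conv3", 384, 3, 1), ("pool3",),
    ("conv4a", 384, 1, 0), ("conv4", 256, 3, 1),
    ("conv5a", 256, 1, 0), ("conv5", 256, 3, 1), ("pool4",),
]


def trace_shapes(size=128):
    """Return (name, channels after MFM, spatial size) for each layer."""
    channels = None
    shapes = []
    for layer in LAYERS:
        if len(layer) == 4:
            name, num_output, kernel, pad = layer
            size = conv_out(size, kernel, pad)
            channels = num_output // 2      # MFM halves the channels
        else:
            name = layer[0]
            size = pool_out(size)
        shapes.append((name, channels, size))
    return shapes


shapes = trace_shapes()
_, channels, size = shapes[-1]
fc1_inputs = channels * size * size         # flattened pool4 output
feature_dim = 512 // 2                      # fc1 has 512 outputs, MFM halves them

for name, c, s in shapes:
    print(f"{name:7s} {c:4d} x {s} x {s}")
print("fc1 inputs:", fc1_inputs, "feature dimension:", feature_dim)
```

The 128×128 crop is halved by each of the four pooling layers, and the MFM after fc1 turns its 512 outputs into the 256-dimensional feature.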

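MFM activation function

The paper uses an activation function called MFM, and its structure is simple: from the input convolutional feature maps, pick two maps and take the larger value at each position.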

[Figure: MFM]

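Written as a formula, with $C^{k}_{ij}$ the value at position $(i, j)$ of the $k$-th input feature map:

$$f^{k}_{ij} = \max\left(C^{k}_{ij},\; C^{k+n}_{ij}\right), \qquad 1 \le k \le n$$

The input convolutional layer has 2n feature maps, and the larger of the k-th and (k+n)-th maps is taken as the output, so the MFM output has n maps. The gradient of the activation function is

$$\frac{\partial f^{k}_{ij}}{\partial C^{k}_{ij}} = \begin{cases} 1, & C^{k}_{ij} \ge C^{k+n}_{ij} \\ 0, & \text{otherwise} \end{cases} \qquad \frac{\partial f^{k}_{ij}}{\partial C^{k+n}_{ij}} = \begin{cases} 0, & C^{k}_{ij} \ge C^{k+n}_{ij} \\ 1, & \text{otherwise} \end{cases}$$

So half of the gradients in the activation layer are 0, and MFM yields sparse gradients. Compared with ReLU, whose features are sparse and high-dimensional, MFM produces compact features and also acts as feature selection and dimensionality reduction.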
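The claim that half of the gradients are zero can be checked with a small NumPy sketch of the forward and backward pass. The function names are illustrative, and sending the gradient to the first half on ties is an assumption made here, not something the post specifies.

```python
import numpy as np


def mfm_forward(x):
    """Return the MFM output and a mask marking where the first half won."""
    first, second = np.split(x, 2, axis=1)
    mask = first >= second               # ties go to the first half (assumption)
    return np.where(mask, first, second), mask


def mfm_backward(grad_out, mask):
    """Route each output gradient to whichever input was the maximum."""
    grad_first = np.where(mask, grad_out, 0.0)
    grad_second = np.where(mask, 0.0, grad_out)
    return np.concatenate([grad_first, grad_second], axis=1)


rng = np.random.default_rng(0)
x = rng.normal(size=(2, 6, 4, 4))        # 2n = 6 input maps
out, mask = mfm_forward(x)
grad_in = mfm_backward(np.ones_like(out), mask)
zero_fraction = np.mean(grad_in == 0)
print(out.shape, grad_in.shape, zero_fraction)
```

Each output position passes its gradient to exactly one of its two inputs, so with a nonzero upstream gradient exactly half of the input gradients are zero.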

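Experiments

The dataset used is CASIA-WebFace, with 493,456 photos of 10,575 people. Training was done with Caffe. The inputs are 144×144 grayscale images, randomly cropped to 128×128. Dropout on the fully connected layer is set to 0.7. The SGD parameters differ by layer: momentum is 0.9, and weight decay is 5e-4 for every layer except fc2, where it is raised to 5e-3 to prevent overfitting. The learning rate is decreased from 1e-3 to 5e-5. Training took two weeks on a GTX980.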
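The input preprocessing can be sketched as below: a random 128×128 crop from a 144×144 grayscale image, plus the mirroring and the 1/256 scaling that appear in the `transform_param` block of the prototxt in the appendix. This is a NumPy approximation of what the Caffe data layer does during training, not its actual code; the function name `augment` and the 50% mirror probability are assumptions.

```python
import numpy as np


def augment(image, crop=128, scale=0.00390625, rng=None):
    """Random crop, random horizontal mirror, then scale pixels by 1/256."""
    rng = np.random.default_rng() if rng is None else rng
    height, width = image.shape
    top = rng.integers(0, height - crop + 1)
    left = rng.integers(0, width - crop + 1)
    patch = image[top:top + crop, left:left + crop]
    if rng.random() < 0.5:               # mirror with probability 0.5 (assumption)
        patch = patch[:, ::-1]
    return patch.astype(np.float32) * scale


# A random array stands in for a 144x144 grayscale face image.
image = np.random.default_rng(0).integers(0, 256, size=(144, 144), dtype=np.uint8)
patch = augment(image, rng=np.random.default_rng(1))
print(patch.shape, patch.dtype, patch.min(), patch.max())
```

At test time Caffe takes the center crop instead of a random one, which is why the TEST data layer in the appendix keeps `crop_size: 128` but sets `mirror: false`.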

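Results

After extracting features, the authors simply use cosine similarity for face verification. On LFW, model A reaches 97.77% accuracy and model B 98.13%, which is an acceptable result. Model A is 26M in size and takes 71 ms per image on an i7-4790; in my own test on a Snapdragon 808 it took 0.8 s.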
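A minimal sketch of the cosine similarity step is given below. The random 256-dimensional vectors stand in for real fc1 features, and the threshold of 0.5 is a placeholder: the post does not give one, and in practice it would be tuned on a validation set.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def same_person(feat_a, feat_b, threshold=0.5):
    """Declare a match when the similarity reaches the (hypothetical) threshold."""
    return cosine_similarity(feat_a, feat_b) >= threshold


rng = np.random.default_rng(0)
feat_a = rng.normal(size=256)                    # stand-in for one face feature
feat_b = feat_a + 0.1 * rng.normal(size=256)     # slightly perturbed copy
feat_c = rng.normal(size=256)                    # unrelated feature

print(cosine_similarity(feat_a, feat_b))
print(cosine_similarity(feat_a, feat_c))
```

Because cosine similarity ignores vector length, the features need no normalization beforehand.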

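Summary

This network is a lightweight architecture: the model is relatively small and the forward pass is fast, so it can be used on embedded devices. Its accuracy is not the highest, but it is within an acceptable range.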

Appendix: lightcnn_train_test.prototxt

name: "DeepFace_set003_net"

# Training input: image list with labels; random 128x128 crop, mirroring, pixels scaled by 1/256.
layer {
  name: "data"
  type: "ImageData"
  top: "data"
  top: "label"
  image_data_param {
    source: "/home/himon/code/caffe-master/lightCNNFace/train.txt"
    batch_size: 20
    shuffle: true
  }
  transform_param {
    scale: 0.00390625
    crop_size: 128
    mirror: true
  }
  include: { phase: TRAIN }
}

# Validation input: same scaling, center 128x128 crop, no mirroring.
layer {
  name: "data"
  type: "ImageData"
  top: "data"
  top: "label"
  image_data_param {
    source: "/home/himon/code/caffe-master/lightCNNFace/val.txt"
    batch_size: 20
  }
  transform_param {
    scale: 0.00390625
    crop_size: 128
    mirror: false
  }
  include: { phase: TEST }
}

layer{
  name: "conv1"
  type: "Convolution"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 96
    kernel_size: 5
    stride: 1
    pad: 2
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
  bottom: "data"
  top: "conv1"
}

# MFM: slice conv1's 96 channels into two halves of 48, then take the element-wise max.
layer{
  name: "slice1"
  type: "Slice"
  slice_param {
    slice_dim: 1
  }
  bottom: "conv1"
  top: "slice1_1"
  top: "slice1_2"
}
layer{
  name: "etlwise1"
  type: "Eltwise"
  bottom: "slice1_1"
  bottom: "slice1_2"
  top: "eltwise1"
  eltwise_param {
    operation: MAX
  }
}
layer{
  name: "pool1"
  type: "Pooling"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
  bottom: "eltwise1"
  top: "pool1"
}

layer{
  name: "conv2a"
  type: "Convolution"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 96
    kernel_size: 1
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
  bottom: "pool1"
  top: "conv2a"
}
layer{
  name: "slice2a"
  type:"Slice"
  slice_param {
    slice_dim: 1
  }
  bottom: "conv2a"
  top: "slice2a_1"
  top: "slice2a_2"
}
layer{
  name: "etlwise2a"
  type: "Eltwise"
  bottom: "slice2a_1"
  bottom: "slice2a_2"
  top: "eltwise2a"
  eltwise_param {
    operation: MAX
  }
}

layer{
  name: "conv2"
  type: "Convolution"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 192
    kernel_size: 3
    stride: 1
    pad: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
  bottom: "eltwise2a"
  top: "conv2"
}



layer{
  name: "slice2"
  type:"Slice"
  slice_param {
    slice_dim: 1
  }
  bottom: "conv2"
  top: "slice2_1"
  top: "slice2_2"
}
layer{
  name: "etlwise2"
  type: "Eltwise"
  bottom: "slice2_1"
  bottom: "slice2_2"
  top: "eltwise2"
  eltwise_param {
    operation: MAX
  }
}
layer{
  name: "pool2"
  type: "Pooling"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
  bottom: "eltwise2"
  top: "pool2"
}

layer{
  name: "conv3a"
  type: "Convolution"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 192
    kernel_size: 1
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
  bottom: "pool2"
  top: "conv3a"
}
layer{
  name: "slice3a"
  type:"Slice"
  slice_param {
    slice_dim: 1
  }
  bottom: "conv3a"
  top: "slice3a_1"
  top: "slice3a_2"
}
layer{
  name: "etlwise3a"
  type: "Eltwise"
  bottom: "slice3a_1"
  bottom: "slice3a_2"
  top: "eltwise3a"
  eltwise_param {
    operation: MAX
  }
}

layer{
  name: "conv3"
  type: "Convolution"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 384
    kernel_size: 3
    stride: 1
    pad: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
  bottom: "eltwise3a"
  top: "conv3"
}


layer{
  name: "slice3"
  type:"Slice"
  slice_param {
    slice_dim: 1
  }
  bottom: "conv3"
  top: "slice3_1"
  top: "slice3_2"
}
layer{
  name: "etlwise3"
  type: "Eltwise"
  bottom: "slice3_1"
  bottom: "slice3_2"
  top: "eltwise3"
  eltwise_param {
    operation: MAX
  }
}
layer{
  name: "pool3"
  type: "Pooling"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
  bottom: "eltwise3"
  top: "pool3"
}

layer{
  name: "conv4a"
  type: "Convolution"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param{
    num_output: 384
    kernel_size: 1
    stride: 1
    weight_filler{
      type:"xavier"
    }
    bias_filler{
      type: "constant"
      value: 0.1    
    }
  }
  bottom: "pool3"
  top: "conv4a"
}
layer{
  name: "slice4a"
  type:"Slice"
  slice_param {
    slice_dim: 1
  }
  bottom: "conv4a"
  top: "slice4a_1"
  top: "slice4a_2"
}
layer{
  name: "etlwise4a"
  type: "Eltwise"
  bottom: "slice4a_1"
  bottom: "slice4a_2"
  top: "eltwise4a"
  eltwise_param {
    operation: MAX
  }
}
layer{
  name: "conv4"
  type: "Convolution"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param{
    num_output: 256
    kernel_size: 3
    stride: 1
    pad: 1
    weight_filler{
      type:"xavier"
    }
    bias_filler{
      type: "constant"
      value: 0.1    
    }
  }
  bottom: "eltwise4a"
  top: "conv4"
}



layer{
  name: "slice4"
  type:"Slice"
  slice_param {
    slice_dim: 1
  }
  bottom: "conv4"
  top: "slice4_1"
  top: "slice4_2"
}
layer{
  name: "etlwise4"
  type: "Eltwise"
  bottom: "slice4_1"
  bottom: "slice4_2"
  top: "eltwise4"
  eltwise_param {
    operation: MAX
  }
}

layer{
  name: "conv5a"
  type: "Convolution"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param{
    num_output: 256
    kernel_size: 1
    stride: 1
    weight_filler{
      type:"xavier"
    }
    bias_filler{
      type: "constant"
      value: 0.1    
    }
  }
  bottom: "eltwise4"
  top: "conv5a"
}
layer{
  name: "slice5a"
  type:"Slice"
  slice_param {
    slice_dim: 1
  }
  bottom: "conv5a"
  top: "slice5a_1"
  top: "slice5a_2"
}
layer{
  name: "etlwise5a"
  type: "Eltwise"
  bottom: "slice5a_1"
  bottom: "slice5a_2"
  top: "eltwise5a"
  eltwise_param {
    operation: MAX
  }
}
layer{
  name: "conv5"
  type: "Convolution"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param{
    num_output: 256
    kernel_size: 3
    stride: 1
    pad: 1
    weight_filler{
      type:"xavier"
    }
    bias_filler{
      type: "constant"
      value: 0.1    
    }
  }
  bottom: "eltwise5a"
  top: "conv5"
}


layer{
  name: "slice5"
  type:"Slice"
  slice_param {
    slice_dim: 1
  }
  bottom: "conv5"
  top: "slice5_1"
  top: "slice5_2"
}
layer{
  name: "etlwise5"
  type: "Eltwise"
  bottom: "slice5_1"
  bottom: "slice5_2"
  top: "eltwise5"
  eltwise_param {
    operation: MAX
  }
}

layer{
  name: "pool4"
  type: "Pooling"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
  bottom: "eltwise5"
  top: "pool4"
}

# fc1: 512 outputs; the MFM below (slice_fc1 + etlwise_fc1) halves them to the 256-d face feature.
layer{
  name: "fc1"
  type: "InnerProduct"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 512
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0.1
    }   
  }  
  bottom: "pool4"
  top: "fc1"
}
layer{
  name: "slice_fc1"
  type:"Slice"
  slice_param {
    slice_dim: 1
  }
  bottom: "fc1"
  top: "slice_fc1_1"
  top: "slice_fc1_2"
}
layer{
  name: "etlwise_fc1"
  type: "Eltwise"
  bottom: "slice_fc1_1"
  bottom: "slice_fc1_2"
  top: "eltwise_fc1"
  eltwise_param {
    operation: MAX
  }
}

layer{
  name: "drop1"
  type: "Dropout"
  dropout_param{
    dropout_ratio: 0.7
  }
  bottom: "eltwise_fc1"
  top: "eltwise_fc1"
}

# Classifier: num_output must equal the number of identities in the training list (50 here).
layer{
  name: "fnc2"
  type: "InnerProduct"
  inner_product_param{
    num_output: 50
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0.1
    }   
  }
  bottom: "eltwise_fc1"
  top: "fnc2"
}

layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fnc2"
  bottom: "label"
  top: "accuracy"
  include: { phase: TEST }
}

layer {
  name: "softmaxloss"
  type: "SoftmaxWithLoss"
  bottom: "fnc2"
  bottom: "label"
  top: "loss"
}