BlazeFace study notes

zhqh100

Last modified: 2024-01-18 10:29:58

First published: 2022-03-23 19:16:36

Article link: https://blog.csdn.net/zhqh100/article/details/123688945

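The full project here is meant to be face recognition. At the risk of sounding immodest, I have done face recognition before, for example with dlib, which is basically the OpenCV way of doing things. Honestly, it was very slow: even on an Intel CPU it managed only two or three frames per second, which is far too slow.

I have also tried other options. A few years ago I downloaded ArcSoft's free library for a trial, and the results were impressive. What struck me most was not the recognition accuracy but how fast it was.

I also tried MTCNN. Search for face detection online and this is almost always what comes up. I don't know why others praise it so much; my experience was frequent false positives, where something that is clearly not a face gets detected as one. The usual advice is to raise the threshold and filter them out, but in my experience some regions that are obviously not faces get a very high confidence, while some obvious faces are missed. I don't know why people keep recommending it.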

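Face recognition, in my layman's terms, breaks down into four steps:

1. Face detection: detect not only the face but also its keypoints.

2. Face alignment based on the detected keypoints, i.e. an affine transform. Could you skip alignment? In principle yes, but studies have found that recognition becomes much easier once the faces are aligned.

3. Feed the aligned face into a neural network to obtain a 128-dimensional vector.

4. Compare this vector with the vectors of other faces; if they are close enough, treat them as the same person.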
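Step 4 can be sketched in a few lines. This is only an illustration: the post does not say which distance is used, so cosine similarity is an assumption here, and `is_same_person` and its `threshold` of 0.5 are hypothetical names and values that would need tuning on real data.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def is_same_person(emb_a, emb_b, threshold=0.5):
    # threshold is a hypothetical value; tune it on your own validation pairs
    return cosine_similarity(emb_a, emb_b) >= threshold

# Toy 128-dimensional vectors standing in for real network outputs
anchor = [1.0] * 128
similar = [0.9] * 64 + [1.1] * 64     # points in almost the same direction
different = [1.0] * 64 + [-1.0] * 64  # orthogonal to the anchor

print(is_same_person(anchor, similar))
print(is_same_person(anchor, different))
```

In a real system the toy vectors would be replaced by the 128-dimensional outputs of the recognition network from step 3.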

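Besides the libraries mentioned above, I have also used SenseTime's face recognition library. On an imx6 (Cortex-A9) chip its detection speed reaches 12 FPS, which is quite impressive. As I recall, though, it loads slowly, perhaps half a minute, but fortunately it only needs to load once at startup.

Of course, both the SenseTime and the ArcSoft libraries are closed source and require a licence; once the licence expires they can no longer be used.

Among the open-source face recognition libraries, one of the better ones is ArcFace (it happens to share its name with ArcSoft's product). For reasonably reliable implementations, see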

https://github.com/deepinsight/insightface/tree/master/recognition/arcface_paddle

or

https://github.com/onnx/models/tree/main/vision/body_analysis/arcface

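I have also written a few notes on this:

ONNX-based face recognition (zhqh100's CSDN blog)

Back to face detection. I had seen Google's FaceMesh demo on YouTube and was impressed.

It is not only fast but also gives stable results. However, the training source code has not been released. I found a PyTorch implementation online: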

https://github.com/zmurez/MediaPipePyTorch.git

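This is a demo that reproduces the functionality above as well as the speed: I measured an average of 22 ms per frame on an Intel i7.

Unfortunately, that project has no training code, only inference demo code.

Then I found another project: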

https://github.com/zineos/blazeface.git

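which includes both inference code and training code.

That project was adapted from another one, and it was a bit confusing at first because running it as-is throws errors. I forked it and made a few small changes; my repo is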

https://github.com/moneypi/blazeface

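Another issue was the training data: the Baidu Netdisk link had expired, so I downloaded the data again from Google and uploaded a fresh copy:

Link: https://pan.baidu.com/s/1ao-XwW6i8VXsSvh_fSelgQ?pwd=u7ii

Extraction code: u7ii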

The printed network structure is as follows:

Blaze(
  (conv1): Sequential(
    (0): Conv2d(3, 24, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
    (1): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace=True)
  )
  (conv2): BlazeBlock(
    (actvation): ReLU(inplace=True)
    (conv): Sequential(
      (0): Sequential(
        (0): Conv2d(24, 24, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=24, bias=False)
        (1): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        (3): Conv2d(24, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (4): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
  )
  (conv3): BlazeBlock(
    (actvation): ReLU(inplace=True)
    (conv): Sequential(
      (0): Sequential(
        (0): Conv2d(24, 24, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=24, bias=False)
        (1): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        (3): Conv2d(24, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (4): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
  )
  (conv4): BlazeBlock(
    (actvation): ReLU(inplace=True)
    (conv): Sequential(
      (0): Sequential(
        (0): Conv2d(24, 24, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2), groups=24, bias=False)
        (1): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        (3): Conv2d(24, 48, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (4): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (shortcut): Sequential(
      (0): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (1): Conv2d(24, 48, kernel_size=(1, 1), stride=(1, 1))
      (2): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (3): ReLU(inplace=True)
    )
  )
  (conv5): BlazeBlock(
    (actvation): ReLU(inplace=True)
    (conv): Sequential(
      (0): Sequential(
        (0): Conv2d(48, 48, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=48, bias=False)
        (1): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        (3): Conv2d(48, 48, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (4): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
  )
  (conv6): BlazeBlock(
    (actvation): ReLU(inplace=True)
    (conv): Sequential(
      (0): Sequential(
        (0): Conv2d(48, 48, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=48, bias=False)
        (1): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        (3): Conv2d(48, 48, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (4): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
  )
  (conv7): BlazeBlock(
    (actvation): ReLU(inplace=True)
    (conv): Sequential(
      (0): Sequential(
        (0): Conv2d(48, 48, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2), groups=48, bias=False)
        (1): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        (3): Conv2d(48, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (4): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (1): ReLU(inplace=True)
      (2): Sequential(
        (0): Conv2d(24, 24, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), bias=False)
        (1): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        (3): Conv2d(24, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (4): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (3): ReLU(inplace=True)
    )
    (shortcut): Sequential(
      (0): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (1): Conv2d(48, 96, kernel_size=(1, 1), stride=(1, 1))
      (2): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (3): ReLU(inplace=True)
    )
  )
  (conv8): BlazeBlock(
    (actvation): ReLU(inplace=True)
    (conv): Sequential(
      (0): Sequential(
        (0): Conv2d(96, 96, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=96, bias=False)
        (1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        (3): Conv2d(96, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (4): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (1): ReLU(inplace=True)
      (2): Sequential(
        (0): Conv2d(24, 24, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), bias=False)
        (1): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        (3): Conv2d(24, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (4): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (3): ReLU(inplace=True)
    )
  )
  (conv9): BlazeBlock(
    (actvation): ReLU(inplace=True)
    (conv): Sequential(
      (0): Sequential(
        (0): Conv2d(96, 96, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=96, bias=False)
        (1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        (3): Conv2d(96, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (4): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (1): ReLU(inplace=True)
      (2): Sequential(
        (0): Conv2d(24, 24, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), bias=False)
        (1): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        (3): Conv2d(24, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (4): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (3): ReLU(inplace=True)
    )
  )
  (conv10): BlazeBlock(
    (actvation): ReLU(inplace=True)
    (conv): Sequential(
      (0): Sequential(
        (0): Conv2d(96, 96, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2), groups=96, bias=False)
        (1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        (3): Conv2d(96, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (4): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (1): ReLU(inplace=True)
      (2): Sequential(
        (0): Conv2d(24, 24, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), bias=False)
        (1): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        (3): Conv2d(24, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (4): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (3): ReLU(inplace=True)
    )
    (shortcut): Sequential(
      (0): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (1): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1))
      (2): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (3): ReLU(inplace=True)
    )
  )
  (conv11): BlazeBlock(
    (actvation): ReLU(inplace=True)
    (conv): Sequential(
      (0): Sequential(
        (0): Conv2d(96, 96, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=96, bias=False)
        (1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        (3): Conv2d(96, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (4): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (1): ReLU(inplace=True)
      (2): Sequential(
        (0): Conv2d(24, 24, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), bias=False)
        (1): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        (3): Conv2d(24, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (4): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (3): ReLU(inplace=True)
    )
  )
  (conv12): BlazeBlock(
    (actvation): ReLU(inplace=True)
    (conv): Sequential(
      (0): Sequential(
        (0): Conv2d(96, 96, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=96, bias=False)
        (1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        (3): Conv2d(96, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (4): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (1): ReLU(inplace=True)
      (2): Sequential(
        (0): Conv2d(24, 24, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), bias=False)
        (1): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        (3): Conv2d(24, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (4): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (3): ReLU(inplace=True)
    )
  )
  (loc): Sequential(
    (0): Sequential(
      (0): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=96)
      (1): ReLU(inplace=True)
      (2): Conv2d(96, 8, kernel_size=(1, 1), stride=(1, 1))
    )
    (1): Sequential(
      (0): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=96)
      (1): ReLU(inplace=True)
      (2): Conv2d(96, 24, kernel_size=(1, 1), stride=(1, 1))
    )
  )
  (conf): Sequential(
    (0): Sequential(
      (0): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=96)
      (1): ReLU(inplace=True)
      (2): Conv2d(96, 4, kernel_size=(1, 1), stride=(1, 1))
    )
    (1): Sequential(
      (0): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=96)
      (1): ReLU(inplace=True)
      (2): Conv2d(96, 12, kernel_size=(1, 1), stride=(1, 1))
    )
  )
  (landm): Sequential(
    (0): Sequential(
      (0): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=96)
      (1): ReLU(inplace=True)
      (2): Conv2d(96, 20, kernel_size=(1, 1), stride=(1, 1))
    )
    (1): Sequential(
      (0): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=96)
      (1): ReLU(inplace=True)
      (2): Conv2d(96, 60, kernel_size=(1, 1), stride=(1, 1))
    )
  )
)

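The inputs have shape [1, 3, 320, 320]. The backbone downsamples four times. The feature map just before the fourth downsampling and the final output of the network are both saved into detections, with shapes [1, 96, 40, 40] and [1, 96, 20, 20] respectively.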
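The spatial sizes can be checked by hand with the standard convolution output-size formula. The sketch below takes the kernel, stride and padding of the four stride-2 convolutions from the printout above (conv1, conv4, conv7, conv10); `conv_out` is a helper name introduced here for illustration. The stride-1 5x5 convolutions with padding 2 leave the size unchanged, so they are omitted.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a Conv2d layer (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1

# (name, kernel, stride, padding) of the four stride-2 convolutions in the printout
downsamplers = [
    ("conv1", 3, 2, 1),
    ("conv4", 5, 2, 2),
    ("conv7", 5, 2, 2),
    ("conv10", 5, 2, 2),
]

size = 320
sizes = {}
for name, kernel, stride, padding in downsamplers:
    size = conv_out(size, kernel, stride, padding)
    sizes[name] = size

print(sizes)  # {'conv1': 160, 'conv4': 80, 'conv7': 40, 'conv10': 20}
```

The 40x40 map is what exists before conv10 downsamples again, and the 20x20 map is the final output, matching the two entries in detections.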

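These two results are each fed into loc_layers, conf_layers and landm_layers, which produce the following sizes:

loc[0] has shape [1, 40, 40, 8] and loc[1] has shape [1, 20, 20, 24]. Concatenated and regrouped into four coordinates per box, they give bbox_regressions.shape=[1, 5600, 4]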

conf[0].shape=[1, 40, 40, 4]

conf[1].shape=[1, 20, 20, 12]

Concatenated, these give classifications.shape=[1, 5600, 2]

landm[0].shape=torch.Size([1, 40, 40, 20])
landm[1].shape=torch.Size([1, 20, 20, 60])

Concatenated, these give ldm_regressions.shape=torch.Size([1, 5600, 10])
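The shape arithmetic above can be verified without running the network. In this sketch `flatten_head` is a hypothetical helper: it assumes each head's channel count is the number of anchors per cell times the number of values per anchor (4 box coordinates, 2 class scores, 10 landmark coordinates), which is what the shapes above imply.

```python
def flatten_head(h, w, channels, values_per_anchor):
    """Rows and columns after reshaping a [1, h, w, channels] map to per-anchor rows."""
    assert channels % values_per_anchor == 0
    anchors_per_cell = channels // values_per_anchor
    return h * w * anchors_per_cell, values_per_anchor

# (feature maps as (h, w, channels), values per anchor), copied from the shapes above
heads = {
    "loc": ([(40, 40, 8), (20, 20, 24)], 4),
    "conf": ([(40, 40, 4), (20, 20, 12)], 2),
    "landm": ([(40, 40, 20), (20, 20, 60)], 10),
}

totals = {}
for name, (maps, per_anchor) in heads.items():
    rows = sum(flatten_head(h, w, c, per_anchor)[0] for h, w, c in maps)
    totals[name] = (rows, per_anchor)

print(totals)  # {'loc': (5600, 4), 'conf': (5600, 2), 'landm': (5600, 10)}
```

All three heads agree on 5600 rows, i.e. 2 anchors per cell on the 40x40 map and 6 per cell on the 20x20 map.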

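Now back to the anchors. The config has 'steps': [8, 16] and the image size is 320*320, so the fine grid is 320/8, i.e. 40*40 cells, and the coarse grid is 320/16, i.e. 20*20 cells.

Then 'min_sizes': [[8, 11], [14, 19, 26, 38, 64, 149]] means that each cell of the 40*40 grid generates anchors of size 8 and 11, while each cell of the 20*20 grid iterates over [14, 19, 26, 38, 64, 149]. In total that is 40 * 40 * 2 + 20 * 20 * 6 = 3200 + 2400 = 5600 boxes.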
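The anchor count can be reproduced with a small generator. This is a sketch under assumptions, not the repo's code: it assumes square anchors centered on each grid cell with coordinates normalized to [0, 1], as in the common RetinaFace-style prior box layout, and `make_anchors` is a name made up for this example. Only the `steps` and `min_sizes` values come from the config quoted above.

```python
from itertools import product

def make_anchors(image_size, steps, min_sizes):
    """Return a list of (cx, cy, w, h) anchors, normalized by the image size."""
    anchors = []
    for step, sizes in zip(steps, min_sizes):
        cells = image_size // step  # grid cells per side at this stride
        for row, col in product(range(cells), repeat=2):
            for s in sizes:
                cx = (col + 0.5) * step / image_size  # anchor centered on the cell
                cy = (row + 0.5) * step / image_size
                anchors.append((cx, cy, s / image_size, s / image_size))
    return anchors

anchors = make_anchors(320, [8, 16], [[8, 11], [14, 19, 26, 38, 64, 149]])
print(len(anchors))  # 5600
```

The first 3200 entries belong to the 40x40 grid and the remaining 2400 to the 20x20 grid, the same order in which the two feature maps are concatenated.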

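So, in summary, this more or less confirms that BlazeFace is a scaled-down SSD: it has no FPN fusion, and it is anchor-based. So with an anchor-free approach, might there be room to shrink the network further?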

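As for the loss functions, both landmark and location use smooth_l1_loss.

loss_c, the classification loss, is focal loss.

In principle each loss term has its own weighting coefficient, but as far as I can see the code only uses the classification one: loss_c is multiplied by 6, and then the losses are added up to give the total loss.

Each loss is computed on a sampled subset only.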
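The loss combination described above can be written out for single values. This is a plain-Python sketch, not the repo's implementation: `alpha` and `gamma` are commonly used focal-loss defaults assumed here (the post does not give them), `beta=1.0` is the usual smooth L1 transition point, and only the factor 6 on the classification loss comes from the text.

```python
import math

def smooth_l1(diff, beta=1.0):
    """Smooth L1 for one residual: quadratic near zero, linear beyond beta."""
    d = abs(diff)
    return 0.5 * d * d / beta if d < beta else d - 0.5 * beta

def focal_loss(p_true, alpha=0.25, gamma=2.0):
    """Focal loss for one sample, given the predicted probability of its true class.

    alpha and gamma are assumed defaults, not values taken from the project.
    """
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)

def total_loss(loss_loc, loss_cls, loss_landm, cls_weight=6.0):
    # Only the classification term is weighted, as described in the text
    return loss_loc + cls_weight * loss_cls + loss_landm

print(smooth_l1(0.5), smooth_l1(2.0))    # 0.125 1.5
print(focal_loss(0.9), focal_loss(0.6))  # easy samples contribute far less
print(total_loss(1.0, 0.5, 2.0))         # 6.0
```

The `(1 - p_true) ** gamma` factor is what down-weights well-classified anchors, which matters here because the vast majority of the 5600 anchors are background.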

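I made essentially no changes (well, I did change the batch size: the original default is 256 and I set it to 64, since my GPU is a GeForce RTX 2060 with 6 GB, nothing fancy). The evaluation accuracy was: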

==================== Results ====================
Easy   Val AP: 0.8039773948120692
Medium Val AP: 0.7454040908046184
Hard   Val AP: 0.4292227434416538
=================================================

Testing with the model weights bundled with the project:

==================== Results ====================
Easy   Val AP: 0.7667513659507036
Medium Val AP: 0.6894514595432863
Hard   Val AP: 0.34819400614673535
=================================================

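Wow. Without any optimization, my trained model actually scores higher than the bundled one. Surprise.

Training ran from 2022-03-23 22:43 to 2022-03-24 04:55, roughly 6 hours.

Because the training loss curve fluctuated a lot, I tried lowering lr by half. The loss curve did look a bit better, but accuracy dropped: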

==================== Results ====================
Easy   Val AP: 0.7789928144856131
Medium Val AP: 0.7163551912058919
Hard   Val AP: 0.3995628884327997
=================================================

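So for now I am keeping the original learning rate and trying other things.

I tried swapping the activation function to see whether accuracy would improve, replacing every nn.ReLU(inplace=True) with nn.Softplus(). That made me realize one issue: Softplus has no inplace parameter, so it needs more GPU memory. With ReLU, 'batch_size': 64 basically filled my 6 GB of memory; after switching to Softplus I had to set 'batch_size': 40. The good news is that accuracy did improve, and this is the highest result so far: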

==================== Results ====================
Easy   Val AP: 0.8155068518500983
Medium Val AP: 0.7586505701251691
Hard   Val AP: 0.4390300734203851
=================================================

To run the test:

python test_widerface.py

To get the results above:

cd widerface_evaluate/
python evaluation.py

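Next I looked into weight quantization. I first tried quantize_dynamic and found that the accuracy was exactly the same as before quantization, so I went looking for the cause. Following someone's tutorial online, I jumped into the source of quantize_dynamic and saw that by default it only quantizes the following modules: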

nn.Linear, nn.LSTM, nn.GRU, nn.LSTMCell, nn.RNNCell, nn.GRUCell

Our model uses basically none of these modules, so basically nothing changed.
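The effect is easy to reproduce on a toy model. The sketch below uses a small convolution-only network (made up for this example, not the BlazeFace model) and shows that quantize_dynamic leaves every module as it was, so the outputs are identical. It assumes a PyTorch version in which `torch.quantization.quantize_dynamic` is still available under that name.

```python
import torch
import torch.nn as nn

# A toy conv-only model standing in for the detector (hypothetical, for illustration)
conv_only = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 4, kernel_size=1),
).eval()

# With no explicit module set, only Linear/RNN-style modules are candidates
quantized = torch.quantization.quantize_dynamic(conv_only, dtype=torch.qint8)

x = torch.randn(1, 3, 8, 8)
with torch.no_grad():
    outputs_identical = torch.equal(conv_only(x), quantized(x))

print([type(m).__name__ for m in quantized])  # module types are unchanged
print(outputs_identical)
```

Quantizing the convolutions themselves would require a different workflow, such as static post-training quantization, which is the subject of the follow-up attempt mentioned next.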

I then tried proper model quantization, but I was not skilled enough and did not get it working. See:

A failed attempt at PyTorch model quantization (zhqh100's CSDN blog): https://blog.csdn.net/zhqh100/article/details/123742045

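Yesterday I tried adding an FPN. Of course, this means modifying the model rather than reproducing the paper. I added the following lines to the model: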

fpn_bak = F.interpolate(detections[1], size=(detections[0].size(2), detections[0].size(3)))
# detections[0] = torch.cat((detections[0], fpn_bak), dim=1)
detections[0] = torch.add(detections[0], fpn_bak)

Using cat would probably work too, but cat would increase the parameter count, and I would rather not do that. The accuracy after training was:

==================== Results ====================
Easy   Val AP: 0.8127231180335884
Medium Val AP: 0.7547545989153277
Hard   Val AP: 0.43340084728149164
=================================================
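To make the add-versus-cat trade-off in the FPN experiment concrete, the sketch below runs the same two operations on random tensors with the shapes of detections[0] and detections[1]. The tensor names `fine` and `coarse` are introduced here for illustration; `F.interpolate` is called without a `mode` argument, as in the snippet above, so it uses its default nearest-neighbor upsampling.

```python
import torch
import torch.nn.functional as F

fine = torch.randn(1, 96, 40, 40)    # stands in for detections[0]
coarse = torch.randn(1, 96, 20, 20)  # stands in for detections[1]

# Upsample the coarse map to the fine map's spatial size (default mode: nearest)
up = F.interpolate(coarse, size=(fine.size(2), fine.size(3)))

fused_add = torch.add(fine, up)           # channel count stays at 96
fused_cat = torch.cat((fine, up), dim=1)  # channel count doubles to 192

print(fused_add.shape)  # torch.Size([1, 96, 40, 40])
print(fused_cat.shape)  # torch.Size([1, 192, 40, 40])
```

With cat, the loc, conf and landm heads for this level would have to accept 192 input channels instead of 96, which is where the extra parameters would come from.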

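A note on my configuration at this point: the batch size is 40, the initial learning rate is 1e-3, and the activation function was switched back to ReLU while I was experimenting with quantization.

Since accuracy had improved a lot with Softplus earlier, I again replaced ReLU with a ReLU-like activation function to see whether accuracy would improve. I came across this article online:

Activation functions | Squareplus matches Softplus in performance and is 6x faster (with PyTorch implementation): https://jishuin.proginn.com/p/763bfbd704d4

I don't know whether that article is original. It includes a PyTorch implementation of squareplus, but after using it I found it was a trap: the code is wrong, it is missing a square. So here is my corrected implementation: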

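import torch
import torch.nn as nn

class Squareplus(nn.Module):
    def __init__(self, b=0.2):
        super(Squareplus, self).__init__()
        self.b = b  # smoothing constant inside the square root

    def forward(self, x):
        # squareplus(x) = (x + sqrt(x^2 + b)) / 2; note the square on x
        x = 0.5 * (x + torch.sqrt(torch.square(x)+self.b))
        return x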
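To get a feel for how this activation compares with ReLU and Softplus, here is a scalar version of the same formula in plain Python. It is only a numerical illustration using the same default b=0.2; the function names are local to this example.

```python
import math

def squareplus(x, b=0.2):
    """Scalar squareplus: (x + sqrt(x^2 + b)) / 2."""
    return 0.5 * (x + math.sqrt(x * x + b))

def relu(x):
    return max(0.0, x)

def softplus(x):
    return math.log1p(math.exp(x))

for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(f"x={x:+.1f}  relu={relu(x):.4f}  "
          f"squareplus={squareplus(x):.4f}  softplus={softplus(x):.4f}")
```

Squareplus always sits slightly above ReLU, is smooth around zero, and approaches ReLU for large |x|, using only a multiply, an add and a square root.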

Accuracy did improve again:

==================== Results ====================
Easy   Val AP: 0.8200590574413846
Medium Val AP: 0.7649220162383361
Hard   Val AP: 0.4490872484483122
=================================================

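This accuracy already exceeds the previous best.

A quick note: with my own Squareplus implementation, training used a bit more GPU memory again, so I lowered the batch size from 40 to 30. As for inference speed, with ReLU inference takes a little under 4 ms (not counting post-processing, which takes about another 2.4 ms), while with Squareplus it takes about 6 ms, so it is considerably slower.

I have always felt that ReLU is too crude an activation function. I was hoping to find a smooth one with adaptive parameters and see whether it gives better results. Squareplus is a monotonically increasing, smooth function, so I turned it into an adaptive version: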

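class Squareplus(nn.Module):
    def __init__(self):
        super(Squareplus, self).__init__()
        # b[0] scales the linear term, b[1] scales the square-root term,
        # b[2] is squared to form the smoothing constant (initialized near 0)
        self.b = torch.nn.Parameter(torch.tensor([1., 1., 1e-8]))

    def forward(self, x):
        x = 0.5 * (self.b[0] * x + self.b[1] * torch.sqrt(torch.square(x)+torch.square(self.b[2])))
        return x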

The reasoning is that the squareplus formula is

$$\text{SquarePlus}(x)=\frac{x+\sqrt{x^2+b}}{2}$$

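so when b=0, SquarePlus is exactly equal to ReLU. At first I thought that simply using self.b[2] = 0. in the code above should also train, but in practice, with self.b[2]=0. the loss immediately became NaN or inf. So I gave it a value close to 0 instead. Judging by the results, accuracy did improve slightly: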

==================== Results ====================
Easy   Val AP: 0.8227621694850481
Medium Val AP: 0.7662798102976387
Hard   Val AP: 0.4513160808024386
=================================================
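A likely cause of the NaN loss with self.b[2]=0 (offered as an assumption, since the original training run was not inspected) is the gradient of the square root at zero. Whenever an input is exactly 0 and b[2] is 0, the argument of sqrt is 0, its derivative is infinite, and autograd multiplies that by the zero derivative of the square, giving NaN. The function `adaptive_squareplus` below is a standalone restatement of the forward pass above, written for this check.

```python
import torch

def adaptive_squareplus(x, b):
    """Same formula as the adaptive Squareplus module, as a plain function."""
    return 0.5 * (b[0] * x + b[1] * torch.sqrt(torch.square(x) + torch.square(b[2])))

# One exactly-zero activation is enough to trigger the problem
x = torch.tensor([0.0, 1.0])

grads = {}
for b2 in (0.0, 1e-8):
    b = torch.tensor([1.0, 1.0, b2], requires_grad=True)
    adaptive_squareplus(x, b).sum().backward()
    grads[b2] = b.grad.clone()
    print(f"b[2]={b2}: grad={b.grad}")
```

With b[2]=0 the gradient for b[2] is NaN, and one optimizer step would spread it into the parameter; with 1e-8 all gradients stay finite, which is consistent with the workaround described above.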

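This beats all the accuracies above, though it is hardly spectacular. What is more noticeable is that inference became much slower, rising to about 8 ms.

At prediction time I printed out all the self.b values: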

[0.8454, 0.3644, 0.1367]
[ 0.8873,  0.7530, -0.1533]
[0.7036, 0.6800, 0.2724]
[ 0.4455,  1.2127, -0.0125]
[0.9210, 0.5843, 0.1486]
[0.9819, 0.7223, 0.4142]
[ 1.0522,  0.6184, -0.2266]
[ 0.8519,  0.4931, -0.3491]
[0.8119, 0.8772, 0.6076]
[ 0.8358,  0.4341, -0.3027]
[ 0.7915,  0.9259, -0.6816]
[ 0.6944,  0.8840, -0.6110]
[0.9455, 0.5593, 0.4089]
[0.8080, 0.8872, 0.4731]
[0.7749, 0.7353, 0.7412]
[ 1.0064,  0.6360, -0.3702]
[1.3030, 0.3886, 0.2978]
[1.2454, 0.5487, 0.1881]
[0.6192, 1.0405, 0.6971]
[0.7321, 0.9983, 0.5140]
[0.6235, 1.0673, 0.8745]
[1.1215, 0.7250, 0.2422]
[ 1.2247,  0.7770, -0.0237]
[0.5734, 1.0426, 0.8500]
[0.8035, 1.0243, 0.4659]
[ 0.5823,  1.1394, -1.0904]
[ 1.8497,  0.8324, -0.1130]
[ 1.9515,  1.0291, -0.0616]
[ 0.5608,  1.0035, -0.5520]
[ 0.7945,  0.8310, -0.3079]
[0.7418, 0.9905, 0.5818]
[ 0.9830,  0.4994, -0.2175]
[1.0192, 0.6735, 0.4724]
[ 0.8990,  0.6039, -0.0406]
[ 0.5347,  1.1900, -0.3618]
[0.8685, 0.9823, 0.1829]
[0.7512, 0.9619, 0.5523]
[ 1.1577,  0.7502, -0.2851]
[0.9508, 0.7332, 0.0213]
[0.6115, 1.0775, 0.5334]
[0.8370, 1.0327, 0.4429]
[0.8130, 1.0963, 0.8485]
[ 1.8354,  0.8548, -0.1749]
[ 1.4897,  0.7104, -0.0795]
[1.1626, 1.1678, 0.0058]
[ 0.9293,  0.8205, -0.1558]
[1.2170, 1.4804, 0.0718]
[ 1.7995,  0.9660, -0.0163]
[ 1.3056,  1.1443, -0.0841]
[1.6489, 1.1632, 0.1204]

I find this rather interesting.
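One way to read these learned values, offered as an interpretation rather than anything the training itself reports: for large |x| the adaptive function 0.5 * (b0*x + b1*sqrt(x^2 + b2^2)) behaves like a straight line with slope 0.5*(b0 + b1) on the positive side and 0.5*(b0 - b1) on the negative side, and its value at zero is 0.5*b1*|b2|. The sign of b2 does not matter because it is squared. The helpers below are names made up for this calculation, applied to three rows copied from the list above.

```python
def asymptotic_slopes(b):
    """Slopes of 0.5 * (b0*x + b1*sqrt(x^2 + b2^2)) as x -> +inf and x -> -inf."""
    b0, b1, _ = b
    return 0.5 * (b0 + b1), 0.5 * (b0 - b1)

def value_at_zero(b):
    """Function value at x = 0; depends on b2 only through its magnitude."""
    _, b1, b2 = b
    return 0.5 * b1 * abs(b2)

# A few rows copied from the printed self.b values
rows = [
    [0.8454, 0.3644, 0.1367],
    [0.4455, 1.2127, -0.0125],
    [1.9515, 1.0291, -0.0616],
]

for b in rows:
    pos, neg = asymptotic_slopes(b)
    print(b, f"slope(+inf)={pos:.4f}", f"slope(-inf)={neg:.4f}",
          f"f(0)={value_at_zero(b):.4f}")
```

Seen this way, a layer whose b0 is larger than b1 has learned something close to a smoothed leaky ReLU, while a layer whose b1 is larger than b0 has a negative slope on the left, so its output rises again for very negative inputs.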