Deep Learning Base Models: SEP-Nets

leo_whz

Category: base_model. Tags: deep learning

Article link: https://blog.csdn.net/whz1861/article/details/78546312

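The paper proposes SEP-Nets, a deep learning model compression method that binarizes kxk (k>1) convolutions while leaving 1x1 convolutions untouched, and introduces the Pattern Residual Block to reduce the parameter count. Experiments show that the number of parameters drops significantly while performance is maintained.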

Albeit there are already intensive studies on compressing the size of CNNs, the considerable drop of performance is still a key concern in many designs. This paper addresses this concern with several new contributions.

First, we propose a simple yet powerful method for compressing the size of deep CNNs based on parameter binarization. The striking difference from most previous work on parameter binarization/quantization lies at different treatments of 1x1 convolutions and kxk convolutions (k>1), where we only binarize kxk convolutions into binary patterns.

Second, in light of the different functionalities of 1x1 (data projection/transformation) and kxk convolutions (pattern extraction), we propose a new block structure codenamed the pattern residual block that adds transformed feature maps generated by 1x1 convolutions to the pattern feature maps generated by kxk convolutions, based on which we design a small network with ~1 million parameters.

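Idea

Deep learning research pursues two goals:

- Design new network architectures that keep improving representational power [focus: how well the model performs]
- At equal representational power, keep shrinking the model [focus: how many parameters the model has]

This paper targets the second goal: how to keep reducing the number of parameters while preserving model quality. On top of that, the authors also propose a new network structure.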

Existing model compression methods include:

quantization
binarization
sharing
pruning
hashing
Huffman coding

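The paper reduces the parameter count as follows:

binarization: the paper binarizes only the kxk (k>1) convolution kernels and leaves the 1x1 kernels untouched

- Compared with 3x3 and 5x5 convolutions, 1x1 convolutions play a more important role in the network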
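To make the saving concrete, here is a rough storage estimate for a single convolution layer. It is only a sketch: the function name `conv_storage_bits`, the choice of one float scale factor per output filter, and the 64-channel layer sizes are illustrative assumptions, not figures from the paper.

```python
def conv_storage_bits(in_ch, out_ch, k, binarize_kxk=True, float_bits=32):
    """Estimate weight storage (in bits) for one conv layer, ignoring biases.

    Assumption for illustration: a binarized kxk layer stores 1 bit per
    weight plus one float scale factor per output filter.
    """
    n_weights = in_ch * out_ch * k * k
    if k > 1 and binarize_kxk:
        return n_weights + out_ch * float_bits
    # 1x1 layers (and non-binarized layers) keep full-precision weights.
    return n_weights * float_bits


if __name__ == "__main__":
    full = conv_storage_bits(64, 64, 3, binarize_kxk=False)
    binary = conv_storage_bits(64, 64, 3)
    pointwise = conv_storage_bits(64, 64, 1)
    print(f"3x3 float:  {full} bits")
    print(f"3x3 binary: {binary} bits")
    print(f"1x1 float:  {pointwise} bits")
```

Under these assumptions the kxk layers, which hold most of the weights, shrink by roughly a factor of 30, while the 1x1 layers keep their full-precision weights.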

The authors also propose a new convolution module made up of several convolution layers and a residual connection
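As a rough illustration, the following single-channel sketch adds a 1x1-transformed feature map to a kxk pattern feature map, which is one reading of how the paper's abstract describes the pattern residual block. The helper names `conv2d_same` and `pattern_residual_block`, the zero padding, and the single-channel simplification are assumptions made here; the real block operates on multi-channel tensors and may differ in detail.

```python
def conv2d_same(x, kernel):
    """Single-channel 2D cross-correlation with zero padding (odd kernel size)."""
    k = len(kernel)
    pad = k // 2
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    ii, jj = i + di - pad, j + dj - pad
                    if 0 <= ii < h and 0 <= jj < w:
                        acc += kernel[di][dj] * x[ii][jj]
            out[i][j] = acc
    return out


def pattern_residual_block(x, kernel_kxk, weight_1x1):
    """Sum of a kxk 'pattern' branch and a 1x1 'transform' branch (hypothetical layout)."""
    pattern = conv2d_same(x, kernel_kxk)          # pattern extraction
    transformed = conv2d_same(x, [[weight_1x1]])  # 1x1 conv = per-pixel scaling here
    return [[p + t for p, t in zip(p_row, t_row)]
            for p_row, t_row in zip(pattern, transformed)]


if __name__ == "__main__":
    x = [[1.0, 2.0], [3.0, 4.0]]
    ones_3x3 = [[1.0] * 3 for _ in range(3)]
    print(pattern_residual_block(x, ones_3x3, 0.5))
```

With a single channel the 1x1 branch reduces to a scalar multiply; with many channels it would mix channels at each pixel, which is the projection/transformation role the abstract assigns to 1x1 convolutions.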

Model structure

Pattern Binarization
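- Notes:
  - step 1: train a convolutional neural network as usual, for example GoogLeNet
  - step 2: binarize the kxk (k>1) convolutions
  - step 3: attach a scale factor $\alpha$ to the binarized kxk convolutions, then fine-tune $\alpha$ together with the 1x1 convolutions [the 1x1 convolutions stay in floating point], that is, solve a minimization over $\alpha \in \mathbb{R}$ and the binary pattern $B$ (the objective is cut off in the source text)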
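Steps 2 and 3 can be sketched for a single kernel as below. Because the objective is cut off in the source, this sketch assumes the common least-squares form, minimizing $\|W - \alpha B\|^2$ with $B = \mathrm{sign}(W)$, for which the optimal scale is the mean of $|W|$. The paper fine-tunes $\alpha$ by training, so the value computed here should be read as a plausible starting point only. The names `binarize_kernel` and `reconstruct` are hypothetical.

```python
def binarize_kernel(kernel):
    """Binarize one k x k kernel (list of rows of floats).

    Returns (alpha, B) where B has entries in {+1, -1} and alpha is the
    mean absolute weight, the least-squares scale for B = sign(W).
    Assumption: zeros are mapped to +1.
    """
    flat = [w for row in kernel for w in row]
    alpha = sum(abs(w) for w in flat) / len(flat)
    B = [[1 if w >= 0 else -1 for w in row] for row in kernel]
    return alpha, B


def reconstruct(alpha, B):
    """Approximate the original kernel as alpha * B."""
    return [[alpha * b for b in row] for row in B]


if __name__ == "__main__":
    W = [[0.5, -0.25, 0.75],
         [-1.0, 0.0, 0.25],
         [0.5, -0.5, 0.75]]
    alpha, B = binarize_kernel(W)
    print("alpha =", alpha)
    print("B =", B)
    print("alpha * B =", reconstruct(alpha, B))
```

After this step only the sign pattern `B` and the scalar `alpha` need to be stored for each kxk kernel; the 1x1 weights are left as floats and fine-tuned alongside `alpha`.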
