The paper proposes a ternary quantization strategy for network weights: each weight is mapped to one of three values, (-W_n, 0, W_p). Unlike other methods, these quantized values are not fixed, and their absolute values need not be equal: W_p and W_n are themselves two trainable parameters of the model. Training therefore learns these two parameters along with the weights, so the authors also modify the backward-propagation function accordingly. They analyze the compression effect of the model, with the following results:
Table 1: Error rates (%) of full-precision and ternary ResNets on CIFAR-10

| Model | Full precision | Ternary (ours) | Improvement |
| --- | --- | --- | --- |
| ResNet-20 | 8.23 | 8.87 | -0.64 |
| ResNet-32 | 7.67 | 7.63 | 0.04 |
| ResNet-44 | 7.18 | 7.02 | 0.16 |
| ResNet-56 | 6.80 | 6.44 | 0.36 |
Table 2: Top-1 and Top-5 error rates of AlexNet on ImageNet

| Error | Full precision | 1-bit (DoReFa) | 2-bit (TWN) | 2-bit (ours) |
| --- | --- | --- | --- | --- |
| Top-1 | 42.8% | 46.1% | 45.5% | 42.5% |
| Top-5 | 19.7% | 23.7% | 23.2% | 20.3% |
Table 3: Top-1 and Top-5 error rates of ResNet-18 on ImageNet

| Error | Full precision | 1-bit (DoReFa) | 2-bit (TWN) | 2-bit (ours) |
| --- | --- | --- | --- | --- |
| Top-1 | 30.4% | 39.2% | 34.7% | 33.4% |
| Top-5 | 10.8% | 17.0% | 13.8% | 12.8% |
The key steps of the algorithm are as follows:
Quantization strategy:
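The note leaves the quantization formula out, so here is a minimal NumPy sketch of what the text describes: weights are ternarized by a threshold into {-W_n, 0, +W_p}, with W_p and W_n the two learned scales. The threshold rule delta = t * max|w| is an assumption, one common heuristic, not necessarily the paper's exact choice.

```python
import numpy as np

def ternarize(w, w_p, w_n, t=0.05):
    """Quantize full-precision weights w to the set {-w_n, 0, +w_p}.

    w_p and w_n are the two trainable positive scales described in the
    text. The threshold delta = t * max|w| is an assumption here.
    """
    delta = t * np.max(np.abs(w))
    q = np.zeros_like(w)
    q[w > delta] = w_p     # weights above the threshold -> +W_p
    q[w < -delta] = -w_n   # weights below -threshold   -> -W_n
    return q, delta        # everything in between stays 0
```

For example, `ternarize(np.array([-1.0, 0.01, 0.5]), w_p=0.8, w_n=0.6)` yields the quantized vector `[-0.6, 0.0, 0.8]` with threshold `0.05`.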
Backward-propagation function:
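The modified backward pass the text mentions has to produce gradients for the two scales as well as for the latent full-precision weights. A sketch in the straight-through style (the exact rule is an assumption based on the description): each scale accumulates the gradient over the region it controls, and the latent-weight gradient is rescaled by the scale of its region.

```python
import numpy as np

def ternary_grads(w, grad_q, w_p, w_n, delta):
    """Backward pass for ternarized weights (assumed rule, not verbatim
    from the paper).

    w       : latent full-precision weights
    grad_q  : dL/d(quantized weight)
    returns : (dL/dW_p, dL/dW_n, dL/d(latent w))
    """
    pos = w > delta
    neg = w < -delta
    g_wp = grad_q[pos].sum()    # +W_p region feeds the W_p gradient
    g_wn = -grad_q[neg].sum()   # quantized value is -W_n, hence the sign
    # Straight-through estimate for the latent weights, scaled per region
    scale = np.where(pos, w_p, np.where(neg, w_n, 1.0))
    g_w = scale * grad_q
    return g_wp, g_wn, g_w
```

With `w = [-1.0, 0.01, 0.5]`, `grad_q = [1, 1, 1]`, `w_p = 0.8`, `w_n = 0.6`, and `delta = 0.05`, this returns scale gradients `1.0` and `-1.0` and latent gradients `[0.6, 1.0, 0.8]`.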
The authors also give the detailed AlexNet model together with its per-layer density:
Table 4: AlexNet layer-wise sparsity

| Layer | Full precision density | Full precision width | Pruning (NIPS'15) density | Pruning (NIPS'15) width | Ours density | Ours width |
| --- | --- | --- | --- | --- | --- | --- |
| conv1 | 100% | 32 bit | 84% | 8 bit | 100% | 32 bit |
| conv2 | 100% | 32 bit | 38% | 8 bit | 23% | 2 bit |
| conv3 | 100% | 32 bit | 35% | 8 bit | 24% | 2 bit |
| conv4 | 100% | 32 bit | 37% | 8 bit | 40% | 2 bit |
| conv5 | 100% | 32 bit | 37% | 8 bit | 43% | 2 bit |
| conv total | 100% | - | 37% | - | 33% | - |
| fc1 | 100% | 32 bit | 9% | 5 bit | 30% | 2 bit |
| fc2 | 100% | 32 bit | 9% | 5 bit | 36% | 2 bit |
| fc3 | 100% | 32 bit | 25% | 5 bit | 100% | 32 bit |
| fc total | 100% | - | 10% | - | 37% | - |
| All total | 100% | - | 11% | - | 37% | - |
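The per-layer numbers in Table 4 translate directly into storage savings: a layer's relative size is roughly density × width / 32, ignoring the index overhead a sparse layer needs to record which weights survive. A small sketch of that arithmetic:

```python
def relative_size(density, width_bits, full_bits=32):
    """Fraction of full-precision storage for a layer kept at the given
    density and bit-width (sparse-index overhead ignored)."""
    return density * width_bits / full_bits

# conv2 from Table 4: 23% density at 2 bit vs. a 32-bit dense layer
print(relative_size(0.23, 2))  # ~0.014, i.e. about 1.4% of the original size
```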