Testing XNOR-Net-PyTorch
Two well-known binary network models for CNNs are Binary-Weight-Networks and XNOR-Net. In Binary-Weight-Networks, the convolution filters are approximated with binary values, saving roughly 32x in storage. In XNOR-Net, both the filters and the inputs to the convolutional layers are binary (+1 and -1), so convolution can be carried out mainly with binary operations. This makes the convolution roughly 58x faster and saves about 32x in memory, making it possible to run state-of-the-art networks in real time on a CPU rather than a GPU.
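To make the weight approximation concrete, here is a minimal NumPy sketch (an illustration, not code from this repository) of the binarization step described in the XNOR-Net paper: a real-valued filter W is approximated as alpha * B, where B = sign(W) and alpha is the mean absolute value of W.

```python
import numpy as np

def binarize_filter(w):
    """Approximate a real-valued filter w by alpha * sign(w).

    alpha is the per-filter scaling factor: the mean of the absolute
    weight values. Storing only the signs (1 bit each) plus a single
    float alpha is where the ~32x memory saving over 32-bit floats
    comes from.
    """
    alpha = np.abs(w).mean()           # scalar scaling factor
    b = np.where(w >= 0, 1.0, -1.0)    # binary tensor of +1/-1
    return alpha, b

# Example: a single 3x3 filter
w = np.array([[ 0.5, -0.2,  0.1],
              [-0.4,  0.3, -0.1],
              [ 0.2, -0.6,  0.4]])
alpha, b = binarize_filter(w)
w_approx = alpha * b                   # binary approximation of w
```

The approximation keeps the sign pattern of the original filter exactly and matches its average magnitude, which is why accuracy degrades far less than naive sign-only binarization.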
Reference
The code comes from GitHub; project link:
Experimental results
Dataset | Base network | Test accuracy | Model size |
---|---|---|---|
MNIST | Lenet-5 | 99.00% | 1.64M |
CIFAR-10 | NIN | 86.12% | 3.72M |
Experimental procedure
Training results of the LeNet-5-based XNOR-Net on MNIST
Namespace(arch='LeNet_5', batch_size=128, cuda=True, epochs=60, evaluate=False, log_interval=100, lr=0.01, lr_epochs=15, momentum=0.9, no_cuda=False, pretrained=None, seed=1, test_batch_size=128, weight_decay=1e-05)
LeNet_5(
(conv1): Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))
(bn_conv1): BatchNorm2d(20, eps=0.0001, momentum=0.1, affine=False, track_running_stats=True)
(relu_conv1): ReLU(inplace)
(pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(bin_conv2): BinConv2d(
(bn): BatchNorm2d(20, eps=0.0001, momentum=0.1, affine=True, track_running_stats=True)
(conv): Conv2d(20, 50, kernel_size=(5, 5), stride=(1, 1))
(relu): ReLU(inplace)
)
(pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(bin_ip1): BinConv2d(
(bn): BatchNorm2d(50, eps=0.0001, momentum=0.1, affine=True, track_running_stats=True)
(linear): Linear(in_features=800, out_features=500, bias=True)
(relu): ReLU(inplace)
)
(ip2): Linear(in_features=500, out_features=10, bias=True)
)
Learning rate: 0.01
Train Epoch: 1 [0/60000 (0%)] Loss: 2.301339
Train Epoch: 1 [12800/60000 (21%)] Loss: 0.206438
Train Epoch: 1 [25600/60000 (43%)] Loss: 0.139996
Train Epoch: 1 [38400/60000 (64%)] Loss: 0.053089
Train Epoch: 1 [51200/60000 (85%)] Loss: 0.363078
main.py:64: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
data, target = Variable(data, volatile=True), Variable(target)
==> Saving model ...
Test set: Average loss: 0.1444, Accuracy: 9593/10000 (95.00%)
Best Accuracy: 95.00%
Learning rate: 0.01
Train Epoch: 2 [0/60000 (0%)] Loss: 0.290992
Train Epoch: 2 [12800/60000 (21%)] Loss: 0.066965
Train Epoch: 2 [25600/60000 (43%)] Loss: 0.203039
Train Epoch: 2 [38400/60000 (64%)] Loss: 0.200744
Train Epoch: 2 [51200/60000 (85%)] Loss: 0.206140
==> Saving model ...
Test set: Average loss: 0.1496, Accuracy: 9637/10000 (96.00%)
Best Accuracy: 96.00%
Learning rate: 0.01
Train Epoch: 3 [0/60000 (0%)] Loss: 0.170542
Train Epoch: 3 [12800/60000 (21%)] Loss: 0.083674
Train Epoch: 3 [25600/60000 (43%)] Loss: 0.219202
Train Epoch: 3 [38400/60000 (64%)] Loss: 0.138911
Train Epoch: 3 [51200/60000 (85%)] Loss: 0.224724
...
Learning rate: 1.0000000000000002e-06
Train Epoch: 60 [0/60000 (0%)] Loss: 0.006104
Train Epoch: 60 [12800/60000 (21%)] Loss: 0.002640
Train Epoch: 60 [25600/60000 (43%)] Loss: 0.000195
Train Epoch: 60 [38400/60000 (64%)] Loss: 0.004205
Train Epoch: 60 [51200/60000 (85%)] Loss: 0.000068
Test set: Average loss: 0.0312, Accuracy: 9920/10000 (99.00%)
Best Accuracy: 99.00%
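The roughly 58x convolution speedup mentioned earlier comes from replacing floating-point multiply-accumulate with bitwise XNOR and popcount once both operands are binary. A small pure-Python sketch of that dot-product trick (a hypothetical illustration, not code from this repository):

```python
def binary_dot_xnor(a_bits, w_bits, n):
    """Dot product of two {+1,-1} vectors of length n, each packed
    into an integer: bit i set means element +1, clear means -1.

    XNOR leaves a 1 wherever the two bits agree; each agreement
    contributes +1 to the dot product and each disagreement -1,
    so dot = 2 * popcount(xnor) - n.
    """
    mask = (1 << n) - 1
    xnor = ~(a_bits ^ w_bits) & mask   # 1 where the signs match
    matches = bin(xnor).count("1")     # popcount
    return 2 * matches - n

# Check against the ordinary +/-1 dot product
a = [1, -1, 1, 1, -1]
w = [1, 1, -1, 1, -1]
pack = lambda v: sum((1 << i) for i, x in enumerate(v) if x == 1)
assert binary_dot_xnor(pack(a), pack(w), len(a)) == sum(x * y for x, y in zip(a, w))
```

On real hardware a single XNOR plus popcount instruction processes 64 elements at once, which is the source of the large speedup over one-by-one floating-point multiplies.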
Training results of the NIN-based XNOR-Net on CIFAR-10
==> Options: Namespace(arch='nin', cpu=False, data='./data/', evaluate=False, lr='0.01', pretrained=None)
==> building model nin ...
==> Initializing model parameters ...
DataParallel(
(module): Net(
(xnor): Sequential(
(0): Conv2d(3, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(1): BatchNorm2d(192, eps=0.0001, momentum=0.1, affine=False, track_running_stats=True)
(2): ReLU(inplace)
(3): BinConv2d(
(bn): BatchNorm2d(192, eps=0.0001, momentum=0.1, affine=True, track_running_stats=True)
(conv): Conv2d(192, 160, kernel_size=(1, 1), stride=(1, 1))
(relu): ReLU(inplace)
)
(4): BinConv2d(
(bn): BatchNorm2d(160, eps=0.0001, momentum=0.1, affine=True, track_running_stats=True)
(conv): Conv2d(160, 96, kernel_size=(1, 1), stride=(1, 1))
(relu): ReLU(inplace)
)
(5): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
(6): BinConv2d(
(bn): BatchNorm2d(96, eps=0.0001, momentum=0.1, affine=True, track_running_stats=True)
(dropout): Dropout(p=0.5)
(conv): Conv2d(96, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(relu): ReLU(inplace)
)
(7): BinConv2d(
(bn): BatchNorm2d(192, eps=0.0001, momentum=0.1, affine=True, track_running_stats=True)
(conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1))
(relu): ReLU(inplace)
)
(8): BinConv2d(
(bn): BatchNorm2d(192, eps=0.0001, momentum=0.1, affine=True, track_running_stats=True)
(conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1))
(relu): ReLU(inplace)
)
(9): AvgPool2d(kernel_size=3, stride=2, padding=1)
(10): BinConv2d(
(bn): BatchNorm2d(192, eps=0.0001, momentum=0.1, affine=True, track_running_stats=True)
(dropout): Dropout(p=0.5)
(conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(relu): ReLU(inplace)
)
(11): BinConv2d(
(bn): BatchNorm2d(192, eps=0.0001, momentum=0.1, affine=True, track_running_stats=True)
(conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1))
(relu): ReLU(inplace)
)
(12): BatchNorm2d(192, eps=0.0001, momentum=0.1, affine=False, track_running_stats=True)
(13): Conv2d(192, 10, kernel_size=(1, 1), stride=(1, 1))
(14): ReLU(inplace)
(15): AvgPool2d(kernel_size=8, stride=1, padding=0)
)
)
)
Train Epoch: 1 [0/50000 (0%)] Loss: 2.303518 LR: 0.01
Train Epoch: 1 [12800/50000 (26%)] Loss: 1.583676 LR: 0.01
Train Epoch: 1 [25600/50000 (51%)] Loss: 1.252695 LR: 0.01
Train Epoch: 1 [38400/50000 (77%)] Loss: 1.277936 LR: 0.01
==> Saving model ...
Test set: Average loss: 1.8486, Accuracy: 5068/10000 (50.68%)
Best Accuracy: 50.68%
Train Epoch: 2 [0/50000 (0%)] Loss: 1.305717 LR: 0.01
Train Epoch: 2 [12800/50000 (26%)] Loss: 1.235312 LR: 0.01
Train Epoch: 2 [25600/50000 (51%)] Loss: 1.342946 LR: 0.01
Train Epoch: 2 [38400/50000 (77%)] Loss: 1.303735 LR: 0.01
==> Saving model ...
Test set: Average loss: 1.5400, Accuracy: 5794/10000 (57.94%)
Best Accuracy: 57.94%
Train Epoch: 3 [0/50000 (0%)] Loss: 1.030597 LR: 0.01
Train Epoch: 3 [12800/50000 (26%)] Loss: 1.003013 LR: 0.01
Train Epoch: 3 [25600/50000 (51%)] Loss: 0.989754 LR: 0.01
Train Epoch: 3 [38400/50000 (77%)] Loss: 0.973633 LR: 0.01
==> Saving model ...
Test set: Average loss: 1.2972, Accuracy: 6417/10000 (64.17%)
Best Accuracy: 64.17%
Train Epoch: 4 [0/50000 (0%)] Loss: 0.908937 LR: 0.01
Train Epoch: 4 [12800/50000 (26%)] Loss: 0.852426 LR: 0.01
Train Epoch: 4 [25600/50000 (51%)] Loss: 0.964585 LR: 0.01
Train Epoch: 4 [38400/50000 (77%)] Loss: 0.857294 LR: 0.01
==> Saving model ...
Test set: Average loss: 1.2501, Accuracy: 6572/10000 (65.72%)
Best Accuracy: 65.72%
Train Epoch: 5 [0/50000 (0%)] Loss: 0.903293 LR: 0.01
Train Epoch: 5 [12800/50000 (26%)] Loss: 0.766061 LR: 0.01
Train Epoch: 5 [25600/50000 (51%)] Loss: 0.913492 LR: 0.01
Train Epoch: 5 [38400/50000 (77%)] Loss: 0.775575 LR: 0.01
Test set: Average loss: 1.5153, Accuracy: 5954/10000 (59.54%)
Best Accuracy: 65.72%
...
Train Epoch: 320 [0/50000 (0%)] Loss: 0.458021 LR: 0.01
Train Epoch: 320 [12800/50000 (26%)] Loss: 0.467194 LR: 0.01
Train Epoch: 320 [25600/50000 (51%)] Loss: 0.353698 LR: 0.01
Train Epoch: 320 [38400/50000 (77%)] Loss: 0.517535 LR: 0.01
Test set: Average loss: 0.8367, Accuracy: 8612/10000 (86.12%)
Best Accuracy: 86.12%