This part of the experiments records the process of adjusting the network, along with the result of each experiment. -- Jeremy
Model 1
Layer | Details |
---|---|
conv1 | output: 32, kernel: 5, stride: 1, pad: 2 |
pool1 | pool: MAX, kernel: 3, stride: 2 |
relu1 | Sigmoid |
norm1 | LRN |
ip1 | output: 200 |
ip2 | output: 10 |
Results:
- Iteration 60000, loss = 0.801631
- Iteration 60000, Testing net (#0)
Test net output #0: accuracy = 0.5287
Test net output #1: loss = 1.47174 (* 1 = 1.47174 loss)
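The results show that the current model performs poorly, with a recognition rate of only 52.87%. Comparing the training loss (0.80) with the test loss (1.47), the two differ considerably, which indicates overfitting.
In the next model, we add one FC layer to increase the number of learnable parameters in the network.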
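The train/test comparison above can be made explicit with a small script. This is a minimal sketch, not part of the original experiments: it assumes the log lines look exactly like the ones pasted above, and `parse_final_metrics` is a hypothetical helper name. Note that the training loss line reports the loss at that iteration rather than over the whole training set (my assumption about how the log was produced), so the gap is only a rough indicator.

```python
import re

# Log lines copied from the Model 1 results above.
LOG = """\
Iteration 60000, loss = 0.801631
Iteration 60000, Testing net (#0)
Test net output #0: accuracy = 0.5287
Test net output #1: loss = 1.47174 (* 1 = 1.47174 loss)
"""


def parse_final_metrics(log_text):
    """Extract training loss, test accuracy and test loss from the log text."""
    train = re.search(r"Iteration \d+, loss = ([\d.]+)", log_text)
    acc = re.search(r"accuracy\s*=\s*([\d.]+)", log_text)
    test = re.search(r"Test net output #\d+: loss\s*=\s*([\d.]+)", log_text)
    return {
        "train_loss": float(train.group(1)),
        "test_accuracy": float(acc.group(1)),
        "test_loss": float(test.group(1)),
    }


metrics = parse_final_metrics(LOG)
# A large positive gap (test loss well above training loss) suggests overfitting.
gap = metrics["test_loss"] - metrics["train_loss"]
print(f"accuracy: {metrics['test_accuracy']:.4f}, test - train loss gap: {gap:.4f}")
```

The same function can be applied to the final log lines of each later model to compare gaps.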
Model 2: (add one FC layer)
Layer | Details |
---|---|
conv1 | output: 32, kernel: 5, stride: 1, pad: 2 |
pool1 | pool: MAX, kernel: 3, stride: 2 |
relu1 | Sigmoid |
norm1 | LRN |
ip1 | output: 200 |
ip2 | output: 100 |
ip3 | output: 10 |
Results:
- Iteration 60000, loss = 0.609769
- Iteration 60000, Testing net (#0)
Test net output #0: accuracy = 0.5467
Test net output #1: loss = 1.40912 (* 1 = 1.40912 loss)
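Compared with Model 1, recognition improves slightly, from 52.87% to 54.67%, and both the training loss and the test loss decrease. So in the next step we increase the number of outputs of the two FC layers (ip1 and ip2) to see whether recognition improves further.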
Model 3: (increase the number of outputs of the two FC layers)
Layer | Details |
---|---|
conv1 | output: 32, kernel: 5, stride: 1, pad: 2 |
pool1 | pool: MAX, kernel: 3, stride: 2 |
relu1 | Sigmoid |
norm1 | LRN |
ip1 | output: 400 |
ip2 | output: 200 |
ip3 | output: 10 |
Results:
- Iteration 60000, loss = 0.603354
- Iteration 60000, Testing net (#0)
Test net output #0: accuracy = 0.5423
Test net output #1: loss = 1.45162
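Model 3's results show that recognition did not improve (54.23% versus 54.67% for Model 2), so simply increasing the number of FC outputs is not a good approach.
In the next step I therefore look at how convolutional layers affect the result, by adding one convolutional layer on top of Model 3.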
Model 4: (add one convolutional layer)
Layer | Details |
---|---|
conv1 | output: 32, kernel: 5, stride: 1, pad: 2 |
pool1 | pool: MAX, kernel: 3, stride: 2 |
relu1 | Sigmoid |
norm1 | LRN |
conv2 | output: 16, kernel: 5, stride: 1, pad: 2 |
ip1 | output: 400 |
ip2 | output: 200 |
ip3 | output: 10 |
Results:
- Iteration 60000, loss = 0.690654
- Iteration 60000, Testing net (#0)
Test net output #0: accuracy = 0.5815
Test net output #1: loss = 1.24061
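Model 4's results show that, after adding a convolutional layer, the training loss rises by 0.0873 relative to Model 3 while the test loss falls by 0.21101, and accuracy rises from 54.23% to 58.15%. Overall this is a good sign: overfitting is reduced to some extent.
In the next model, we add a max pooling layer after the second convolutional layer and reduce the number of FC outputs.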
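To see how these structural changes affect the size of the first FC layer, the feature map sizes can be worked out by hand. The sketch below is my own illustration, not the author's code. It assumes a 32x32 RGB input (the post does not state the input size), floor rounding for convolution, and ceil rounding for pooling; the Sigmoid and LRN layers are skipped because they do not change the shape. The names `conv_out`, `pool_out` and `flattened_size` are hypothetical.

```python
import math


def conv_out(size, kernel, stride, pad):
    """Spatial output size of a convolution (floor rounding)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride):
    """Spatial output size of pooling, assuming ceil rounding and no padding."""
    return math.ceil((size - kernel) / stride) + 1


def flattened_size(layers, in_size=32, in_channels=3):
    """Number of values fed into ip1. in_size=32 and in_channels=3 are assumptions."""
    size, channels = in_size, in_channels
    for kind, out_channels in layers:
        if kind == "conv":
            # All conv layers in the tables use kernel 5, stride 1, pad 2.
            size = conv_out(size, kernel=5, stride=1, pad=2)
            channels = out_channels
        elif kind == "pool":
            # All pooling layers in the tables use kernel 3, stride 2.
            size = pool_out(size, kernel=3, stride=2)
    return channels * size * size


model3 = [("conv", 32), ("pool", None)]
model4 = [("conv", 32), ("pool", None), ("conv", 16)]
model5 = [("conv", 32), ("pool", None), ("conv", 16), ("pool", None)]

for name, layers, ip1_outputs in [
    ("Model 3", model3, 400),
    ("Model 4", model4, 400),
    ("Model 5", model5, 200),
]:
    n = flattened_size(layers)
    print(f"{name}: ip1 inputs = {n}, ip1 weights = {n * ip1_outputs}")
```

Under these assumptions, adding conv2 with 16 output channels halves the number of ip1 inputs, and adding pool2 divides it by four again, so most of the change in parameter count happens in ip1 rather than in the convolutional layers.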
Model 5: (add one max pooling layer)
Layer | Details |
---|---|
conv1 | output: 32, kernel: 5, stride: 1, pad: 2 |
pool1 | pool: MAX, kernel: 3, stride: 2 |
relu1 | Sigmoid |
norm1 | LRN |
conv2 | output: 16, kernel: 5, stride: 1, pad: 2 |
pool2 | pool: MAX, kernel: 3, stride: 2 |
ip1 | output: 200 |
ip2 | output: 100 |
ip3 | output: 10 |
Results:
- Iteration 60000, loss = 0.73147
- Iteration 60000, Testing net (#0)
Test net output #0: accuracy = 0.6335
Test net output #1: loss = 1.06225
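After adding the pooling layer, the gap between the training loss and the test loss narrows further (from about 0.55 in Model 4 to about 0.33), so overfitting is reduced. Recognition is still poor, however, so in the next step I try adding a Sigmoid layer after the new pooling layer.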
Model 6: (add one Sigmoid layer)
Layer | Details |
---|---|
conv1 | output: 32, kernel: 5, stride: 1, pad: 2 |
pool1 | pool: MAX, kernel: 3, stride: 2 |
relu1 | Sigmoid |
norm1 | LRN |
conv2 | output: 16, kernel: 5, stride: 1, pad: 2 |
pool2 | pool: MAX, kernel: 3, stride: 2 |
relu1 | Sigmoid |
ip1 | output: 200 |
ip2 | output: 100 |
ip3 | output: 10 |
Results:
- Iteration 60000, loss = 2.29934
- Iteration 60000, Testing net (#0)
Test net output #0: accuracy = 0.1
Test net output #1: loss = 2.30292
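Clearly, adding the Sigmoid layer does not work for this model: accuracy drops to 0.1 and both losses stay near 2.30, so the network has not learned anything useful.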
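The Model 6 numbers match what a network that guesses uniformly would produce. The short check below is a sketch under two assumptions that the post does not state: the loss is softmax cross-entropy, and the 10 classes (ip3 has 10 outputs) are balanced in the test set.

```python
import math

num_classes = 10  # taken from the 10 outputs of ip3

# Accuracy of a uniform guess over balanced classes.
chance_accuracy = 1 / num_classes

# Cross-entropy of a uniform prediction: -ln(1 / num_classes) = ln(num_classes).
chance_loss = -math.log(1 / num_classes)

print(f"chance accuracy = {chance_accuracy:.2f}, chance loss = {chance_loss:.5f}")
```

Both the reported accuracy (0.1) and the reported losses (2.29934 and 2.30292) sit at these chance-level values, which is consistent with training having stalled rather than overfit.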
Model 7: (switch to a ReLU layer)
Layer | Details |
---|---|
conv1 | output: 32, kernel: 5, stride: 1, pad: 2 |
pool1 | pool: MAX, kernel: 3, stride: 2 |
relu1 | Sigmoid |
norm1 | LRN |
conv2 | output: 16, kernel: 5, stride: 1, pad: 2 |
pool2 | pool: MAX, kernel: 3, stride: 2 |
relu1 | ReLU |
ip1 | output: 200 |
ip2 | output: 100 |
ip3 | output: 10 |
Results:
- Iteration 60000, loss = 0.620338
- Iteration 60000, Testing net (#0)
Test net output #0: accuracy = 0.6391
Test net output #1: loss = 1.05354
Model 8: (switch all activation layers to ReLU)
Layer | Details |
---|---|
conv1 | output: 32, kernel: 5, stride: 1, pad: 2 |
pool1 | pool: MAX, kernel: 3, stride: 2 |
relu1 | ReLU |
norm1 | LRN |
conv2 | output: 16, kernel: 5, stride: 1, pad: 2 |
pool2 | pool: MAX, kernel: 3, stride: 2 |
relu1 | ReLU |
ip1 | output: 200 |
ip2 | output: 100 |
ip3 | output: 10 |
Results:
- Iteration 60000, loss = 0.416507
- Iteration 60000, Testing net (#0)
Test net output #0: accuracy = 0.6794
Test net output #1: loss = 1.15119
With a ReLU layer after both convolutional layers, recognition improves considerably, reaching 67.94%.
Model 9: (add one Dropout layer)
Layer | Details |
---|---|
conv1 | output: 32, kernel: 5, stride: 1, pad: 2 |
pool1 | pool: MAX, kernel: 3, stride: 2 |
relu1 | ReLU |
norm1 | LRN |
conv2 | output: 16, kernel: 5, stride: 1, pad: 2 |
pool2 | pool: MAX, kernel: 3, stride: 2 |
relu1 | ReLU |
ip1 | output: 200 |
dropout | dropout, 0.5 |
ip2 | output: 100 |
ip3 | output: 10 |
Results:
- Iteration 60000, loss = 0.563472
- Iteration 60000, Testing net (#0)
Test net output #0: accuracy = 0.6728
Test net output #1: loss = 1.03333
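The results show that adding the Dropout layer did not improve recognition (67.28% versus 67.94% for Model 8), but it did reduce overfitting: the gap between test loss and training loss narrows from about 0.73 to about 0.47. So the next step is to increase the number of FC outputs and see what happens.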
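For readers unfamiliar with what a dropout ratio of 0.5 does, here is a minimal pure-Python sketch of the common "inverted dropout" formulation. It is an illustration only, not the layer implementation used in these experiments, and the `dropout` function and its parameters are hypothetical names.

```python
import random


def dropout(values, ratio=0.5, train=True, rng=random):
    """Inverted dropout: zero each value with probability `ratio` during training.

    Surviving values are scaled by 1 / (1 - ratio) so the expected output
    matches the input. At test time the values pass through unchanged.
    """
    if not train:
        return list(values)
    scale = 1.0 / (1.0 - ratio)
    return [v * scale if rng.random() >= ratio else 0.0 for v in values]


rng = random.Random(0)  # fixed seed so the run is repeatable
activations = [1.0] * 1000
dropped = dropout(activations, ratio=0.5, rng=rng)

print("fraction kept:", sum(1 for v in dropped if v != 0.0) / len(dropped))
print("mean after dropout:", sum(dropped) / len(dropped))
print("test-time output unchanged:", dropout([1.0, 2.0], train=False))
```

Because roughly half of the ip1 outputs are zeroed on every training step, the training loss is measured on a handicapped network, which is one reason the training loss goes up while the test loss goes down.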
Model 10: (increase the number of FC outputs)
Layer | Details |
---|---|
conv1 | output: 32, kernel: 5, stride: 1, pad: 2 |
pool1 | pool: MAX, kernel: 3, stride: 2 |
relu1 | ReLU |
norm1 | LRN |
conv2 | output: 16, kernel: 5, stride: 1, pad: 2 |
pool2 | pool: MAX, kernel: 3, stride: 2 |
relu1 | ReLU |
ip1 | output: 400 |
dropout | dropout, 0.5 |
ip2 | output: 150 |
ip3 | output: 10 |
Results:
- Iteration 60000, loss = 0.446714
- Iteration 60000, Testing net (#0)
Test net output #0: accuracy = 0.6903
Test net output #1: loss = 0.990431
Model 11: (add another Dropout layer)
Layer | Details |
---|---|
conv1 | output: 32, kernel: 5, stride: 1, pad: 2 |
pool1 | pool: MAX, kernel: 3, stride: 2 |
relu1 | ReLU |
norm1 | LRN |
conv2 | output: 16, kernel: 5, stride: 1, pad: 2 |
pool2 | pool: MAX, kernel: 3, stride: 2 |
relu1 | ReLU |
ip1 | output: 400 |
dropout | dropout, 0.5 |
ip2 | output: 200 |
dropout | dropout, 0.5 |
ip3 | output: 10 |
Results:
- Iteration 60000, loss = 0.586936
- Iteration 60000, Testing net (#0)
Test net output #0: accuracy = 0.7013
Test net output #1: loss = 0.92605
Model 12: (adjust the number of convolutional layer outputs)
Layer | Details |
---|---|
conv1 | output: 48, kernel: 5, stride: 1, pad: 2 |
pool1 | pool: MAX, kernel: 3, stride: 2 |
relu1 | ReLU |
norm1 | LRN |
conv2 | output: 32, kernel: 5, stride: 1, pad: 2 |
pool2 | pool: MAX, kernel: 3, stride: 2 |
relu1 | ReLU |
ip1 | output: 400 |
dropout | dropout, 0.5 |
ip2 | output: 200 |
dropout | dropout, 0.5 |
ip3 | output: 10 |
Results:
- Iteration 60000, loss = 0.273988
- Iteration 60000, Testing net (#0)
Test net output #0: accuracy = 0.7088
Test net output #1: loss = 1.1117
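To compare all twelve runs side by side, the final numbers can be collected into one list. The values below are copied from the logs above; the script itself is only a convenience sketch, and treating the test-minus-train loss gap as an overfitting indicator is a rough heuristic.

```python
# (model, train_loss, test_accuracy, test_loss), copied from the logs above.
RESULTS = [
    (1, 0.801631, 0.5287, 1.47174),
    (2, 0.609769, 0.5467, 1.40912),
    (3, 0.603354, 0.5423, 1.45162),
    (4, 0.690654, 0.5815, 1.24061),
    (5, 0.73147, 0.6335, 1.06225),
    (6, 2.29934, 0.1, 2.30292),
    (7, 0.620338, 0.6391, 1.05354),
    (8, 0.416507, 0.6794, 1.15119),
    (9, 0.563472, 0.6728, 1.03333),
    (10, 0.446714, 0.6903, 0.990431),
    (11, 0.586936, 0.7013, 0.92605),
    (12, 0.273988, 0.7088, 1.1117),
]

for model, train_loss, accuracy, test_loss in RESULTS:
    gap = test_loss - train_loss
    print(f"Model {model:2d}: accuracy = {accuracy:.4f}, loss gap = {gap:.4f}")

best_accuracy = max(RESULTS, key=lambda r: r[2])
lowest_test_loss = min(RESULTS, key=lambda r: r[3])
print("highest accuracy: Model", best_accuracy[0])
print("lowest test loss: Model", lowest_test_loss[0])
```

Model 12 has the highest accuracy, but Model 11 has the lowest test loss and a much smaller loss gap, so widening the convolutional layers appears to bring back some overfitting.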
Original article: http://blog.csdn.net/linj_m/article/details/49428601