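The automatic hyperparameter-tuning run finished today and produced the full set of results. This note records how I analyzed the data and what I concluded.
The raw data: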
lr = 0.005 decay = 0.99 momentum = 0 is_reg = 1 stop@Epoch 18, train_accuracy = 0.969 val_acc = 0.835 test_acc = 0.822
lr = 0.005 decay = 0.99 momentum = 0 is_reg = 0 stop@Epoch 28, train_accuracy = 0.981 val_acc = 0.826 test_acc = 0.825
lr = 0.005 decay = 0.99 momentum = 0.1 is_reg = 1 stop@Epoch 17, train_accuracy = 0.965 val_acc = 0.814 test_acc = 0.81
lr = 0.005 decay = 0.99 momentum = 0.1 is_reg = 0 stop@Epoch 17, train_accuracy = 0.965 val_acc = 0.829 test_acc = 0.821
lr = 0.005 decay = 0.99 momentum = 0.2 is_reg = 1 stop@Epoch 12, train_accuracy = 0.945 val_acc = 0.813 test_acc = 0.809
lr = 0.005 decay = 0.99 momentum = 0.2 is_reg = 0 stop@Epoch 15, train_accuracy = 0.96 val_acc = 0.818 test_acc = 0.812
lr = 0.005 decay = 0.9 momentum = 0 is_reg = 1 stop@Epoch 11, train_accuracy = 0.944 val_acc = 0.805 test_acc = 0.791
lr = 0.005 decay = 0.9 momentum = 0 is_reg = 0 stop@Epoch 15, train_accuracy = 0.962 val_acc = 0.817 test_acc = 0.79
lr = 0.005 decay = 0.9 momentum = 0.1 is_reg = 1 stop@Epoch 13, train_accuracy = 0.954 val_acc = 0.807 test_acc = 0.805
lr = 0.005 decay = 0.9 momentum = 0.1 is_reg = 0 stop@Epoch 13, train_accuracy = 0.951 val_acc = 0.819 test_acc = 0.808
lr = 0.005 decay = 0.9 momentum = 0.2 is_reg = 1 stop@Epoch 18, train_accuracy = 0.967 val_acc = 0.796 test_acc = 0.819
lr = 0.005 decay = 0.9 momentum = 0.2 is_reg = 0 stop@Epoch 20, train_accuracy = 0.97 val_acc = 0.82 test_acc = 0.817
lr = 0.005 decay = 0.8 momentum = 0 is_reg = 1 stop@Epoch 13, train_accuracy = 0.949 val_acc = 0.82 test_acc = 0.811
lr = 0.005 decay = 0.8 momentum = 0 is_reg = 0 stop@Epoch 15, train_accuracy = 0.957 val_acc = 0.805 test_acc = 0.807
lr = 0.005 decay = 0.8 momentum = 0.1 is_reg = 1 stop@Epoch 12, train_accuracy = 0.944 val_acc = 0.811 test_acc = 0.795
lr = 0.005 decay = 0.8 momentum = 0.1 is_reg = 0 stop@Epoch 16, train_accuracy = 0.962 val_acc = 0.825 test_acc = 0.815
lr = 0.005 decay = 0.8 momentum = 0.2 is_reg = 1 stop@Epoch 15, train_accuracy = 0.954 val_acc = 0.827 test_acc = 0.815
lr = 0.005 decay = 0.8 momentum = 0.2 is_reg = 0 stop@Epoch 12, train_accuracy = 0.941 val_acc = 0.798 test_acc = 0.803
lr = 0.001 decay = 0.99 momentum = 0 is_reg = 1 stop@Epoch 19, train_accuracy = 0.981 val_acc = 0.826 test_acc = 0.833
lr = 0.001 decay = 0.99 momentum = 0 is_reg = 0 stop@Epoch 20, train_accuracy = 0.981 val_acc = 0.841 test_acc = 0.834
lr = 0.001 decay = 0.99 momentum = 0.1 is_reg = 1 stop@Epoch 19, train_accuracy = 0.98 val_acc = 0.838 test_acc = 0.828
lr = 0.001 decay = 0.99 momentum = 0.1 is_reg = 0 stop@Epoch 12, train_accuracy = 0.964 val_acc = 0.839 test_acc = 0.825
lr = 0.001 decay = 0.99 momentum = 0.2 is_reg = 1 stop@Epoch 16, train_accuracy = 0.976 val_acc = 0.834 test_acc = 0.819
lr = 0.001 decay = 0.99 momentum = 0.2 is_reg = 0 stop@Epoch 27, train_accuracy = 0.985 val_acc = 0.831 test_acc = 0.831
lr = 0.001 decay = 0.9 momentum = 0 is_reg = 1 stop@Epoch 22, train_accuracy = 0.983 val_acc = 0.822 test_acc = 0.827
lr = 0.001 decay = 0.9 momentum = 0 is_reg = 0 stop@Epoch 13, train_accuracy = 0.971 val_acc = 0.833 test_acc = 0.823
lr = 0.001 decay = 0.9 momentum = 0.1 is_reg = 1 stop@Epoch 20, train_accuracy = 0.981 val_acc = 0.833 test_acc = 0.821
lr = 0.001 decay = 0.9 momentum = 0.1 is_reg = 0 stop@Epoch 19, train_accuracy = 0.982 val_acc = 0.814 test_acc = 0.815
lr = 0.001 decay = 0.9 momentum = 0.2 is_reg = 1 stop@Epoch 19, train_accuracy = 0.982 val_acc = 0.846 test_acc = 0.827
lr = 0.001 decay = 0.9 momentum = 0.2 is_reg = 0 stop@Epoch 10, train_accuracy = 0.956 val_acc = 0.827 test_acc = 0.822
lr = 0.001 decay = 0.8 momentum = 0 is_reg = 1 stop@Epoch 14, train_accuracy = 0.971 val_acc = 0.825 test_acc = 0.817
lr = 0.001 decay = 0.8 momentum = 0 is_reg = 0 stop@Epoch 14, train_accuracy = 0.972 val_acc = 0.826 test_acc = 0.809
lr = 0.001 decay = 0.8 momentum = 0.1 is_reg = 1 stop@Epoch 17, train_accuracy = 0.977 val_acc = 0.839 test_acc = 0.826
lr = 0.001 decay = 0.8 momentum = 0.1 is_reg = 0 stop@Epoch 18, train_accuracy = 0.977 val_acc = 0.828 test_acc = 0.814
lr = 0.001 decay = 0.8 momentum = 0.2 is_reg = 1 stop@Epoch 11, train_accuracy = 0.961 val_acc = 0.822 test_acc = 0.815
lr = 0.001 decay = 0.8 momentum = 0.2 is_reg = 0 stop@Epoch 17, train_accuracy = 0.977 val_acc = 0.813 test_acc = 0.81
lr = 0.0005 decay = 0.99 momentum = 0 is_reg = 1 stop@Epoch 22, train_accuracy = 0.984 val_acc = 0.834 test_acc = 0.833
lr = 0.0005 decay = 0.99 momentum = 0 is_reg = 0 stop@Epoch 12, train_accuracy = 0.966 val_acc = 0.835 test_acc = 0.832
lr = 0.0005 decay = 0.99 momentum = 0.1 is_reg = 1 stop@Epoch 14, train_accuracy = 0.972 val_acc = 0.809 test_acc = 0.824
lr = 0.0005 decay = 0.99 momentum = 0.1 is_reg = 0 stop@Epoch 14, train_accuracy = 0.974 val_acc = 0.816 test_acc = 0.827
lr = 0.0005 decay = 0.99 momentum = 0.2 is_reg = 1 stop@Epoch 10, train_accuracy = 0.954 val_acc = 0.824 test_acc = 0.813
lr = 0.0005 decay = 0.99 momentum = 0.2 is_reg = 0 stop@Epoch 13, train_accuracy = 0.967 val_acc = 0.838 test_acc = 0.832
lr = 0.0005 decay = 0.9 momentum = 0 is_reg = 1 stop@Epoch 26, train_accuracy = 0.987 val_acc = 0.837 test_acc = 0.831
lr = 0.0005 decay = 0.9 momentum = 0 is_reg = 0 stop@Epoch 18, train_accuracy = 0.981 val_acc = 0.831 test_acc = 0.828
lr = 0.0005 decay = 0.9 momentum = 0.1 is_reg = 1 stop@Epoch 17, train_accuracy = 0.98 val_acc = 0.846 test_acc = 0.83
lr = 0.0005 decay = 0.9 momentum = 0.1 is_reg = 0 stop@Epoch 12, train_accuracy = 0.968 val_acc = 0.832 test_acc = 0.825
lr = 0.0005 decay = 0.9 momentum = 0.2 is_reg = 1 stop@Epoch 13, train_accuracy = 0.972 val_acc = 0.842 test_acc = 0.832
lr = 0.0005 decay = 0.9 momentum = 0.2 is_reg = 0 stop@Epoch 13, train_accuracy = 0.971 val_acc = 0.85 test_acc = 0.832
lr = 0.0005 decay = 0.8 momentum = 0 is_reg = 1 stop@Epoch 17, train_accuracy = 0.978 val_acc = 0.824 test_acc = 0.818
lr = 0.0005 decay = 0.8 momentum = 0 is_reg = 0 stop@Epoch 19, train_accuracy = 0.981 val_acc = 0.836 test_acc = 0.819
lr = 0.0005 decay = 0.8 momentum = 0.1 is_reg = 1 stop@Epoch 19, train_accuracy = 0.981 val_acc = 0.844 test_acc = 0.825
lr = 0.0005 decay = 0.8 momentum = 0.1 is_reg = 0 stop@Epoch 18, train_accuracy = 0.98 val_acc = 0.836 test_acc = 0.822
lr = 0.0005 decay = 0.8 momentum = 0.2 is_reg = 1 stop@Epoch 15, train_accuracy = 0.975 val_acc = 0.822 test_acc = 0.816
lr = 0.0005 decay = 0.8 momentum = 0.2 is_reg = 0 stop@Epoch 16, train_accuracy = 0.977 val_acc = 0.826 test_acc = 0.833
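Using the Data feature of Excel 2016, I imported the data above into a table, then sorted it by stop@Epoch ascending or by test_acc descending to find hyperparameter settings with a low stopping epoch and a high test_acc.
In the table below, row 2 and rows 5, 6, and 7 are all well-performing hyperparameter settings: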
row | lr | decay | momentum | is_reg | stop@Epoch | train_accuracy | val_acc | test_acc |
1 | 0.001 | 0.9 | 0.2 | 0 | 10 | 0.956 | 0.827 | 0.822 |
2 | 0.0005 | 0.99 | 0 | 0 | 12 | 0.966 | 0.835 | 0.832 |
3 | 0.001 | 0.99 | 0.1 | 0 | 12 | 0.964 | 0.839 | 0.825 |
4 | 0.0005 | 0.9 | 0.1 | 0 | 12 | 0.968 | 0.832 | 0.825 |
5 | 0.0005 | 0.99 | 0.2 | 0 | 13 | 0.967 | 0.838 | 0.832 |
6 | 0.0005 | 0.9 | 0.2 | 1 | 13 | 0.972 | 0.842 | 0.832 |
7 | 0.0005 | 0.9 | 0.2 | 0 | 13 | 0.971 | 0.85 | 0.832 |
8 | 0.001 | 0.9 | 0 | 0 | 13 | 0.971 | 0.833 | 0.823 |
9 | 0.0005 | 0.99 | 0.1 | 0 | 14 | 0.974 | 0.816 | 0.827 |
10 | 0.0005 | 0.99 | 0.1 | 1 | 14 | 0.972 | 0.809 | 0.824 |
11 | 0.0005 | 0.8 | 0.2 | 0 | 16 | 0.977 | 0.826 | 0.833 |
12 | 0.0005 | 0.9 | 0.1 | 1 | 17 | 0.98 | 0.846 | 0.83 |
13 | 0.001 | 0.8 | 0.1 | 1 | 17 | 0.977 | 0.839 | 0.826 |
14 | 0.005 | 0.99 | 0.1 | 0 | 17 | 0.965 | 0.829 | 0.821 |
15 | 0.0005 | 0.9 | 0 | 0 | 18 | 0.981 | 0.831 | 0.828 |
16 | 0.005 | 0.99 | 0 | 1 | 18 | 0.969 | 0.835 | 0.822 |
17 | 0.0005 | 0.8 | 0.1 | 0 | 18 | 0.98 | 0.836 | 0.822 |
18 | 0.001 | 0.99 | 0 | 1 | 19 | 0.981 | 0.826 | 0.833 |
19 | 0.001 | 0.99 | 0.1 | 1 | 19 | 0.98 | 0.838 | 0.828 |
20 | 0.001 | 0.9 | 0.2 | 1 | 19 | 0.982 | 0.846 | 0.827 |
21 | 0.0005 | 0.8 | 0.1 | 1 | 19 | 0.981 | 0.844 | 0.825 |
22 | 0.001 | 0.99 | 0 | 0 | 20 | 0.981 | 0.841 | 0.834 |
23 | 0.001 | 0.9 | 0.1 | 1 | 20 | 0.981 | 0.833 | 0.821 |
24 | 0.0005 | 0.99 | 0 | 1 | 22 | 0.984 | 0.834 | 0.833 |
25 | 0.001 | 0.9 | 0 | 1 | 22 | 0.983 | 0.822 | 0.827 |
26 | 0.0005 | 0.9 | 0 | 1 | 26 | 0.987 | 0.837 | 0.831 |
27 | 0.001 | 0.99 | 0.2 | 0 | 27 | 0.985 | 0.831 | 0.831 |
28 | 0.005 | 0.99 | 0 | 0 | 28 | 0.981 | 0.826 | 0.825 |
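The import-and-sort step could also be done without Excel. Below is a minimal Python sketch, not the procedure actually used: it parses log lines in the format shown in the raw data, keeps runs above a test_acc cutoff, and sorts by stopping epoch (ascending) and then test_acc (descending). The names `SAMPLE_LOG`, `parse_log`, `shortlist` and `MIN_TEST_ACC` are made up for illustration, and the cutoff of 0.821 is only inferred from the table above, which appears to contain exactly the runs at or above that value.

```python
import re

# Four lines copied from the raw log; in practice the whole log would be read from a file.
SAMPLE_LOG = """\
lr = 0.001 decay = 0.9 momentum = 0.2 is_reg = 0 stop@Epoch 10, train_accuracy = 0.956 val_acc = 0.827 test_acc = 0.822
lr = 0.0005 decay = 0.99 momentum = 0 is_reg = 0 stop@Epoch 12, train_accuracy = 0.966 val_acc = 0.835 test_acc = 0.832
lr = 0.005 decay = 0.9 momentum = 0 is_reg = 1 stop@Epoch 11, train_accuracy = 0.944 val_acc = 0.805 test_acc = 0.791
lr = 0.0005 decay = 0.9 momentum = 0.2 is_reg = 0 stop@Epoch 13, train_accuracy = 0.971 val_acc = 0.85 test_acc = 0.832
"""

# One named group per field of a log line.
LINE_RE = re.compile(
    r"lr = (?P<lr>\S+) decay = (?P<decay>\S+) momentum = (?P<momentum>\S+) "
    r"is_reg = (?P<is_reg>\d) stop@Epoch (?P<epoch>\d+), "
    r"train_accuracy = (?P<train_acc>\S+) val_acc = (?P<val_acc>\S+) "
    r"test_acc = (?P<test_acc>\S+)"
)

MIN_TEST_ACC = 0.821  # assumed cutoff, inferred from the table


def parse_log(text):
    """Turn each matching log line into a dict of numbers; skip other lines."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        row = {name: float(value) for name, value in match.groupdict().items()}
        row["epoch"] = int(row["epoch"])
        row["is_reg"] = int(row["is_reg"])
        rows.append(row)
    return rows


def shortlist(rows, min_test_acc=MIN_TEST_ACC):
    """Keep runs at or above the cutoff, earliest stop first, best test_acc first on ties."""
    kept = [r for r in rows if r["test_acc"] >= min_test_acc]
    return sorted(kept, key=lambda r: (r["epoch"], -r["test_acc"]))


for run in shortlist(parse_log(SAMPLE_LOG)):
    print(run["epoch"], run["test_acc"], run["lr"], run["decay"], run["momentum"], run["is_reg"])
```

On the four sample lines, the run with test_acc = 0.791 is dropped and the remaining three come out in stopping-epoch order (10, 12, 13).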
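Conclusions:
1. Adding momentum helps the model converge faster: averaged over all 54 runs, the stopping epoch drops from about 17.6 at momentum = 0 to about 15.9 at momentum = 0.1 and about 15.1 at momentum = 0.2.
2. Maybe automatic hyperparameter tuning is not really needed? Tuning a few settings by hand gives about the same results (╥╯^╰╥). Or perhaps it is just unnecessary for this project, and other projects later on will need it.
3. The results and analysis above are incomplete. For example:
1) Every model behind these results uses dropout. Is that why regularization shows no clear effect in these results?
2) Why is performance clearly worse at decay = 0.8?
3) To be continued.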
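One way to follow up on these points is to average a metric over every run that shares a hyperparameter value: stopping epoch per momentum, test_acc per is_reg or per decay. The sketch below is only an illustration. `RUNS`, `FIELDS` and `mean_by` are hypothetical names, and `RUNS` holds just the six lr = 0.005, decay = 0.99 runs from the raw data, so a real check would load all 54 runs.

```python
from collections import defaultdict

# Field order of each tuple in RUNS.
FIELDS = ("momentum", "decay", "is_reg", "stop_epoch", "test_acc")

# Six runs copied from the raw log (lr = 0.005, decay = 0.99).
RUNS = [
    (0.0, 0.99, 1, 18, 0.822),
    (0.0, 0.99, 0, 28, 0.825),
    (0.1, 0.99, 1, 17, 0.810),
    (0.1, 0.99, 0, 17, 0.821),
    (0.2, 0.99, 1, 12, 0.809),
    (0.2, 0.99, 0, 15, 0.812),
]


def mean_by(runs, group_field, value_field):
    """Average value_field over the runs that share each value of group_field."""
    group_idx = FIELDS.index(group_field)
    value_idx = FIELDS.index(value_field)
    buckets = defaultdict(list)
    for run in runs:
        buckets[run[group_idx]].append(run[value_idx])
    return {key: sum(vals) / len(vals) for key, vals in sorted(buckets.items())}


# Mean stopping epoch per momentum value (conclusion 1).
print(mean_by(RUNS, "momentum", "stop_epoch"))
# Mean test accuracy with and without regularization (question 1).
print(mean_by(RUNS, "is_reg", "test_acc"))
```

On this six-run sample the first call gives a mean stopping epoch of 23.0, 17.0 and 13.5 for momentum 0, 0.1 and 0.2. Calling `mean_by(runs, "decay", "test_acc")` on the full data would be a natural first look at question 2.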