Cross-validation test results:
current training hidden_size: 400
current training learning_rate: 0.003
current training reg: 0.02
current training batch_size: 500
iteration 0 / 1200: loss 2.302679
iteration 100 / 1200: loss 1.651489
iteration 200 / 1200: loss 1.500087
iteration 300 / 1200: loss 1.391165
iteration 400 / 1200: loss 1.515288
iteration 500 / 1200: loss 1.409726
iteration 600 / 1200: loss 1.450177
iteration 700 / 1200: loss 1.439996
iteration 800 / 1200: loss 1.286857
iteration 900 / 1200: loss 1.289027
iteration 1000 / 1200: loss 1.310876
iteration 1100 / 1200: loss 1.150956
current val_acc: 0.54
best_acc: 0.54
best hidden_size: 400
best learning_rate: 0.003
best reg: 0.02
best batch_size: 500
current training hidden_size: 400
current training learning_rate: 0.003
current training reg: 0.05
current training batch_size: 500
iteration 0 / 1200: loss 2.302859
iteration 100 / 1200: loss 1.761263
iteration 200 / 1200: loss 1.579761
iteration 300 / 1200: loss 1.472029
iteration 400 / 1200: loss 1.458600
iteration 500 / 1200: loss 1.414810
iteration 600 / 1200: loss 1.425350
iteration 700 / 1200: loss 1.366904
iteration 800 / 1200: loss 1.374242
iteration 900 / 1200: loss 1.415730
iteration 1000 / 1200: loss 1.152137
iteration 1100 / 1200: loss 1.198664
current val_acc: 0.514
current training hidden_size: 400
current training learning_rate: 0.003
current training reg: 0.1
current training batch_size: 500
iteration 0 / 1200: loss 2.303143
iteration 100 / 1200: loss 1.722455
iteration 200 / 1200: loss 1.530982
iteration 300 / 1200: loss 1.543712
iteration 400 / 1200: loss 1.400823
iteration 500 / 1200: loss 1.451125
iteration 600 / 1200: loss 1.402639
iteration 700 / 1200: loss 1.476569
iteration 800 / 1200: loss 1.349223
iteration 900 / 1200: loss 1.191459
iteration 1000 / 1200: loss 1.279797
iteration 1100 / 1200: loss 1.268143
current val_acc: 0.509
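The log above comes from a grid search over regularization strengths with the other hyperparameters held fixed. As a minimal sketch of the loop that could produce this output format, the snippet below replaces the actual network training with a stand-in `train_and_eval` (hypothetical, for illustration only); in a real run it would train a two-layer net and return the validation accuracy. The initial loss of about 2.30 in the log is consistent with ln(10), the expected softmax loss for 10 classes at random initialization, which the stand-in mimics.

```python
import random

def train_and_eval(hidden_size, learning_rate, reg, batch_size,
                   num_iters=1200, print_every=100):
    """Stand-in for real training: fakes a decaying loss curve and
    returns (loss_history, val_acc). Replace with actual training."""
    random.seed(0)
    losses = []
    for it in range(num_iters):
        # Decay from ~2.30 (ln 10, the 10-class softmax loss at init)
        # toward ~1.2, with a little noise, to mimic the log above.
        loss = 2.30 * (0.52 ** (it / num_iters)) + random.uniform(-0.05, 0.05)
        losses.append(loss)
        if it % print_every == 0:
            print(f"iteration {it} / {num_iters}: loss {loss:f}")
    val_acc = 0.5 + random.uniform(0.0, 0.05)
    return losses, val_acc

best_acc, best_params = -1.0, None
for hidden_size in [400]:
    for learning_rate in [3e-3]:
        for reg in [0.02, 0.05, 0.1]:
            for batch_size in [500]:
                print(f"current training hidden_size: {hidden_size}")
                print(f"current training learning_rate: {learning_rate}")
                print(f"current training reg: {reg}")
                print(f"current training batch_size: {batch_size}")
                _, val_acc = train_and_eval(hidden_size, learning_rate,
                                            reg, batch_size)
                print(f"current val_acc: {val_acc}")
                # Only print the "best ..." block when a new best is found,
                # matching the log, where only the first config prints it.
                if val_acc > best_acc:
                    best_acc = val_acc
                    best_params = (hidden_size, learning_rate, reg, batch_size)
                    print(f"best_acc: {best_acc}")
                    print(f"best hidden_size: {hidden_size}")
                    print(f"best learning_rate: {learning_rate}")
                    print(f"best reg: {reg}")
                    print(f"best batch_size: {batch_size}")
```

In the log only the first configuration (reg 0.02, val_acc 0.54) triggers the "best ..." printout, since the later runs (0.514 and 0.509) do not improve on it; the `if val_acc > best_acc` branch above reproduces that behavior.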