Q5: Higher Level Representations: Image Features
The assignment code is available on my GitHub: https://github.com/jingshuangliu22/cs231n. Feel free to consult it, discuss, and point out mistakes.
features.ipynb
Load data
Extract Features
Done extracting features for 1000 / 49000 images
...
Done extracting features for 48000 / 49000 images
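Under the hood, the extraction step maps each image through a list of feature functions and concatenates the results into one row per image. A minimal sketch of that pattern, with a simple intensity histogram standing in for the assignment's HOG and hue-histogram features (helper names here are illustrative, not the course API):

```python
import numpy as np

def intensity_histogram(im, nbin=10, xmax=255.0):
    # Normalized histogram of grayscale intensities for one H x W x 3 image.
    gray = im.mean(axis=2)
    hist, _ = np.histogram(gray, bins=nbin, range=(0.0, xmax))
    return hist / float(gray.size)  # entries sum to 1

def extract_features(imgs, feature_fns):
    # imgs: N x H x W x 3 array; each feature fn maps one image to a 1-D vector.
    # Returns an N x F matrix, where F is the total concatenated feature length.
    return np.vstack([np.concatenate([fn(im) for fn in feature_fns])
                      for im in imgs])
```

With a 10-bin color histogram plus the assignment's 144-dimensional HOG descriptor, this yields 154 raw feature dimensions per CIFAR-10 image.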
Train SVM on features
Use the validation set to tune the learning rate and regularization strength
from cs231n.classifiers.linear_classifier import LinearSVM

learning_rates = [1e-9, 1e-8, 1e-7]
regularization_strengths = [1e5, 1e6, 1e7]

results = {}
best_val = -1
best_svm = None

################################################################################
# TODO:                                                                        #
# Use the validation set to set the learning rate and regularization strength. #
# This should be identical to the validation that you did for the SVM; save    #
# the best trained classifier in best_svm. You might also want to play         #
# with different numbers of bins in the color histogram. If you are careful    #
# you should be able to get accuracy of near 0.44 on the validation set.       #
################################################################################
for lr in learning_rates:
    for rs in regularization_strengths:
        svm = LinearSVM()
        svm.train(X_train_feats, y_train, learning_rate=lr, reg=rs,
                  num_iters=1500, verbose=True)
        y_train_pred = svm.predict(X_train_feats)
        train_acc = np.mean(y_train == y_train_pred)
        y_val_pred = svm.predict(X_val_feats)
        valid_acc = np.mean(y_val == y_val_pred)
        results[(lr, rs)] = (train_acc, valid_acc)
        if valid_acc > best_val:
            best_val = valid_acc
            best_svm = svm
################################################################################
#                               END OF YOUR CODE                               #
################################################################################

# Print out results.
for lr, reg in sorted(results):
    train_accuracy, val_accuracy = results[(lr, reg)]
    print 'lr %e reg %e train accuracy: %f val accuracy: %f' % (
        lr, reg, train_accuracy, val_accuracy)
print 'best validation accuracy achieved during cross-validation: %f' % best_val
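For reference, each `svm.train` call in the grid search above is plain minibatch SGD on the multiclass hinge loss. A condensed, self-contained sketch of the algorithm (an illustration, not the course implementation; the regularization-gradient convention varies between versions):

```python
import numpy as np

def svm_loss_grad(W, X, y, reg):
    # Vectorized multiclass hinge loss (margin 1) with L2 regularization.
    N = X.shape[0]
    scores = X.dot(W)                                   # N x C class scores
    correct = scores[np.arange(N), y][:, None]
    margins = np.maximum(0.0, scores - correct + 1.0)
    margins[np.arange(N), y] = 0.0
    loss = margins.sum() / N + reg * np.sum(W * W)
    mask = (margins > 0).astype(float)                  # which margins contribute
    mask[np.arange(N), y] = -mask.sum(axis=1)
    dW = X.T.dot(mask) / N + 2.0 * reg * W
    return loss, dW

def sgd_train(X, y, num_classes, learning_rate, reg,
              num_iters=1500, batch_size=200, seed=0):
    rng = np.random.RandomState(seed)
    W = 0.001 * rng.randn(X.shape[1], num_classes)
    for _ in range(num_iters):
        idx = rng.choice(X.shape[0], batch_size)        # sample a minibatch
        _, dW = svm_loss_grad(W, X[idx], y[idx], reg)
        W -= learning_rate * dW                         # vanilla gradient step
    return W
```

Note that with W near zero the hinge loss is about C - 1 = 9 for 10 classes, which explains the plateaus at ~9.0 in the verbose output below.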
iteration 0 / 1500: loss 86.964746
iteration 100 / 1500: loss 85.433916
...
iteration 1400 / 1500: loss 67.924721
(verbose output for the remaining eight (lr, reg) settings omitted; with the largest learning rates and regularization strengths the loss quickly plateaus at ~9.0, i.e. the weights are driven to near zero and only the constant margin term C - 1 = 9 remains)
lr 1.000000e-09 reg 1.000000e+05 train accuracy: 0.097939 val accuracy: 0.085000
lr 1.000000e-09 reg 1.000000e+06 train accuracy: 0.100755 val accuracy: 0.107000
lr 1.000000e-09 reg 1.000000e+07 train accuracy: 0.414163 val accuracy: 0.426000
lr 1.000000e-08 reg 1.000000e+05 train accuracy: 0.101796 val accuracy: 0.101000
lr 1.000000e-08 reg 1.000000e+06 train accuracy: 0.417449 val accuracy: 0.416000
lr 1.000000e-08 reg 1.000000e+07 train accuracy: 0.391143 val accuracy: 0.374000
lr 1.000000e-07 reg 1.000000e+05 train accuracy: 0.414245 val accuracy: 0.419000
lr 1.000000e-07 reg 1.000000e+06 train accuracy: 0.403694 val accuracy: 0.407000
lr 1.000000e-07 reg 1.000000e+07 train accuracy: 0.345694 val accuracy: 0.352000
best validation accuracy achieved during cross-validation: 0.426000
Test set accuracy of the best SVM: 0.431
X_train_feats.shape: (49000, 155)
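The shape (49000, 155) reflects the preprocessing applied after extraction: the 154 raw dimensions (144 HOG + 10-bin color histogram) are standardized using training-set statistics, and a constant bias dimension is appended. A sketch of that step (the function name is illustrative):

```python
import numpy as np

def preprocess_features(X_train, X_val, X_test):
    # Standardize every feature column using training-set statistics only,
    # then append a bias dimension of ones (hence 154 -> 155 columns).
    mean = X_train.mean(axis=0, keepdims=True)
    X_train, X_val, X_test = X_train - mean, X_val - mean, X_test - mean
    std = X_train.std(axis=0, keepdims=True)
    std[std == 0] = 1.0                  # avoid dividing constant columns by 0
    X_train, X_val, X_test = X_train / std, X_val / std, X_test / std
    add_bias = lambda X: np.hstack([X, np.ones((X.shape[0], 1))])
    return add_bias(X_train), add_bias(X_val), add_bias(X_test)
```

Computing the mean and standard deviation on the training split alone avoids leaking validation/test statistics into the model.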
Train a two-layer neural network on image features

from cs231n.classifiers.neural_net import TwoLayerNet

input_dim = X_train_feats.shape[1]
hidden_dim = 500
num_classes = 10

net = TwoLayerNet(input_dim, hidden_dim, num_classes)
best_net = None

learning_rate_choice = [1.8, 1.7, 1.6, 1.5]
reg_choice = [0.01, 0.011]
batch_size_choice = [1024, 2048]
num_iters_curr = 1500

best_acc = -1
best_stats = None
for batch_size_curr in batch_size_choice:
    for reg_cur in reg_choice:
        for learning_rate_curr in learning_rate_choice:
            print "current training learning_rate:", learning_rate_curr
            print "current training reg:", reg_cur
            print "current training batch_size:", batch_size_curr
            net = TwoLayerNet(input_dim, hidden_dim, num_classes)
            stats = net.train(X_train_feats, y_train, X_val_feats, y_val,
                              num_iters=num_iters_curr, batch_size=batch_size_curr,
                              learning_rate=learning_rate_curr, learning_rate_decay=0.95,
                              reg=reg_cur, verbose=True)
            val_acc = (net.predict(X_val_feats) == y_val).mean()
            print "current val_acc:", val_acc
            if val_acc > best_acc:
                best_acc = val_acc
                best_net = net
                best_stats = stats
                print "best_acc:", best_acc
                # Note: hyper_params is not part of the stock TwoLayerNet;
                # train() was modified here to record its hyperparameters.
                print "best learning_rate:", best_net.hyper_params['learning_rate']
                print "best reg:", best_net.hyper_params['reg']
                print "best batch_size:", best_net.hyper_params['batch_size']
current training learning_rate: 1.8
current training reg: 0.01
current training batch_size: 1024
iteration 0 / 1500: loss 2.302589
...
iteration 1400 / 1500: loss 1.406287
current val_acc: 0.559
(per-iteration output for the remaining 15 configurations omitted; validation accuracies by setting:)
lr 1.8  reg 0.010  batch 1024: 0.559
lr 1.7  reg 0.010  batch 1024: 0.568
lr 1.6  reg 0.010  batch 1024: 0.573
lr 1.5  reg 0.010  batch 1024: 0.571
lr 1.8  reg 0.011  batch 1024: 0.560
lr 1.7  reg 0.011  batch 1024: 0.560
lr 1.6  reg 0.011  batch 1024: 0.551
lr 1.5  reg 0.011  batch 1024: 0.569
lr 1.8  reg 0.010  batch 2048: 0.571
lr 1.7  reg 0.010  batch 2048: 0.563
lr 1.6  reg 0.010  batch 2048: 0.564
lr 1.5  reg 0.010  batch 2048: 0.564
lr 1.8  reg 0.011  batch 2048: 0.558
lr 1.7  reg 0.011  batch 2048: 0.557
lr 1.6  reg 0.011  batch 2048: 0.555
lr 1.5  reg 0.011  batch 2048: 0.554
best_acc: 0.573 (learning_rate 1.6, reg 0.01, batch_size 1024)
Test set accuracy of the best network: 0.546