To use neural networks for image-related tasks, you almost always end up with a convolutional neural network. Why convolution? First we need to understand the convolution operation itself.
Convolution is not one of the elementary arithmetic operations; here is its definition:
$$y(x)=\int_{-\infty}^{\infty} f(\tau)\,g(x-\tau)\,\mathrm{d}\tau$$
The formula involves two functions, f(x) and g(x), with τ as the variable of integration.
Every value of x therefore corresponds to one integral result, so y is itself a function of x. Each change of x in g(x−τ) shifts that function along the axis; the shifted copy is multiplied by f(τ) and the product is integrated over the whole axis. Suppose f(x) and g(x) are not defined (nonzero) everywhere but only on a small interval each. Then, as x varies, there will be values of x for which the two intervals overlap, and the integral takes a nonzero value exactly on that overlapping part.
Why multiply?
Because multiplication can detect whether the regions where the two functions are defined overlap, whereas addition cannot. The larger the product, the greater the overlap between the two functions at that particular value of x.
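The same idea carries over to the discrete case, which is what a computer actually evaluates. Here is a minimal NumPy sketch (my own illustration, not part of the original code) that slides a short sequence g over a sequence f, flips it, multiplies element-wise and sums; the values peak where the two sequences overlap the most.

```python
import numpy as np

# f: a "signal" that is nonzero only on a short interval
f = np.array([0, 0, 1, 2, 3, 2, 1, 0, 0], dtype=float)
# g: a small "detector" that slides across f
g = np.array([1, 2, 1], dtype=float)

# Manual discrete convolution: for every shift, flip g, multiply element-wise, and sum
manual = [np.sum(f[i:i + len(g)] * g[::-1]) for i in range(len(f) - len(g) + 1)]
print(manual)                             # largest where f and g overlap the most

# NumPy's built-in convolution gives the same numbers ('valid' keeps only full overlaps)
print(np.convolve(f, g, mode='valid'))
```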
Imagine f(x) as a star drawn on a sheet of paper and g(x) as a movable magnifying glass: convolution is then the magnifying glass sliding across the paper and magnifying the star whenever it comes into view.
The figure below is taken from Wikipedia; the animated demonstration there makes the idea clear.
(Figure: convolution animation, from Wikipedia)
Once this is clear, convolutional neural networks are not hard to understand.
In such a network, f(x) corresponds to the discrete RGB values of the image to be recognized, while g(x) corresponds to a neuron (the magnifying glass). Each neuron is a magnifying glass with a different job and scans the image for a different feature. Taking a cat as an example, one may magnify the cat's eyes, another its fur. Training the network means trying out different magnifying glasses over and over until the set best suited to spotting this cat is found; at that point training is complete.
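As a concrete picture of one such "magnifying glass", the sketch below (my own example, not from the original post) applies a single hand-written 3×3 edge kernel to a random grayscale image with tf.nn.conv2d. A Conv2D layer in a network does exactly this sliding multiply-and-sum, except that the kernel values are learned during training instead of being written by hand.

```python
import tensorflow as tf

# A fake grayscale "image": batch of 1, 32x32 pixels, 1 channel
image = tf.random.uniform([1, 32, 32, 1])

# One hand-written 3x3 kernel that responds to horizontal edges
kernel = tf.constant([[-1., -1., -1.],
                      [ 0.,  0.,  0.],
                      [ 1.,  1.,  1.]])
kernel = tf.reshape(kernel, [3, 3, 1, 1])  # [height, width, in_channels, out_channels]

# Slide the kernel over the image: each output value is the sum of the
# element-wise products between the kernel and the image patch beneath it
feature_map = tf.nn.conv2d(image, kernel, strides=1, padding='SAME')
print(feature_map.shape)  # (1, 32, 32, 1)
```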
## This code uses tf.keras custom layers to build a network. The custom layers here are all fully connected, not convolutional; I will show the code for convolutional layers next time.
## Note: the size of the parameters and the preprocessing range of x both affect the final accuracy.
import os
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
import tensorflow as tf
from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics
from tensorflow import keras
def preprocess(x, y):
    # [0~255] => [-1~1]
    x = 2 * tf.cast(x, dtype=tf.float32) / 255. - 1.
    y = tf.cast(y, dtype=tf.int32)
    return x, y
batchsz = 128
# [50k, 32, 32, 3], [10k, 1]
(x, y), (x_val, y_val) = datasets.cifar10.load_data()
y = tf.squeeze(y)
y_val = tf.squeeze(y_val)
y = tf.one_hot(y, depth=10) # [50k, 10]
y_val = tf.one_hot(y_val, depth=10) # [10k, 10]
print('datasets:', x.shape, y.shape, x_val.shape, y_val.shape, x.min(), x.max())
train_db = tf.data.Dataset.from_tensor_slices((x,y))
train_db = train_db.map(preprocess).shuffle(10000).batch(batchsz).repeat()
test_db = tf.data.Dataset.from_tensor_slices((x_val, y_val))
test_db = test_db.map(preprocess).batch(batchsz)
sample = next(iter(train_db))
print('batch:', sample[0].shape, sample[1].shape)
class MyDense(layers.Layer):  ## inherit from layers.Layer to implement a custom layer
    # to replace standard layers.Dense()
    def __init__(self, inp_dim, outp_dim):
        super(MyDense, self).__init__()  # call the parent class constructor
        ## register w as a trainable variable; the shape argument gives the matrix's rows and columns
        self.kernel = self.add_weight(name='w', shape=[inp_dim, outp_dim])
        # self.bias = self.add_weight(name='b', shape=[outp_dim])  # uncomment if a bias term is wanted

    def call(self, inputs, training=None):  # training=None: the layer works for both training and inference
        x = inputs @ self.kernel  ## forward pass: plain matrix multiplication
        return x
class MyNetwork(keras.Model):  ## inherit from keras.Model to implement a custom network
    def __init__(self):
        super(MyNetwork, self).__init__()
        self.fc1 = MyDense(32*32*3, 256)
        self.fc2 = MyDense(256, 128)
        self.fc3 = MyDense(128, 64)
        self.fc4 = MyDense(64, 32)
        self.fc5 = MyDense(32, 10)

    def call(self, inputs, training=None):  # forward computation
        """
        :param inputs: [b, 32, 32, 3]
        :param training:
        :return:
        """
        x = tf.reshape(inputs, [-1, 32*32*3])
        # [b, 32*32*3] => [b, 256]
        x = self.fc1(x)
        x = tf.nn.relu(x)
        # [b, 256] => [b, 128]
        x = self.fc2(x)
        x = tf.nn.relu(x)
        # [b, 128] => [b, 64]
        x = self.fc3(x)
        x = tf.nn.relu(x)
        # [b, 64] => [b, 32]
        x = self.fc4(x)
        x = tf.nn.relu(x)
        # [b, 32] => [b, 10]
        x = self.fc5(x)
        return x
network = MyNetwork()  # instantiate the network
network.compile(optimizer=optimizers.Adam(learning_rate=1e-3),  ## a learning rate that is too large can make training diverge
                loss=tf.losses.CategoricalCrossentropy(from_logits=True),
                metrics=['accuracy'])  # optimizer, loss function, and evaluation metric (used when testing)
# validation_freq=1 means: evaluate on test_db once after every full pass over train_db
network.fit(train_db, epochs=15, validation_data=test_db, validation_freq=1,
            steps_per_epoch=x.shape[0] // batchsz)
#%% The lines below save and reload the model weights; they are not essential
network.evaluate(test_db)
network.save_weights('ckpt/weights.ckpt')
del network
print('saved to ckpt/weights.ckpt')
network = MyNetwork()
network.compile(optimizer=optimizers.Adam(learning_rate=1e-3),
                loss=tf.losses.CategoricalCrossentropy(from_logits=True),
                metrics=['accuracy'])
network.load_weights('ckpt/weights.ckpt')
print('loaded weights from file.')
network.evaluate(test_db)
Below is the code for training the convolutional network VGG-13 on CIFAR-100:
The results are pasted at the end. After 50 epochs the test accuracy only reaches about 35%, which is pitifully low; the log shows the training loss dropping close to zero while the test accuracy plateaus, so the network is overfitting badly…
## Train the convolutional network VGG-13 on CIFAR-100
import tensorflow as tf
from tensorflow.keras import layers, optimizers, datasets, Sequential
import os
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
tf.random.set_seed(2345)
conv_layers = [  # 5 units of conv + max pooling: the convolutional layers of the network
    # unit 1
    layers.Conv2D(64, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),  # arguments: number of kernels, kernel size, activation
layers.Conv2D(64, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same'),
# unit 2
layers.Conv2D(128, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
layers.Conv2D(128, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same'),
# unit 3
layers.Conv2D(256, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
layers.Conv2D(256, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same'),
# unit 4
layers.Conv2D(512, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
layers.Conv2D(512, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same'),
# unit 5
layers.Conv2D(512, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
layers.Conv2D(512, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same')
]
def preprocess(x, y):
    # [0~255] => [-1~1]
    x = 2 * tf.cast(x, dtype=tf.float32) / 255. - 1
    y = tf.cast(y, dtype=tf.int32)
    return x, y
(x,y), (x_test, y_test) = datasets.cifar100.load_data()
y = tf.squeeze(y, axis=1)
y_test = tf.squeeze(y_test, axis=1)
print(x.shape, y.shape, x_test.shape, y_test.shape)
train_db = tf.data.Dataset.from_tensor_slices((x,y))
train_db = train_db.shuffle(1000).map(preprocess).batch(128)
test_db = tf.data.Dataset.from_tensor_slices((x_test,y_test))
test_db = test_db.map(preprocess).batch(64)
sample = next(iter(train_db))
print('sample:', sample[0].shape, sample[1].shape,
tf.reduce_min(sample[0]), tf.reduce_max(sample[0]))
def main():
    # [b, 32, 32, 3] => [b, 1, 1, 512]
    conv_net = Sequential(conv_layers)
    fc_net = Sequential([  # second container: fully connected layers for the final output
        layers.Dense(256, activation=tf.nn.relu),
        layers.Dense(128, activation=tf.nn.relu),
        layers.Dense(100, activation=None),
    ])
    conv_net.build(input_shape=[None, 32, 32, 3])  # declare the shape of the input
    fc_net.build(input_shape=[None, 512])
    conv_net.summary()
    fc_net.summary()
    optimizer = optimizers.Adam(learning_rate=1e-4)  # choose the optimizer and the learning rate

    # [1, 2] + [3, 4] => [1, 2, 3, 4]
    # concatenate all trainable parameters so gradients can be taken in one call
    variables = conv_net.trainable_variables + fc_net.trainable_variables

    ## training loop: compute gradients and update the parameters
    for epoch in range(50):
        for step, (x, y) in enumerate(train_db):  # iterate over the training set
            with tf.GradientTape() as tape:
                # [b, 32, 32, 3] => [b, 1, 1, 512]
                out = conv_net(x)
                # flatten, => [b, 512]
                out = tf.reshape(out, [-1, 512])
                # [b, 512] => [b, 100]
                logits = fc_net(out)
                # [b] => [b, 100]
                y_onehot = tf.one_hot(y, depth=100)
                # compute loss
                loss = tf.losses.categorical_crossentropy(y_onehot, logits, from_logits=True)
                loss = tf.reduce_mean(loss)

            grads = tape.gradient(loss, variables)            # compute gradients
            optimizer.apply_gradients(zip(grads, variables))  # update the parameters with the gradients

            if step % 100 == 0:
                print(epoch, step, 'loss:', float(loss))

        ## after each full pass over the training set, measure accuracy
        total_num = 0
        total_correct = 0
        for x, y in test_db:  # loop over the test set
            out = conv_net(x)                     # forward pass through the convolutional layers
            out = tf.reshape(out, [-1, 512])      # flatten
            logits = fc_net(out)                  # forward pass through the fully connected layers
            prob = tf.nn.softmax(logits, axis=1)  # convert logits to probabilities in [0, 1]
            pred = tf.argmax(prob, axis=1)        # take the index of the largest probability as the prediction
            pred = tf.cast(pred, dtype=tf.int32)
            correct = tf.cast(tf.equal(pred, y), dtype=tf.int32)
            correct = tf.reduce_sum(correct)
            # print('correct= ', int(correct))
            total_num += x.shape[0]
            total_correct += int(correct)

        acc = total_correct / total_num
        print(epoch, 'acc:', acc)


if __name__ == '__main__':
    main()
# Downloading data from https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz
# 169009152/169001437 [==============================] - 3s 0us/step
# (50000, 32, 32, 3) (50000,) (10000, 32, 32, 3) (10000,)
# sample: (128, 32, 32, 3) (128,) tf.Tensor(-1.0, shape=(), dtype=float32) tf.Tensor(1.0, shape=(), dtype=float32)
# Model: "sequential"
# _________________________________________________________________
# Layer (type) Output Shape Param #
# =================================================================
# conv2d (Conv2D) (None, 32, 32, 64) 1792
# _________________________________________________________________
# conv2d_1 (Conv2D) (None, 32, 32, 64) 36928
# _________________________________________________________________
# max_pooling2d (MaxPooling2D) (None, 16, 16, 64) 0
# _________________________________________________________________
# conv2d_2 (Conv2D) (None, 16, 16, 128) 73856
# _________________________________________________________________
# conv2d_3 (Conv2D) (None, 16, 16, 128) 147584
# _________________________________________________________________
# max_pooling2d_1 (MaxPooling2 (None, 8, 8, 128) 0
# _________________________________________________________________
# conv2d_4 (Conv2D) (None, 8, 8, 256) 295168
# _________________________________________________________________
# conv2d_5 (Conv2D) (None, 8, 8, 256) 590080
# _________________________________________________________________
# max_pooling2d_2 (MaxPooling2 (None, 4, 4, 256) 0
# _________________________________________________________________
# conv2d_6 (Conv2D) (None, 4, 4, 512) 1180160
# _________________________________________________________________
# conv2d_7 (Conv2D) (None, 4, 4, 512) 2359808
# _________________________________________________________________
# max_pooling2d_3 (MaxPooling2 (None, 2, 2, 512) 0
# _________________________________________________________________
# conv2d_8 (Conv2D) (None, 2, 2, 512) 2359808
# _________________________________________________________________
# conv2d_9 (Conv2D) (None, 2, 2, 512) 2359808
# _________________________________________________________________
# max_pooling2d_4 (MaxPooling2 (None, 1, 1, 512) 0
# =================================================================
# Total params: 9,404,992
# Trainable params: 9,404,992
# Non-trainable params: 0
# _________________________________________________________________
# Model: "sequential_1"
# _________________________________________________________________
# Layer (type) Output Shape Param #
# =================================================================
# dense (Dense) (None, 256) 131328
# _________________________________________________________________
# dense_1 (Dense) (None, 128) 32896
# _________________________________________________________________
# dense_2 (Dense) (None, 100) 12900
# =================================================================
# Total params: 177,124
# Trainable params: 177,124
# Non-trainable params: 0
# _________________________________________________________________
# 0 0 loss: 4.605561256408691
# 0 100 loss: 4.5840582847595215
# 0 200 loss: 4.275364875793457
# 0 300 loss: 4.344236373901367
# 0 acc: 0.0882
# 1 0 loss: 4.004591941833496
# 1 100 loss: 3.8594717979431152
# 1 200 loss: 3.6697750091552734
# 1 300 loss: 3.8643999099731445
# 1 acc: 0.1445
# 2 0 loss: 3.8087716102600098
# 2 100 loss: 3.762399196624756
# 2 200 loss: 3.540567636489868
# 2 300 loss: 3.461308002471924
# 2 acc: 0.183
# 3 0 loss: 3.4499244689941406
# 3 100 loss: 3.518815755844116
# 3 200 loss: 3.2574617862701416
# 3 300 loss: 3.011989116668701
# 3 acc: 0.2129
# 4 0 loss: 3.3798604011535645
# 4 100 loss: 3.2031750679016113
# 4 200 loss: 3.2139480113983154
# 4 300 loss: 3.126434087753296
# 4 acc: 0.2413
# 5 0 loss: 3.2381911277770996
# 5 100 loss: 2.8200480937957764
# 5 200 loss: 2.778794527053833
# 5 300 loss: 2.919658660888672
# 5 acc: 0.2764
# 6 0 loss: 2.7757792472839355
# 6 100 loss: 2.870513439178467
# 6 200 loss: 2.723210334777832
# 6 300 loss: 2.954850912094116
# 6 acc: 0.2958
# 7 0 loss: 2.875883102416992
# 7 100 loss: 2.5324206352233887
# 7 200 loss: 2.29622483253479
# 7 300 loss: 2.4723737239837646
# 7 acc: 0.3223
# 8 0 loss: 2.3187077045440674
# 8 100 loss: 2.475510358810425
# 8 200 loss: 2.3975989818573
# 8 300 loss: 2.4141554832458496
# 8 acc: 0.3296
# 9 0 loss: 2.4524223804473877
# 9 100 loss: 2.233828544616699
# 9 200 loss: 2.429077386856079
# 9 300 loss: 1.9900705814361572
# 9 acc: 0.3408
# 10 0 loss: 2.1611061096191406
# 10 100 loss: 2.060581922531128
# 10 200 loss: 1.8641955852508545
# 10 300 loss: 1.761167049407959
# 10 acc: 0.3537
# 11 0 loss: 1.9053047895431519
# 11 100 loss: 1.8212822675704956
# 11 200 loss: 1.548974633216858
# 11 300 loss: 1.7239539623260498
# 11 acc: 0.3465
# 12 0 loss: 1.6095221042633057
# 12 100 loss: 1.2763440608978271
# 12 200 loss: 1.5981580018997192
# 12 300 loss: 1.2417281866073608
# 12 acc: 0.3402
# 13 0 loss: 1.2237718105316162
# 13 100 loss: 0.949530839920044
# 13 200 loss: 1.0324599742889404
# 13 300 loss: 0.899419367313385
# 13 acc: 0.3434
# 14 0 loss: 1.0063848495483398
# 14 100 loss: 0.5840092301368713
# 14 200 loss: 0.6796853542327881
# 14 300 loss: 0.57671058177948
# 14 acc: 0.328
# 15 0 loss: 0.7236818075180054
# 15 100 loss: 0.49495500326156616
# 15 200 loss: 0.5753574371337891
# 15 300 loss: 0.47212672233581543
# 15 acc: 0.3388
# 16 0 loss: 0.4639812111854553
# 16 100 loss: 0.2529522180557251
# 16 200 loss: 0.3322717845439911
# 16 300 loss: 0.3659853935241699
# 16 acc: 0.3298
# 17 0 loss: 0.4683800935745239
# 17 100 loss: 0.23362717032432556
# 17 200 loss: 0.2556360363960266
# 17 300 loss: 0.2710829973220825
# 17 acc: 0.3309
# 18 0 loss: 0.2542823851108551
# 18 100 loss: 0.23534712195396423
# 18 200 loss: 0.2396312654018402
# 18 300 loss: 0.12760430574417114
# 18 acc: 0.325
# 19 0 loss: 0.3296607434749603
# 19 100 loss: 0.2508047819137573
# 19 200 loss: 0.14986690878868103
# 19 300 loss: 0.21476638317108154
# 19 acc: 0.3281
# 20 0 loss: 0.24828244745731354
# 20 100 loss: 0.17237189412117004
# 20 200 loss: 0.2487848401069641
# 20 300 loss: 0.11386354267597198
# 20 acc: 0.3295
# 21 0 loss: 0.17818447947502136
# 21 100 loss: 0.2029731571674347
# 21 200 loss: 0.1304931342601776
# 21 300 loss: 0.15854015946388245
# 21 acc: 0.3226
# 22 0 loss: 0.3360269367694855
# 22 100 loss: 0.09969679266214371
# 22 200 loss: 0.14053449034690857
# 22 300 loss: 0.12451587617397308
# 22 acc: 0.3308
# 23 0 loss: 0.10183759033679962
# 23 100 loss: 0.1273515820503235
# 23 200 loss: 0.0658843070268631
# 23 300 loss: 0.13061876595020294
# 23 acc: 0.3312
# 24 0 loss: 0.10131612420082092
# 24 100 loss: 0.12814220786094666
# 24 200 loss: 0.18455331027507782
# 24 300 loss: 0.28138798475265503
# 24 acc: 0.3282
# 25 0 loss: 0.20154546201229095
# 25 100 loss: 0.2526904344558716
# 25 200 loss: 0.07851159572601318
# 25 300 loss: 0.08871728181838989
# 25 acc: 0.3357
# 26 0 loss: 0.18907761573791504
# 26 100 loss: 0.10460161417722702
# 26 200 loss: 0.054584719240665436
# 26 300 loss: 0.1553255170583725
# 26 acc: 0.3375
# 27 0 loss: 0.09388221055269241
# 27 100 loss: 0.15093109011650085
# 27 200 loss: 0.09950651228427887
# 27 300 loss: 0.04773459956049919
# 27 acc: 0.3315
# 28 0 loss: 0.15606489777565002
# 28 100 loss: 0.1273258924484253
# 28 200 loss: 0.2766129672527313
# 28 300 loss: 0.041044771671295166
# 28 acc: 0.3265
# 29 0 loss: 0.3127376437187195
# 29 100 loss: 0.14141681790351868
# 29 200 loss: 0.12185752391815186
# 29 300 loss: 0.15350860357284546
# 29 acc: 0.335
# 30 0 loss: 0.15822923183441162
# 30 100 loss: 0.05317750573158264
# 30 200 loss: 0.11602228879928589
# 30 300 loss: 0.1883474886417389
# 30 acc: 0.3332
# 31 0 loss: 0.08587311208248138
# 31 100 loss: 0.08658549189567566
# 31 200 loss: 0.1874590963125229
# 31 300 loss: 0.08177060633897781
# 31 acc: 0.3261
# 32 0 loss: 0.1298086941242218
# 32 100 loss: 0.06522954255342484
# 32 200 loss: 0.09495490044355392
# 32 300 loss: 0.05367262661457062
# 32 acc: 0.3306
# 33 0 loss: 0.17908413708209991
# 33 100 loss: 0.04003797098994255
# 33 200 loss: 0.03219802305102348
# 33 300 loss: 0.13361632823944092
# 33 acc: 0.3392
# 34 0 loss: 0.0868552103638649
# 34 100 loss: 0.1317896544933319
# 34 200 loss: 0.05642816424369812
# 34 300 loss: 0.04360516369342804
# 34 acc: 0.3323
# 35 0 loss: 0.07796430587768555
# 35 100 loss: 0.05255243927240372
# 35 200 loss: 0.07042545080184937
# 35 300 loss: 0.049678951501846313
# 35 acc: 0.3367
# 36 0 loss: 0.06822186708450317
# 36 100 loss: 0.058710016310214996
# 36 200 loss: 0.14014987647533417
# 36 300 loss: 0.08457411825656891
# 36 acc: 0.3351
# 37 0 loss: 0.2767563462257385
# 37 100 loss: 0.09446721524000168
# 37 200 loss: 0.10619304329156876
# 37 300 loss: 0.07642345130443573
# 37 acc: 0.3445
# 38 0 loss: 0.06346447765827179
# 38 100 loss: 0.02641148865222931
# 38 200 loss: 0.08356757462024689
# 38 300 loss: 0.09790317714214325
# 38 acc: 0.3432
# 39 0 loss: 0.10022857040166855
# 39 100 loss: 0.07792702317237854
# 39 200 loss: 0.015704695135354996
# 39 300 loss: 0.11118768900632858
# 39 acc: 0.3316
# 40 0 loss: 0.07642681896686554
# 40 100 loss: 0.06625737994909286
# 40 200 loss: 0.070556640625
# 40 300 loss: 0.01251917239278555
# 40 acc: 0.3379
# 41 0 loss: 0.08273995667695999
# 41 100 loss: 0.05495009943842888
# 41 200 loss: 0.09534008055925369
# 41 300 loss: 0.08666946738958359
# 41 acc: 0.3425
# 42 0 loss: 0.04884067177772522
# 42 100 loss: 0.09093677997589111
# 42 200 loss: 0.043643780052661896
# 42 300 loss: 0.12992070615291595
# 42 acc: 0.3436
# 43 0 loss: 0.09021072089672089
# 43 100 loss: 0.059569451957941055
# 43 200 loss: 0.07233915477991104
# 43 300 loss: 0.09397363662719727
# 43 acc: 0.3383
# 44 0 loss: 0.24641282856464386
# 44 100 loss: 0.07050016522407532
# 44 200 loss: 0.056658785790205
# 44 300 loss: 0.05466039851307869
# 44 acc: 0.345
# 45 0 loss: 0.050673604011535645
# 45 100 loss: 0.056339047849178314
# 45 200 loss: 0.11277655512094498
# 45 300 loss: 0.04756581038236618
# 45 acc: 0.3395
# 46 0 loss: 0.15243986248970032
# 46 100 loss: 0.050137586891651154
# 46 200 loss: 0.11051837354898453
# 46 300 loss: 0.04553702473640442
# 46 acc: 0.3367
# 47 0 loss: 0.11095094680786133
# 47 100 loss: 0.1652894914150238
# 47 200 loss: 0.07782348990440369
# 47 300 loss: 0.05710805952548981
# 47 acc: 0.3457
# 48 0 loss: 0.0649101510643959
# 48 100 loss: 0.04067564010620117
# 48 200 loss: 0.013747965916991234
# 48 300 loss: 0.1444970816373825
# 48 acc: 0.3484
# 49 0 loss: 0.022526759654283524
# 49 100 loss: 0.007221122272312641
# 49 200 loss: 0.18244332075119019
# 49 300 loss: 0.0627250075340271
# 49 acc: 0.3481