Comprehensive handwritten digit recognition with the MXNet framework: from data preprocessing to network training, and saving the model and logs

        The MXNet deep learning framework is becoming more and more popular, but many people are not quite sure how to use it correctly: preprocessing the training data, generating data in the format the network actually needs, saving the network model, saving the training logs, and so on. There are plenty of tricks scattered around the web, but most of them are fragmentary. Here, starting from scratch, I will walk you through training a complete handwritten digit (MNIST) recognition system.


I. How to install Python and MXNet

Trick: installing MXNet with pip on Windows may fail, because the Windows build of MXNet can depend on the VC++ 2015 runtime (or another version); this problem does not exist on Linux.
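
Once the installation succeeds (for example via pip install mxnet), a quick sanity check, not part of the original post, is to import the package from Python and print its version:

import mxnet as mx
print(mx.__version__)  # if this prints a version string, the installation works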


II. How to obtain the handwritten digit dataset

 
import mxnet as mx
mnist = mx.test_utils.get_mnist()  # download the handwritten digit dataset

    Running this line of code downloads the MNIST dataset. The dataset consists mainly of four gzip-compressed files, listed below:

Notes: t10k-images-idx3-ubyte.gz: binary archive of the test-set images

       t10k-labels-idx1-ubyte.gz: binary archive of the test-set labels

       train-images-idx3-ubyte.gz: binary archive of the training-set images

       train-labels-idx1-ubyte.gz: binary archive of the training-set labels

        People often ask me why the downloaded files cannot be opened as images, and how they are supposed to read them or train on them when they cannot even see what the pictures look like. The answer is simply that the images have been packed into binary files; they are not raw image files. So how are the images organized inside these files? The code below makes it completely clear.

Since I have already downloaded these four archives, I read them directly from disk; there is no need to download them again (and sometimes the download does not complete anyway).
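
The read_data helper used below is not shown in the original snippet; a minimal sketch that parses the gzipped IDX format (essentially what mx.test_utils.get_mnist does internally) could look like this:

import gzip
import struct
import numpy as np

def read_data(image_url, label_url):
    # Parse the gzipped IDX files and return (labels, images) as numpy arrays.
    with gzip.open(label_url) as flbl:
        # label file header: magic number and item count (two big-endian int32)
        magic, num = struct.unpack(">II", flbl.read(8))
        label = np.frombuffer(flbl.read(), dtype=np.uint8)
    with gzip.open(image_url) as fimg:
        # image file header: magic number, item count, rows, cols
        magic, num, rows, cols = struct.unpack(">IIII", fimg.read(16))
        image = np.frombuffer(fimg.read(), dtype=np.uint8)
        # add the channel dimension and scale pixel values to [0, 1]
        image = image.reshape(num, 1, rows, cols).astype(np.float32) / 255.0
    return label, image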

train_data_path = 'mnist_data/train-images-idx3-ubyte.gz'
train_label_path = 'mnist_data/train-labels-idx1-ubyte.gz'
test_data_path = 'mnist_data/t10k-images-idx3-ubyte.gz'
test_label_path = 'mnist_data/t10k-labels-idx1-ubyte.gz'
train_label, train_data = read_data(image_url=train_data_path, label_url=train_label_path)
test_label, test_data = read_data(image_url=test_data_path, label_url=test_label_path)

print('shape of train_data:', train_data.shape)
print('shape of train_label:', train_label.shape)
print('shape of test_data:', test_data.shape)
print('shape of test_label:', test_label.shape)

Output:

shape of train_data: (60000, 1, 28, 28)
shape of train_label: (60000,)
shape of test_data: (10000, 1, 28, 28)
shape of test_label: (10000,)
If you are downloading for the first time, then after running
 
mnist = mx.test_utils.get_mnist() 

you already have a complete MNIST object, mnist (a dict). The training and test data can then be read from it directly, as follows:

train_image = mnist['train_data']
train_image_label = mnist['train_label']
test_image = mnist['test_data']
test_image_label = mnist['test_label']
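
To answer the earlier question about what the images actually look like, you can inspect a sample directly, since the dict entries are plain numpy arrays (matplotlib is used here only for display and is not part of the original post):

import matplotlib.pyplot as plt

print(train_image.shape, train_image_label[:10])   # (60000, 1, 28, 28) and the first ten labels
plt.imshow(train_image[0][0], cmap='gray')         # drop the channel axis for display
plt.title('label: %d' % int(train_image_label[0]))
plt.show()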

III. How to process the data

        Now that we have downloaded MNIST and obtained the raw arrays, how do we turn them into the format the training stage actually needs? From the printed shapes above we already know the images are 28×28, single-channel grayscale. If we do not resize the images, the network input should be (batch_size, channel, height, width), so the 60,000 training images and 10,000 test images have to be served as (60000 // batch_size) and (10000 // batch_size) iterations respectively, each iteration yielding a block of shape (batch_size, channel, height, width).

          Why does MXNet expect the training data to be passed in as an iterator? 1) It is simple. 2) With the symbolic Module API, you do not write an explicit for loop that pushes data into the network the way you would in TensorFlow. So how do you use the iterators MXNet provides correctly? Sometimes the built-in iterator classes do not cover every need, and we have to subclass them ourselves.

          Anyone familiar with MXNet knows that network input mainly comes in two forms: 1) image record files, 2) ndarrays.

          For the first form, we can conveniently use MXNet's im2rec.py script to convert all the images into a .rec file and feed that file to the network; it is also exposed as an iterator object. However, generating the .rec file takes time and a lot of extra disk space. Is there a way to avoid it? Of course: as mentioned above, subclass DataIter and return an iterator object that yields a complete (batch_size, channel, height, width) block on every iteration, so data can be streamed into the network continuously. The complete code is as follows:

class Batch(object):
    def __init__(self, data, label):
        self.data = data
        self.label = label

class Inter(mx.io.DataIter):
    def __init__(self, batch_size, train_data, train_label):
        super(Inter, self).__init__()
        self.batch_size = batch_size
        self.begin = 0
        self.index = 0
        self.train_data = train_data
        self.train_label = train_label
        self.train_count = len(train_data)
        assert len(train_data) == len(train_label), 'Error'
        assert (self.train_count >= self.batch_size) and (self.batch_size > 0), 'Error'
        self.train_batches = self.train_count // self.batch_size

    def __iter__(self):
        return self

    def reset(self):
        # rewind the iterator so the dataset can be traversed again in the next epoch
        self.begin = 0
        self.index = 0

    def next(self):
        if self.iter_next():
            return self.getdata()
        else:
            raise StopIteration

    def __next__(self):
        return self.next()

    def iter_next(self):
        if self.begin < self.train_batches:
            return True
        else:
            return False

    def get_batch_images_labels(self):
        data = self.train_data[self.index:self.index + self.batch_size, :, :, :]
        label = self.train_label[self.index:self.index + self.batch_size]
        return data, label

    def getdata(self):
        images, labels = self.get_batch_images_labels()  # fetch the next batch_size slice in order
        data_all = [mx.nd.array(images)]
        label_all = [mx.nd.array(labels)]
        self.index += self.batch_size
        self.begin += 1
        return Batch(data_all, label_all)

    def getlabel(self):
        pass

    def getindex(self):
        return None

    def getpad(self):
        pass
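
One detail worth noting if you feed this iterator to mx.mod.Module.fit (as in section V): the Module binds the network using the iterator's provide_data / provide_label descriptions, which the class above does not define. A minimal sketch of adding them inside the Inter class (the 28×28 MNIST shape and the names 'data' / 'softmax_label' match the rest of this post, but are assumptions on my part):

    @property
    def provide_data(self):
        # shape/name description used by mx.mod.Module when binding the network
        return [('data', (self.batch_size, 1, 28, 28))]

    @property
    def provide_label(self):
        return [('softmax_label', (self.batch_size,))]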

The Inter class above is a simple subclass of MXNet's native iterator class DataIter. As you can see from the code, each iteration returns a Batch object holding the data and the label, where data has shape (batch_size, channel, height, width) and label has shape (batch_size,). Pay particular attention to the reset method of the iterator.
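
A quick usage sketch of the iterator, reusing the train_data / train_label arrays loaded in section II (the batch size of 64 is just an assumed example):

data_train = Inter(64, train_data, train_label)
batch = next(iter(data_train))
print(batch.data[0].shape, batch.label[0].shape)   # (64, 1, 28, 28) (64,)
data_train.reset()   # rewind before the next epoch, as noted above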

IV. Building the neural network

             MXNet has two very important components for this: symbol and gluon. We can build the network with either of them, and we can also build it directly from ndarray objects. I give the code for each approach below.

First, here is the classic LeNet-5 architecture diagram that circulates online:


Trick: notice that the original input in the diagram is 32×32, while our image matrices are 28×28, so I adjusted the network structure slightly in the implementation.

        1. Building the LeNet-5 structure with symbol:

def get_net(class_num, bn_mom=0.99, filter_list=(6, 16)):
    data = mx.sym.Variable('data')
    imput = mx.sym.BatchNorm(data=data, fix_gamma=True, eps=1e-5, momentum=bn_mom, name='bn_imput')  # batch normalization of the input
    # layer_1: convolution
    layer_1 = mx.sym.Convolution(data=imput, num_filter=filter_list[0], kernel=(5, 5), stride=(2, 2), pad=(2, 2),
                                 no_bias=False, name="conv_layer_1")
    bn_layer_1 = mx.sym.BatchNorm(data=layer_1, fix_gamma=False, eps=1e-5, momentum=bn_mom, name='bn_layer_1')
    a_bn_layer_1 = mx.sym.Activation(data=bn_layer_1, act_type='relu', name='relu_a_bn_layer_1')
    # layer_2: convolution
    bn_layer_2 = mx.sym.BatchNorm(data=a_bn_layer_1, fix_gamma=True, eps=1e-5, momentum=bn_mom, name='bn_layer_2')
    conv_layer_2 = mx.sym.Convolution(data=bn_layer_2, num_filter=filter_list[1], kernel=(5, 5), stride=(1, 1),
                                      pad=(0, 0), no_bias=False, name="conv_layer_2")

    bn_layer_2_1 = mx.sym.BatchNorm(data=conv_layer_2, fix_gamma=False, eps=1e-5, momentum=bn_mom, name='bn_layer_2_1')
    a_bn_layer_2 = mx.sym.Activation(data=bn_layer_2_1, act_type='relu', name='relu_a_a_bn_layer_2')

    # pooling (subsampling) layer
    pooling_layer_2 = mx.symbol.Pooling(data=a_bn_layer_2, kernel=(5, 5), stride=(2, 2), pad=(2, 2), pool_type='max',
                                        name='pooling_layer_2')
    # fully connected layers
    fc = mx.symbol.FullyConnected(data=pooling_layer_2, num_hidden=120, flatten=True, no_bias=False, name='fc')
    bn1_fc = mx.sym.BatchNorm(data=fc, fix_gamma=False, eps=1e-5, momentum=bn_mom, name='bn1_fc')
    fc1 = mx.symbol.FullyConnected(data=bn1_fc, num_hidden=84, flatten=True, no_bias=False, name='fc1')
    bn1_fc1 = mx.sym.BatchNorm(data=fc1, fix_gamma=False, eps=1e-5, momentum=bn_mom, name='bn1_fc1')
    fc2 = mx.symbol.FullyConnected(data=bn1_fc1, num_hidden=class_num, flatten=True, no_bias=False, name='fc2')
    bn1_fc2 = mx.sym.BatchNorm(data=fc2, fix_gamma=False, eps=1e-5, momentum=bn_mom, name='bn1_fc2')
    return mx.symbol.SoftmaxOutput(data=bn1_fc2, name='softmax')
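
A quick way to confirm that the adjusted structure really accepts 28×28 inputs is to run shape inference on the symbol; the batch size of 64 below is just an assumed example:

sym = get_net(class_num=10)
arg_shapes, out_shapes, aux_shapes = sym.infer_shape(data=(64, 1, 28, 28), softmax_label=(64,))
print(out_shapes)  # [(64, 10)]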

        2. Building the LeNet-5 structure with gluon:

from mxnet.gluon import nn

def create_net():
    net = nn.Sequential()
    with net.name_scope(): 
        net.add(
            nn.BatchNorm(epsilon=1e-5, momentum=0.9),
            nn.Conv2D(channels=6, kernel_size=5, strides=2, padding=2, activation='relu'),
            nn.BatchNorm(epsilon=1e-5, momentum=0.9),
            nn.Conv2D(channels=16, kernel_size=5, strides=1, padding=0, activation='relu'),
            nn.BatchNorm(epsilon=1e-5, momentum=0.9),
            nn.AvgPool2D(pool_size=2, strides=2),
            nn.Flatten(),
            nn.BatchNorm(epsilon=1e-5, momentum=0.9),
            nn.Dense(120, activation='relu'),
            nn.BatchNorm(epsilon=1e-5, momentum=0.9),
            nn.Dense(84, activation='relu'),
            nn.BatchNorm(epsilon=1e-5, momentum=0.9),
            nn.Dense(10)
        )
    return net
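
As a quick check that the gluon definition is wired up correctly, we can initialize it and push a dummy batch through (the batch size of 4 is just an assumed example):

from mxnet import nd

net = create_net()
net.initialize()
x = nd.random.uniform(shape=(4, 1, 28, 28))
print(net(x).shape)  # (4, 10)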

        3. Building the LeNet-5 structure with MXNet ndarrays (not to be confused with numpy arrays):

from mxnet import nd

ctx = mx.cpu()  # computation device

# first convolutional layer: 6 output channels, 5x5 kernel
W1 = nd.random_normal(shape=(6, 1, 5, 5), scale=.1, ctx=ctx)
b1 = nd.zeros(W1.shape[0], ctx=ctx)

# second convolutional layer: 16 output channels, 5x5 kernel
W2 = nd.random_normal(shape=(16, 6, 5, 5), scale=.1, ctx=ctx)
b2 = nd.zeros(W2.shape[0], ctx=ctx)

# first fully connected layer (16 * 5 * 5 = 400 inputs)
W3 = nd.random_normal(shape=(400, 120), scale=.1, ctx=ctx)
b3 = nd.zeros(W3.shape[1], ctx=ctx)

# second fully connected layer
W4 = nd.random_normal(shape=(W3.shape[1], 84), scale=.1, ctx=ctx)
b4 = nd.zeros(W4.shape[1], ctx=ctx)

# third fully connected layer (output layer)
W5 = nd.random_normal(shape=(W4.shape[1], 10), scale=.1, ctx=ctx)
b5 = nd.zeros(W5.shape[1], ctx=ctx)

params = [W1, b1, W2, b2, W3, b3, W4, b4, W5, b5]

for param in params:
    param.attach_grad()  # allocate gradient buffers for autograd


def simple_batch_norm(x, eps=1e-5):
    # Simplified batch normalization used here as a stand-in for the BatchNorm layers of
    # the symbol/gluon versions: normalize over the batch (and spatial dims for conv
    # outputs), without learnable gamma/beta or running statistics.
    axes = (0,) if len(x.shape) == 2 else (0, 2, 3)
    mean = x.mean(axis=axes, keepdims=True)
    var = ((x - mean) ** 2).mean(axis=axes, keepdims=True)
    return (x - mean) / nd.sqrt(var + eps)


def net(X):
    X = X.as_in_context(W1.context)

    # batch-normalize the input
    bn_X = simple_batch_norm(X)

    # first convolutional layer (stride 2, pad 2, as in the symbol version: 28x28 -> 14x14)
    h1_conv = nd.Convolution(data=bn_X, weight=W1, bias=b1, kernel=W1.shape[2:],
                             num_filter=W1.shape[0], stride=(2, 2), pad=(2, 2))
    h1_activation = nd.relu(simple_batch_norm(h1_conv))

    # second convolutional layer (14x14 -> 10x10)
    bn_h1 = simple_batch_norm(h1_activation)
    h1_conv2 = nd.Convolution(data=bn_h1, weight=W2, bias=b2, kernel=W2.shape[2:],
                              num_filter=W2.shape[0])
    h2_activation = nd.relu(simple_batch_norm(h1_conv2))

    # pooling (subsampling) layer: 16 x 10 x 10 -> 16 x 5 x 5, flattened to 400
    pooling_layer_2 = nd.Pooling(data=h2_activation, kernel=(5, 5), stride=(2, 2),
                                 pad=(2, 2), pool_type='max')

    # flatten
    fla = nd.flatten(data=pooling_layer_2)

    # fully connected layer 1
    fullcollect_layer = nd.relu(simple_batch_norm(nd.dot(fla, W3) + b3))

    # fully connected layer 2
    fullcollect_layer_2 = nd.relu(simple_batch_norm(nd.dot(fullcollect_layer, W4) + b4))

    # fully connected layer 3 (output layer, normalized like the symbol version before softmax)
    fullcollect_layer_3 = simple_batch_norm(nd.dot(fullcollect_layer_2, W5) + b5)

    print('Network structure:')
    print('first convolutional layer:', h1_activation.shape)
    print('second convolutional layer:', h2_activation.shape)
    print('pooling layer:', pooling_layer_2.shape)
    print('first fully connected layer:', fullcollect_layer.shape)
    print('second fully connected layer:', fullcollect_layer_2.shape)
    print('output layer:', fullcollect_layer_3.shape)
    return fullcollect_layer_3
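
The function above prints its layer shapes, so the from-scratch version can be sanity-checked with a small dummy batch (the batch size of 4 is just an assumed example):

X = nd.random_normal(shape=(4, 1, 28, 28), ctx=ctx)
out = net(X)
print(out.shape)  # (4, 10)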

V. Training the network model

        With the data processed and the network built, we can feed the data into the network and train the model.

        Trick: note that networks built with different components are trained with slightly different code. Below I explain how to write the training program for each case.

        1. If the network was built with the symbol component as above, we can train it with the following program.

           1) Set the training log output format:

# check that the output paths exist
Util.check_all_path([config.saved_model_path, config.train_test_log_save_path.replace('/resnet_log.log', '')])
logger = logging.getLogger()
logging.basicConfig(level=logging.INFO,
                    format='%(message)s',
                    datefmt='%a, %d %b %Y %H:%M:%S',
                    filename=config.train_test_log_save_path,
                    filemode='w')
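
By default this configuration writes the log only to the file given by config.train_test_log_save_path. If you also want to see the training progress on the console, a common addition (not in the original code) is to attach a StreamHandler:

console = logging.StreamHandler()   # mirror the log records to the console
console.setLevel(logging.INFO)
logger.addHandler(console)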

            2) Create the data iterator objects:

train_data, train_label, test_data, test_label = get_all_avaliable_data(config.train_data_path,
                                                                        config.train_label_path,
                                                                        config.test_data_path,
                                                                        config.test_label_path)
data_train = Inter(config.batch_size, train_data, train_label)  # iterator over the training set
_eval_data = Inter(config.batch_size*2, test_data, test_label)  # iterator over the test set

            3) Train:

        

softmax_out = get_net(class_num=10, bn_mom=0.99, filter_list=[6, 16])
model = mx.mod.Module(symbol=softmax_out,
                      context=mx.cpu(),
                      data_names=['data'],
                      label_names=['softmax_label'])
model.fit(data_train,
          eval_data=_eval_data,
          optimizer='sgd',
          initializer=mx.init.Xavier(rnd_type='gaussian', factor_type='in', magnitude=2),
          eval_metric=['acc', 'ce'],
          optimizer_params={'learning_rate': config.learning_rate, 'momentum': config.momentum},
          batch_end_callback=mx.callback.Speedometer(config.batch_size, 1),
          epoch_end_callback=mx.callback.do_checkpoint(config.saved_model_path),
          num_epoch=config.num_epoch)

           4) The complete code for this part:

import logging
import mxnet as mx
from net import get_net
from tool import Inter, Test
from util import Util
import config
from lodad_data import get_all_avaliable_data

# check that the output paths exist
Util.check_all_path([config.saved_model_path, config.train_test_log_save_path.replace('/resnet_log.log', '')])
logger = logging.getLogger()
logging.basicConfig(level=logging.INFO,
                    format='%(message)s',
                    datefmt='%a, %d %b %Y %H:%M:%S',
                    filename=config.train_test_log_save_path,
                    filemode='w')

if __name__ == '__main__':
    """
      By nxg  read only  no copy and no broadcast......
    """
    softmax_out = get_net(class_num=10, bn_mom=0.99, filter_list=[6, 16])
    model = mx.mod.Module(symbol=softmax_out,
                          context=mx.cpu(),
                          data_names=['data'],
                          label_names=['softmax_label'])

    train_data, train_label, test_data, test_label = get_all_avaliable_data(config.train_data_path,
                                                                            config.train_label_path,
                                                                            config.test_data_path,
                                                                            config.test_label_path)
    data_train = Inter(config.batch_size, train_data, train_label)  # iterator over the training set
    _eval_data = Inter(config.batch_size*2, test_data, test_label)  # iterator over the test set

    model.fit(data_train,
              eval_data=_eval_data,
              optimizer='sgd',
              initializer=mx.init.Xavier(rnd_type='gaussian', factor_type='in', magnitude=2),
              eval_metric=['acc', 'ce'],
              optimizer_params={'learning_rate': config.learning_rate, 'momentum': config.momentum},
              batch_end_callback=mx.callback.Speedometer(config.batch_size, 1),
              epoch_end_callback=mx.callback.do_checkpoint(config.saved_model_path),
              num_epoch=config.num_epoch)
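
The epoch_end_callback above saves a checkpoint (a -symbol.json file plus one .params file per epoch) under the prefix config.saved_model_path. A minimal sketch of loading such a checkpoint back for inference, assuming we restore the weights from the last epoch, could look like this:

# load the symbol and weights saved by mx.callback.do_checkpoint and bind for inference
sym, arg_params, aux_params = mx.model.load_checkpoint(config.saved_model_path, config.num_epoch)
mod = mx.mod.Module(symbol=sym, context=mx.cpu(), data_names=['data'], label_names=None)
mod.bind(for_training=False, data_shapes=[('data', (1, 1, 28, 28))])
mod.set_params(arg_params, aux_params, allow_missing=True)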

            2. If the network was built with the gluon component as above, the complete training code is as follows:

import mxnet as mx
from mxnet import nd, autograd, gluon

ctx = mx.cpu()  # computation device


def accuracy(output, label):
    # fraction of samples in the batch whose predicted class matches the label
    return nd.mean(output.argmax(axis=1) == label).asscalar()


def evaluate_accuracy(_test_data, net):
    acc = 0.
    for batch in _test_data:
        # each batch holds a list with a single data ndarray and a single label ndarray
        data = batch.data[0].as_in_context(ctx)
        label = batch.label[0].as_in_context(ctx)
        output = net(data)
        acc += accuracy(output, label)
    return acc / eval_data_batch_count


def main():

    train_data, train_label, test_data, test_label = get_all_avaliable_data(config.train_data_path,
                                                                            config.train_label_path,
                                                                            config.test_data_path,
                                                                            config.test_label_path)
    data_train = Inter(config.batch_size, train_data, train_label)
    _eval_data = Inter(config.batch_size, test_data, test_label)

    global train_data_batch_count
    global eval_data_batch_count
    global train_step
    train_step = 0
    train_data_batch_count = len(train_data) // config.batch_size  # 937
    eval_data_batch_count = len(test_data) // config.batch_size  # 156
    # open the file the training log will be written to
    log = open(file='train_test_log/resnet_log.log', mode='w')
    softmax_cross_entropy_loss = gluon.loss.SoftmaxCrossEntropyLoss()
    net = create_net()
    net.initialize(ctx=ctx)  # initialize the network parameters
    trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.5})

    for epoch in range(5):
        all_train_loss = 0.
        all_train_acc = 0.
        data_train.reset()  # without this reset, the dataset would only be iterated once
        _eval_data.reset()  # without this reset, the dataset would only be iterated once
        for batch in data_train:
            train_step += 1
            data = batch.data[0].as_in_context(ctx)    # move the batch to the computation device
            label = batch.label[0].as_in_context(ctx)
            with autograd.record():
                output = net(data)
                loss = softmax_cross_entropy_loss(output, label)
            loss.backward()
            trainer.step(config.batch_size)

            train_loss = nd.mean(loss).asscalar()
            train_acc  = accuracy(output, label)
            all_train_loss += train_loss
            all_train_acc += train_acc

            log.writelines("Epoch:%d, train_step: %d, loss: %f, Train_acc: %f \n" %
                           (epoch, train_step, train_loss, train_acc))
        test_acc = evaluate_accuracy(_eval_data, net)
        log.writelines("\n\nEpoch:%d, avg_train_loss: %f, avg_train_acc: %f, Test_acc: %f \n" %
              (epoch, all_train_loss / train_data_batch_count, all_train_acc / train_data_batch_count, test_acc))


if __name__ == '__main__':
    main()
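
The gluon training loop above does not save the trained model anywhere. A minimal sketch of adding this (the file name lenet5_gluon.params is a made-up example; save_parameters/load_parameters exist in MXNet 1.3+, older versions use save_params/load_params) is to append one line at the end of main():

    net.save_parameters('lenet5_gluon.params')   # hypothetical file name

The weights can later be restored into a freshly built network with:

    net = create_net()
    net.load_parameters('lenet5_gluon.params', ctx=ctx)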
         Some of the helper functions used in the code above are not shown in full; you can download them from my GitHub, or leave a comment and I will share the complete code with you.

 3. The training log saved locally:

Epoch:0, train_step: 1, loss: 2.417782, Train_acc: 0.093750 
Epoch:0, train_step: 2, loss: 2.147448, Train_acc: 0.218750 
Epoch:0, train_step: 3, loss: 2.077140, Train_acc: 0.406250 
Epoch:0, train_step: 4, loss: 1.847961, Train_acc: 0.437500 
Epoch:0, train_step: 5, loss: 1.075216, Train_acc: 0.671875 
Epoch:0, train_step: 6, loss: 0.592741, Train_acc: 0.859375 
Epoch:0, train_step: 7, loss: 0.643913, Train_acc: 0.828125 
Epoch:0, train_step: 8, loss: 0.837896, Train_acc: 0.796875 
Epoch:0, train_step: 9, loss: 0.582398, Train_acc: 0.859375 
Epoch:0, train_step: 10, loss: 0.750824, Train_acc: 0.765625 
Epoch:0, train_step: 11, loss: 0.532329, Train_acc: 0.781250 
Epoch:0, train_step: 12, loss: 0.583528, Train_acc: 0.796875 
Epoch:0, train_step: 13, loss: 0.422033, Train_acc: 0.921875 
Epoch:0, train_step: 14, loss: 0.829014, Train_acc: 0.718750 
Epoch:0, train_step: 15, loss: 0.643326, Train_acc: 0.812500 
Epoch:0, train_step: 16, loss: 0.667152, Train_acc: 0.828125 
Epoch:0, train_step: 17, loss: 0.743936, Train_acc: 0.796875 
Epoch:0, train_step: 18, loss: 0.640609, Train_acc: 0.718750 
Epoch:0, train_step: 19, loss: 0.578947, Train_acc: 0.843750 
Epoch:0, train_step: 20, loss: 0.678622, Train_acc: 0.796875 
Epoch:0, train_step: 21, loss: 0.659916, Train_acc: 0.781250 
Epoch:0, train_step: 22, loss: 0.886372, Train_acc: 0.703125 
Epoch:0, train_step: 23, loss: 0.498017, Train_acc: 0.812500 
Epoch:0, train_step: 24, loss: 0.339886, Train_acc: 0.890625 
Epoch:0, train_step: 25, loss: 0.383869, Train_acc: 0.890625 
Epoch:0, train_step: 26, loss: 0.352800, Train_acc: 0.890625 
Epoch:0, train_step: 27, loss: 0.235351, Train_acc: 0.921875 
Epoch:0, train_step: 28, loss: 0.335911, Train_acc: 0.906250 
Epoch:0, train_step: 29, loss: 0.321678, Train_acc: 0.906250 
Epoch:0, train_step: 30, loss: 0.214269, Train_acc: 0.937500 
Epoch:0, train_step: 31, loss: 0.194405, Train_acc: 0.937500 
Epoch:0, train_step: 32, loss: 0.229423, Train_acc: 0.937500 
Epoch:0, train_step: 33, loss: 0.357825, Train_acc: 0.921875 
Epoch:0, train_step: 34, loss: 0.093697, Train_acc: 0.984375 
Epoch:0, train_step: 35, loss: 0.236372, Train_acc: 0.906250 
Epoch:0, train_step: 36, loss: 0.171640, Train_acc: 0.921875 
Epoch:0, train_step: 37, loss: 0.760929, Train_acc: 0.828125 
Epoch:0, train_step: 38, loss: 0.425227, Train_acc: 0.890625 
Epoch:0, train_step: 39, loss: 0.419191, Train_acc: 0.875000 
Epoch:0, train_step: 40, loss: 0.206767, Train_acc: 0.906250 