NumPy Batch Training

Preface

  In the previous chapter, "PyTorch Neural Networks", I said we would look into optimizers. Before we get to that chapter, though, we need to talk about batch training. The starknn module that appears in the code below is the one we wrote in "NumPy Neural Networks".
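  For readers who don't have the previous chapter at hand, here is the starknn interface as it is used in this post. The signatures below are inferred from how the functions are called in this article, not copied from the original source:

# starknn interface as used here (inferred, see "NumPy Neural Networks" for the real code):
# idx2onehot(y)                                                -> one-hot label matrix
# init_layers(nn_cfg, seed)                                    -> initial parameter dict
# forward_full_layer(X, params, nn_cfg)                        -> (Y_hat, memory)
# calc_accuracy(Y_hat, Y, train=...)                           -> accuracy
# calc_cost(Y_hat, Y)                                          -> cost
# full_backward_propagation(Y_hat, Y, memory, params, nn_cfg)  -> grads
# update(params, grads, nn_cfg, learning_rate)                 -> updated params
# train(X, Y, nn_cfg, epochs, learning_rate)                   -> (params, acc_history, cost_history)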

import numpy as np
import time
from sklearn.datasets import make_moons
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
import starknn  # import our own module from the previous chapter
# model / network configuration
nn_cfg = [{"in_features": 2,  "out_features": 25, "activation": "relu"},    # (2, 25)
          {"in_features": 25, "out_features": 50, "activation": "relu"},    # (25, 50)
          {"in_features": 50, "out_features": 50, "activation": "relu"},    # (50, 50)
          {"in_features": 50, "out_features": 25, "activation": "relu"},    # (50, 25)
          {"in_features": 25, "out_features": 2,  "activation": "sigmoid"}] # (25, 2)
# prepare the data
X, y = make_moons(n_samples=1000, noise=0.3)  # features and labels
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=42)  # split the dataset, train:test = 9:1
y_train = starknn.idx2onehot(y_train)  # convert integer labels to one-hot encoding
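  As a quick reminder of what idx2onehot does (its actual implementation lives in starknn), a minimal sketch of such a conversion might look like this; idx2onehot_sketch is a hypothetical stand-in, not the function from the previous chapter:

def idx2onehot_sketch(y, num_classes=None):
    # infer the number of classes from the labels if not given
    if num_classes is None:
        num_classes = int(y.max()) + 1
    onehot = np.zeros((y.shape[0], num_classes))
    onehot[np.arange(y.shape[0]), y] = 1  # set the column of each label to 1
    return onehot

# e.g. idx2onehot_sketch(np.array([0, 1, 1])) -> [[1, 0], [0, 1], [0, 1]]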

1. Training on the Full Dataset

start = time.time()
params, acc_history, cost_history = starknn.train(x_train, y_train, nn_cfg, 10000, 0.01)
end = time.time()
print('The full data time is {:.2f} second.'.format(end-start))
# evaluate on the test set
y_hat, _ = starknn.forward_full_layer(x_test, params, nn_cfg)
test_accuracy = starknn.calc_accuracy(y_hat, y_test, train=False)
print('The accuracy of this test dataset is {}%.'.format(test_accuracy * 100))
The full data time is 48.71 second.
The accuracy of this test dataset is 93.0%.

  The training above feeds the entire dataset to the network in every iteration, which is actually not ideal for how the model learns from the data. As the old saying goes, eat only until you are seven-tenths full; in the same spirit, let's try batch training.

batch_size = 32  # batch size
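  A quick sanity check of the batch arithmetic before we use it: with the 900 training samples from the 9:1 split, floor division yields 28 full batches, and the remaining 4 samples are simply never visited by the slicing in the functions below.

num_batch = x_train.shape[0] // batch_size  # 900 // 32 = 28 full batches
print(num_batch * batch_size)               # 896, so the last 4 samples are dropped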

2. Batch Training (1)

# mini-batch training: one batch per iteration
def batch_train(X, Y, nn_cfg, epochs, learning_rate, batch_size, train=True):
    params = starknn.init_layers(nn_cfg, 2)
    num_batch = X.shape[0] // batch_size  # number of full batches (floor division)
    acc_history = []
    cost_history = []
    for i in range(epochs):
        offset_idx = i % num_batch  # cycle through the batches, one per iteration
        X_batch = X[offset_idx * batch_size: (offset_idx + 1) * batch_size, :]
        Y_batch = Y[offset_idx * batch_size: (offset_idx + 1) * batch_size, :]
        # forward pass
        Y_hat, memory = starknn.forward_full_layer(X_batch, params, nn_cfg)
        # compute accuracy
        accuracy = starknn.calc_accuracy(Y_hat, Y_batch, train=train)
        # compute loss
        cost = starknn.calc_cost(Y_hat, Y_batch)
        acc_history.append(accuracy)
        cost_history.append(cost)
        # backward pass
        grads = starknn.full_backward_propagation(Y_hat, Y_batch, memory, params, nn_cfg)
        # update parameters
        params = starknn.update(params, grads, nn_cfg, learning_rate)
    return params, acc_history, cost_history
start = time.time()
params, acc_history, cost_history = batch_train(x_train, y_train, nn_cfg, 10000, 0.01, batch_size)
end = time.time()
print('The mini batch data time is {:.2f} second.'.format(end-start))
# evaluate on the test set
y_hat, _ = starknn.forward_full_layer(x_test, params, nn_cfg)
test_accuracy = starknn.calc_accuracy(y_hat, y_test, train=False)
print('The accuracy of this test dataset is {}%.'.format(test_accuracy * 100))
The mini batch data time is 32.16 second.
The accuracy of this test dataset is 91.0%.

3. Batch Training (2)

def another_batch(X, Y, nn_cfg, epochs, learning_rate, batch_size, train=True):
    params = starknn.init_layers(nn_cfg, 2)
    num_batch = X.shape[0] // batch_size  # number of full batches (floor division)
    acc_history = []
    cost_history = []
    for epoch in range(epochs):
        for offset_idx in range(num_batch):  # one full pass over the batches per epoch
            X_batch = X[offset_idx * batch_size: (offset_idx + 1) * batch_size, :]
            Y_batch = Y[offset_idx * batch_size: (offset_idx + 1) * batch_size, :]
            # forward pass
            Y_hat, memory = starknn.forward_full_layer(X_batch, params, nn_cfg)
            # compute accuracy
            accuracy = starknn.calc_accuracy(Y_hat, Y_batch, train=train)
            # compute loss
            cost = starknn.calc_cost(Y_hat, Y_batch)
            acc_history.append(accuracy)
            cost_history.append(cost)
            # backward pass
            grads = starknn.full_backward_propagation(Y_hat, Y_batch, memory, params, nn_cfg)
            # update parameters
            params = starknn.update(params, grads, nn_cfg, learning_rate)
    return params, acc_history, cost_history
start = time.time()
params, acc_history, cost_history = another_batch(x_train, y_train, nn_cfg, 10000//(x_train.shape[0]//batch_size), 0.01, batch_size)
end = time.time()
print('The another batch data time is {:.2f} second.'.format(end-start))
# evaluate on the test set
y_hat, _ = starknn.forward_full_layer(x_test, params, nn_cfg)
test_accuracy = starknn.calc_accuracy(y_hat, y_test, train=False)
print('The accuracy of this test dataset is {}%.'.format(test_accuracy * 100))
The another batch data time is 34.70 second.
The accuracy of this test dataset is 93.0%.

  The batch training in (1) and (2) is identical; (2) is just a different way of writing it. Note that the epochs argument passed in (2) is no longer 10000 but 10000 floor-divided by num_batch, so the total number of parameter updates stays roughly the same. In PyTorch I generally use style (2).
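  To check that the two schedules really perform about the same number of parameter updates:

num_batch = x_train.shape[0] // batch_size  # 900 // 32 = 28
epochs_2 = 10000 // num_batch               # 10000 // 28 = 357
print(epochs_2 * num_batch)                 # 9996 updates, close to the 10000 in (1)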

Summary

  All three methods give similar results, and the batch training methods take less time: each update multiplies a (batch_size, 2) batch instead of the full (900, 2) training matrix, so every step involves far less computation and runs faster. OK, everything is in place. The next chapter will finally get to optimizers, so stay tuned.
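  One caveat about both batch versions above: they walk the batches in the same fixed order on every pass. A common refinement, not part of the code in this post, is to shuffle the training set each epoch; here is a minimal sketch, assuming the same X/Y layout as above:

def shuffled_batches(X, Y, batch_size):
    # yield mini-batches in a fresh random order on every call
    perm = np.random.permutation(X.shape[0])
    for start in range(0, X.shape[0] - batch_size + 1, batch_size):
        idx = perm[start: start + batch_size]
        yield X[idx], Y[idx]

  The inner loop of another_batch could then iterate over shuffled_batches(X, Y, batch_size) instead of slicing fixed ranges.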
