Some notes before we begin
- This is the lab report for Experiment 6 of the HBU Neural Networks and Deep Learning course (Fall 2022). Its core content follows [1] Github/前馈神经网络-上.ipynb; please look up references by the corresponding index number.
- This report draws on parts of "HBU-NNDL 实验五 前馈神经网络(2)自动梯度计算 & 优化问题" by 不是蒋承翰.
- The experiments are written in Python 3.10, using PyCharm.
- Heading levels in this report follow the order: 一、 1. (1)
- My knowledge is limited and errors are hard to avoid; corrections are welcome.
一、Automatic gradient computation and predefined operators
Although modularization lets us assemble a neural network fairly cleanly, computing gradients for each module by hand is still tedious and error-prone. Deep learning frameworks ship with automatic gradient computation built in, so we can focus on the model architecture and no longer need to spend effort deriving gradients.
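As a minimal standalone illustration of automatic gradient computation (not part of the experiment's code): mark a tensor with requires_grad=True, build a computation on it, and backward() fills in the gradient.

import torch

# y = x^2 + 3x, so dy/dx = 2x + 3; at x = 2.0 the gradient is 7.0
x = torch.tensor(2.0, requires_grad=True)
y = x ** 2 + 3 * x
y.backward()   # autograd computes dy/dx
print(x.grad)  # tensor(7.)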
1. Reimplementing the feedforward network with predefined operators
Below we use PyTorch's predefined operators to reimplement the binary classification task. The main predefined operator used is torch.nn.Linear:
class torch.nn.Linear(in_features, out_features, bias=True, device=None, dtype=None)
The torch.nn.Linear operator accepts an input tensor of shape [batch_size, *, in_features], where "*" denotes any number of additional dimensions, applies an affine transformation y = xWᵀ + b, and produces an output tensor of shape [batch_size, *, out_features]. Note that PyTorch stores the weight attribute with shape [out_features, in_features]. The operator has a bias parameter by default; pass bias=False to create a layer without one.
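To make the shape behavior concrete, here is a minimal check (the sizes are made up for illustration):

import torch

fc = torch.nn.Linear(in_features=4, out_features=3)
x = torch.randn(8, 5, 4)    # [batch_size, *, in_features], with one extra dimension
y = fc(x)
print(y.shape)              # torch.Size([8, 5, 3])
print(fc.weight.shape)      # torch.Size([3, 4]), i.e. [out_features, in_features]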
The code is implemented as follows:
import torch.nn
import torch.nn.functional as F
from torch.nn.init import constant_, normal_

class Model_MLP_L2_V2(torch.nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(Model_MLP_L2_V2, self).__init__()
        # Define the linear layers with 'torch.nn.Linear'.
        # The first argument (in_features) is the layer's input dimension;
        # the second argument (out_features) is its output dimension.
        # Weights are initialized from a Gaussian distribution with 'torch.nn.init.normal_';
        # biases are initialized to a constant with 'torch.nn.init.constant_'.
        self.fc1 = torch.nn.Linear(input_size, hidden_size)
        normal_(tensor=self.fc1.weight, mean=0., std=1.)
        constant_(tensor=self.fc1.bias, val=0.0)
        self.fc2 = torch.nn.Linear(hidden_size, output_size)
        normal_(tensor=self.fc2.weight, mean=0., std=1.)
        constant_(tensor=self.fc2.bias, val=0.0)
        # Use 'torch.nn.functional.sigmoid' as the Logistic activation function
        self.act_fn = F.sigmoid

    # Forward computation
    def forward(self, inputs):
        z1 = self.fc1(inputs)
        a1 = self.act_fn(z1)
        z2 = self.fc2(a1)
        a2 = self.act_fn(z2)
        return a2
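A quick forward-pass smoke test of the model (a minimal sketch with made-up sizes):

import torch

model = Model_MLP_L2_V2(input_size=2, hidden_size=5, output_size=1)
x = torch.randn(4, 2)  # a batch of 4 two-dimensional samples
print(model(x).shape)  # torch.Size([4, 1]); outputs lie in (0, 1) after the sigmoid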
2. Completing the Runner class
Building on the RunnerV2_1 class from the previous section, the RunnerV2_2 class in this section uses automatic gradient computation during training; it saves the model by obtaining the parameters with state_dict, and loads the model by restoring them with load_state_dict.
import torch

class RunnerV2_2(object):
    def __init__(self, model, optimizer, metric, loss_fn, **kwargs):
        self.model = model
        self.optimizer = optimizer
        self.loss_fn = loss_fn
        self.metric = metric
        # Record the evaluation metric over the course of training
        self.train_scores = []
        self.dev_scores = []
        # Record the loss over the course of training
        self.train_loss = []
        self.dev_loss = []

    def train(self, train_set, dev_set, **kwargs):
        # Number of training epochs; defaults to 0 if not given
        num_epochs = kwargs.get("num_epochs", 0)
        # Logging frequency; defaults to 100 if not given
        log_epochs = kwargs.get("log_epochs", 100)
        # Model save path; defaults to "best_model.pdparams" if not given
        save_path = kwargs.get("save_path", "best_model.pdparams")
        # Custom logging function; defaults to None if not given
        custom_print_log = kwargs.get("custom_print_log", None)
        # Track the best metric seen so far
        best_score = 0
        # Train for num_epochs epochs
        for epoch in range(num_epochs):
            # Switch the model to training mode
            # (evaluate() switches it to eval mode at the end of each epoch)
            self.model.train()
            X, y = train_set
            # Get model predictions
            logits = self.model(X)
            # Compute the cross-entropy loss
            trn_loss = self.loss_fn(logits, y)
            self.train_loss.append(trn_loss.item())
            # Compute the evaluation metric
            trn_score = self.metric(logits, y).item()
            self.train_scores.append(trn_score)
            # Compute parameter gradients automatically
            trn_loss.backward()
            if custom_print_log is not None:
                # Print the gradients of each layer
                custom_print_log(self)
            # Update the parameters
            self.optimizer.step()
            # Clear the gradients
            self.optimizer.zero_grad()
            dev_score, dev_loss = self.evaluate(dev_set)
            # If the current metric is the best so far, save the model
            if dev_score > best_score:
                self.save_model(save_path)
                print(f"[Evaluate] best accuracy performance has been updated: {best_score:.5f} --> {dev_score:.5f}")
                best_score = dev_score
            if log_epochs and epoch % log_epochs == 0:
                print(f"[Train] epoch: {epoch}/{num_epochs}, loss: {trn_loss.item()}")

    # Model evaluation: use 'torch.no_grad()' so no gradients are computed or stored
    @torch.no_grad()
    def evaluate(self, data_set):
        # Switch the model to evaluation mode
        self.model.eval()
        X, y = data_set
        # Compute the model output
        logits = self.model(X)
        # Compute the loss
        loss = self.loss_fn(logits, y).item()
        self.dev_loss.append(loss)
        # Compute the evaluation metric
        score = self.metric(logits, y).item()
        self.dev_scores.append(score)
        return score, loss

    # Model testing: use 'torch.no_grad()' so no gradients are computed or stored
    @torch.no_grad()
    def predict(self, X):
        # Switch the model to evaluation mode
        self.model.eval()
        return self.model(X)

    # Save the model parameters obtained via 'model.state_dict()'
    def save_model(self, saved_path):
        torch.save(self.model.state_dict(), saved_path)

    # Load the model parameters with 'model.load_state_dict()'
    def load_model(self, model_path):
        state_dict = torch.load(model_path)
        self.model.load_state_dict(state_dict)
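As a quick check that save_model and load_model round-trip correctly (a minimal sketch; it assumes a RunnerV2_2 instance named runner, as created in the next subsection):

# Save the current parameters, reload them into a fresh model, and compare
runner.save_model("best_model.pdparams")
model2 = Model_MLP_L2_V2(input_size=2, hidden_size=5, output_size=1)
model2.load_state_dict(torch.load("best_model.pdparams"))
for (n1, p1), (n2, p2) in zip(runner.model.named_parameters(), model2.named_parameters()):
    assert n1 == n2 and torch.equal(p1, p2)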
3. Model training
Instantiate the RunnerV2_2 class and pass in the training configuration. The code is implemented as follows:
from metric import accuracy

# Set up the model
input_size = 2
hidden_size = 5
output_size = 1
model = Model_MLP_L2_V2(input_size=input_size, hidden_size=hidden_size, output_size=output_size)
# Set the loss function
loss_fn = F.binary_cross_entropy
# Set the optimizer
learning_rate = 0.2
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
# Set the evaluation metric
metric = accuracy
# Other parameters
epoch_num = 1000
saved_path = 'best_model.pdparams'
# Instantiate the RunnerV2_2 class and pass in the training configuration
runner = RunnerV2_2(model, optimizer, metric, loss_fn)
runner.train([X_train, y_train], [X_dev, y_dev], num_epochs=epoch_num, log_epochs=50, save_path=saved_path)
The code that produces X_train, y_train, X_dev, and y_dev is given in the previous experiment.
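If you want to run this subsection on its own, a stand-in metric and dataset can be substituted (a hypothetical sketch: the original data and metric module come from the previous experiment, and the placeholder below only approximates them; the 640/160 split matches the granularity of the dev scores in the log):

import torch

def accuracy(preds, labels):
    # Binary accuracy: threshold the sigmoid outputs at 0.5
    preds = (preds >= 0.5).to(torch.float32)
    return (preds == labels).to(torch.float32).mean()

# A toy non-linearly-separable binary dataset as a placeholder
torch.manual_seed(0)
X = torch.randn(1000, 2)
y = ((X[:, 0] ** 2 + X[:, 1] ** 2) < 1).to(torch.float32).unsqueeze(-1)
X_train, y_train = X[:640], y[:640]
X_dev, y_dev = X[640:800], y[640:800]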
Output of the code:
[Evaluate] best accuracy performance has been updated: 0.00000 --> 0.51875
[Train] epoch: 0/1000, loss: 0.8497516512870789
[Evaluate] best accuracy performance has been updated: 0.51875 --> 0.53750
[Evaluate] best accuracy performance has been updated: 0.53750 --> 0.56875
[Evaluate] best accuracy performance has been updated: 0.56875 --> 0.58125
[Evaluate] best accuracy performance has been updated: 0.58125 --> 0.58750
[Evaluate] best accuracy performance has been updated: 0.58750 --> 0.59375
[Evaluate] best accuracy performance has been updated: 0.59375 --> 0.60000
[Evaluate] best accuracy performance has been updated: 0.60000 --> 0.61250
[Evaluate] best accuracy performance has been updated: 0.61250 --> 0.61875
[Evaluate] best accuracy performance has been updated: 0.61875 --> 0.62500
[Evaluate] best accuracy performance has been updated: 0.62500 --> 0.63125
[Evaluate] best accuracy performance has been updated: 0.63125 --> 0.65000
[Evaluate] best accuracy performance has been updated: 0.65000 --> 0.65625
[Evaluate] best accuracy performance has been updated: 0.65625 --> 0.66250
[Evaluate] best accuracy performance has been updated: 0.66250 --> 0.66875
[Evaluate] best accuracy performance has b