[Dive into Deep Learning | Chapter 3: Linear Neural Networks] Dive into Deep Learning (Mu Li) 3.7 Concise Implementation of Softmax Regression (code with comments)

Article link: https://blog.csdn.net/weixin_49191101/article/details/131626460

Reading the Data

Import the packages, set the minibatch size to 256, and load the Fashion-MNIST training and test iterators.

import torch
from torch import nn
from d2l import torch as d2l

batch_size = 256
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)

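Initializing Model Parameters

# PyTorch does not implicitly reshape inputs, so we place
# a flatten layer before the linear layer to reshape the network input
net = nn.Sequential(nn.Flatten(), nn.Linear(784, 10)) # nn.Linear: linear layer with 784 inputs and 10 outputs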

def init_weights(m):
    if type(m) == nn.Linear:
        nn.init.normal_(m.weight, std=0.01)

net.apply(init_weights);

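nn.Sequential takes its modules as positional arguments and chains them in order; here there are two. The first, nn.Flatten(), keeps the batch dimension and flattens the remaining dimensions, so a batch of images with shape (batch_size, 1, 28, 28) becomes a 2-D tensor of shape (batch_size, 784). The second, nn.Linear(784, 10), is a linear layer with input size 784 (the length of a flattened image) and output size 10 (one score per class).
Next we define a weight-initialization function, init_weights. It takes a module m and checks whether it is a linear layer (nn.Linear). If it is, the layer's weights are drawn from a normal distribution with mean 0 (the default of nn.init.normal_) and standard deviation 0.01. The bias is not touched and keeps PyTorch's default initialization.
Finally, net.apply(init_weights) calls init_weights on every submodule of net and on net itself. Only the nn.Linear layer passes the type check, so only its weights are re-initialized from the specified normal distribution.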
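The shape changes and the initialization described above can be checked with a small sketch. The input X below is a hypothetical stand-in for one minibatch (random values, not real Fashion-MNIST images), and isinstance is used in place of the type comparison; the behavior for this network is the same.

```python
import torch
from torch import nn

# Hypothetical stand-in for a minibatch: 4 single-channel 28x28 "images"
# filled with random values (an assumption for illustration, not real data).
X = torch.rand(4, 1, 28, 28)

net = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))

def init_weights(m):
    # Only linear layers have their weights re-initialized; biases are left alone.
    if isinstance(m, nn.Linear):
        nn.init.normal_(m.weight, std=0.01)

net.apply(init_weights)

flat = net[0](X)   # nn.Flatten keeps dim 0 (the batch) and merges the rest
logits = net(X)    # one row of 10 unnormalized scores per image

print(flat.shape)            # torch.Size([4, 784])
print(logits.shape)          # torch.Size([4, 10])
print(net[1].weight.shape)   # torch.Size([10, 784])
```

Note that the weight matrix is stored as (out_features, in_features), which is why its shape is (10, 784) rather than (784, 10).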

Loss Function

loss = nn.CrossEntropyLoss(reduction='none')

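nn.CrossEntropyLoss is a loss function for multi-class classification. It expects raw, unnormalized scores (logits) and applies softmax internally, which is why net has no explicit softmax layer. Its constructor accepts several optional arguments, including weight and reduction.
Here reduction is set to 'none', which means the per-sample losses are neither averaged nor summed (the default is 'mean'). For logits of shape (batch_size, 10) and labels of shape (batch_size,), it returns a tensor of shape (batch_size,), with one loss value per sample.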
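To make the effect of reduction concrete, here is a minimal sketch on made-up inputs. The logits and labels are hypothetical values chosen for illustration; the manual line writes the same loss out as the negative log-softmax score of the true class.

```python
import torch
from torch import nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)            # hypothetical scores: 4 samples, 10 classes
labels = torch.tensor([0, 3, 9, 3])    # hypothetical class indices

per_sample = nn.CrossEntropyLoss(reduction='none')(logits, labels)
mean_loss = nn.CrossEntropyLoss()(logits, labels)   # default reduction='mean'

# The same quantity written out: -log softmax(logits)[i, labels[i]]
manual = -F.log_softmax(logits, dim=1)[torch.arange(4), labels]

print(per_sample.shape)                              # torch.Size([4])
print(torch.allclose(per_sample, manual))            # True
print(torch.allclose(per_sample.mean(), mean_loss))  # True
```

Keeping the per-sample values lets the training code choose how to reduce them, for example taking the mean for the backward pass while summing them for logging.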

Optimization Algorithm

We use minibatch stochastic gradient descent with a learning rate of 0.1 as the optimization algorithm.

trainer = torch.optim.SGD(net.parameters(), lr=0.1)
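As a rough illustration of what one step of this optimizer does, the sketch below applies plain SGD (no momentum or weight decay, matching the call above) to a toy parameter. The parameter w and the quadratic loss are assumptions made up for the example; the update is w - lr * grad.

```python
import torch

# Toy parameter and loss, chosen only so the gradient is easy to check by hand.
w = torch.tensor([1.0, -2.0], requires_grad=True)
trainer = torch.optim.SGD([w], lr=0.1)

toy_loss = (w ** 2).sum()    # gradient with respect to w is 2 * w
trainer.zero_grad()          # clear any stale gradients
toy_loss.backward()          # w.grad is now [2.0, -4.0]

expected = w.detach() - 0.1 * w.grad   # the update computed by hand
trainer.step()                         # the same update applied in place

print(w.detach())                            # approximately [0.8, -1.6]
print(torch.allclose(w.detach(), expected))  # True
```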

Training

num_epochs = 10 # number of epochs (full passes over the training set)
d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, trainer)
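For readers without the d2l package, the following is a simplified sketch of what a training loop of this kind might look like. It is not the d2l source: train_epoch is a hypothetical helper, the data is zero-centered random noise standing in for Fashion-MNIST, and evaluation and plotting are omitted.

```python
import torch
from torch import nn

torch.manual_seed(0)
# Hypothetical synthetic data: 256 zero-centered random "images" with random labels.
X = torch.rand(256, 1, 28, 28) - 0.5
y = torch.randint(0, 10, (256,))

net = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))
loss = nn.CrossEntropyLoss(reduction='none')
trainer = torch.optim.SGD(net.parameters(), lr=0.1)

def train_epoch(net, batches, loss, trainer):
    """One pass over the data; returns the average per-sample loss."""
    net.train()
    total_loss, n = 0.0, 0
    for Xb, yb in batches:
        l = loss(net(Xb), yb)     # shape (batch,), because reduction='none'
        trainer.zero_grad()
        l.mean().backward()       # reduce to a scalar before backpropagating
        trainer.step()
        total_loss += l.sum().item()
        n += yb.numel()
    return total_loss / n

# Four minibatches of 64 samples each.
batches = [(X[i:i + 64], y[i:i + 64]) for i in range(0, 256, 64)]
history = [train_epoch(net, batches, loss, trainer) for _ in range(10)]
print(history[0], history[-1])    # the average loss should fall across epochs
```

Because the loss was built with reduction='none', the loop has to reduce it to a scalar itself (here with .mean()) before calling backward.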
