Dive into Deep Learning, Image Classification Dataset (3): A Concise Implementation of Softmax Regression

Joker-Tong

Category: Deep Learning. Tags: deep learning, machine learning, neural networks

Article link: https://blog.csdn.net/Weary_PJ/article/details/113791130

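This post is part of the Dive into Deep Learning image classification dataset series.

It shows how to use PyTorch to write a concise implementation of softmax regression that classifies Fashion-MNIST images.

All resources (d2lzh) can be downloaded from the end of the post.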

Import the required packages

import torch
from torch import nn
from torch.nn import init
import d2lzh as d2l

Load the dataset

batch_size = 256
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)

Define and initialize the model

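In the earlier from-scratch implementation of softmax regression, the output layer was a fully connected layer, so when defining the model here we can use nn.Linear(num_inputs, num_outputs) for that layer.

Each batch x returned by the data iterator has shape (batch_size, 1, 28, 28), so we first use view() to reshape x to (batch_size, 784) before passing it to the fully connected layer, that is, x.view(x.shape[0], -1).

x.shape[0] is batch_size.
-1 tells view() to infer the remaining dimension, which here is 1*28*28=784.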
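The reshaping step can be checked on its own. The following is a minimal sketch that uses a random tensor as a hypothetical stand-in for one Fashion-MNIST batch (the tensor and its values are assumptions, not data from the post):

```python
import torch

# Hypothetical stand-in for one Fashion-MNIST batch: 256 grayscale 28x28 images.
x = torch.rand(256, 1, 28, 28)

# Keep the batch dimension and let view() infer the rest (1 * 28 * 28 = 784).
flat = x.view(x.shape[0], -1)

print(flat.shape)  # torch.Size([256, 784])
```

Each row of flat holds the pixels of one image in row-major order, which is the layout nn.Linear(784, 10) expects.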

# Define and initialize the model

num_inputs = 784
num_outputs = 10


class LinearNet(nn.Module):
    def __init__(self, num_inputs, num_outputs):
        super(LinearNet, self).__init__()
        self.linear = nn.Linear(num_inputs, num_outputs)

    # x shape: (batch, 1, 28, 28)
    def forward(self, x):
        y = self.linear(x.view(x.shape[0], -1))
        return y


net = LinearNet(num_inputs, num_outputs)

The book defines the model more concisely.
It moves the reshaping of x into a separate class so that it can be reused later.

FlattenLayer: reshapes x to (batch_size, -1)

from collections import OrderedDict
class FlattenLayer(nn.Module):
    def __init__(self):
        super(FlattenLayer, self).__init__()

    # x shape: (batch, *, *, ...)
    def forward(self, x):
        return x.view(x.shape[0], -1)


net = nn.Sequential(
    OrderedDict([
        ('flatten', FlattenLayer()),
        ('linear', nn.Linear(num_inputs, num_outputs))
    ])
)
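Recent PyTorch versions also ship a built-in nn.Flatten layer that performs the same reshaping by default. The sketch below compares the two on a hypothetical random mini-batch; it is offered as a cross-check, not as the book's code:

```python
import torch
from torch import nn

x = torch.rand(4, 1, 28, 28)  # hypothetical mini-batch of 4 images

manual = x.view(x.shape[0], -1)  # what FlattenLayer.forward does
builtin = nn.Flatten()(x)        # built-in layer; flattens every dim after the batch dim

print(torch.equal(manual, builtin))  # True
```

If nn.Flatten is available in your PyTorch version, it could be used in place of FlattenLayer inside nn.Sequential.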

A quick note on using Sequential together with OrderedDict.

With the form below, the layers are named automatically by their index ('0', '1', ...).

# Example of using Sequential
model = nn.Sequential(
            nn.Conv2d(1,20,5),
            nn.ReLU(),
            nn.Conv2d(20,64,5),
            nn.ReLU()
          )

With an OrderedDict, you name each layer yourself.

# Example of using Sequential with OrderedDict
model = nn.Sequential(OrderedDict([
          ('conv1', nn.Conv2d(1,20,5)),
          ('relu1', nn.ReLU()),
          ('conv2', nn.Conv2d(20,64,5)),
          ('relu2', nn.ReLU())
        ]))
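To see the difference between the two forms, one can list the child module names. This is a small sketch with shortened two-layer models (the variable names unnamed and named are illustrative only):

```python
from collections import OrderedDict
from torch import nn

unnamed = nn.Sequential(nn.Conv2d(1, 20, 5), nn.ReLU())
named = nn.Sequential(OrderedDict([
    ('conv1', nn.Conv2d(1, 20, 5)),
    ('relu1', nn.ReLU()),
]))

unnamed_names = [name for name, _ in unnamed.named_children()]
named_names = [name for name, _ in named.named_children()]
print(unnamed_names)  # ['0', '1']
print(named_names)    # ['conv1', 'relu1']

# Named layers are reachable as attributes; integer indexing works in both cases.
first_layer_matches = named.conv1 is named[0]
```

This is why the post can write net.linear.weight later on: the layer was registered under the name 'linear'.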

Initialize the weight parameters

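PyTorch provides several parameter initialization methods in the init module.
Here init is short for initializer.
We use init.normal_ to initialize every element of the weight to a random sample from a normal distribution with mean 0 and standard deviation 0.01. The bias is set to 0 explicitly with init.constant_, since PyTorch's nn.Linear does not zero-initialize it by default.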

# Initialize the weight parameters
init.normal_(net.linear.weight, mean=0, std=0.01)
init.constant_(net.linear.bias, val=0)
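The effect of the two init calls can be inspected on a standalone layer. The sketch below uses a fresh nn.Linear(784, 10) and a fixed seed, both chosen here only for illustration:

```python
import torch
from torch import nn
from torch.nn import init

torch.manual_seed(0)  # fixed seed so the sketch is repeatable
linear = nn.Linear(784, 10)

init.normal_(linear.weight, mean=0, std=0.01)  # in-place: weights ~ N(0, 0.01^2)
init.constant_(linear.bias, val=0)             # in-place: all biases set to 0

weight_std = linear.weight.std().item()
bias_sum = linear.bias.abs().sum().item()
print(linear.weight.shape)  # torch.Size([10, 784])
print(weight_std)           # close to 0.01
print(bias_sum)             # 0.0
```

The trailing underscore in init.normal_ and init.constant_ marks them as in-place operations, so no reassignment is needed.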

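Loss function: CrossEntropyLoss()

CrossEntropyLoss() combines the softmax operation and the cross-entropy loss computation in a single step, which is more numerically stable than computing them separately.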

loss = nn.CrossEntropyLoss()
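Because the softmax is folded into the loss, the network should output raw scores (logits) rather than probabilities. As a rough check, the sketch below compares nn.CrossEntropyLoss with an explicit log-softmax followed by the mean negative log-likelihood; the logits and labels are made-up values, with one row chosen to be large in magnitude:

```python
import torch
from torch import nn
import torch.nn.functional as F

# Hypothetical logits for 3 samples and 4 classes, plus their true labels.
logits = torch.tensor([[2.0, 1.0, 0.1, -1.0],
                       [0.5, 2.5, 0.3, 0.0],
                       [100.0, 0.0, -100.0, 50.0]])
labels = torch.tensor([0, 1, 3])

fused = nn.CrossEntropyLoss()(logits, labels)

# Same quantity in two steps: log-softmax, then mean negative log-likelihood.
log_probs = F.log_softmax(logits, dim=1)
two_step = -log_probs[torch.arange(len(labels)), labels].mean()

print(torch.allclose(fused, two_step))  # True
```

Working in log space avoids exponentiating large logits directly, which is where a naive softmax followed by log tends to overflow or underflow.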

Optimizer

optimizer = torch.optim.SGD(net.parameters(), lr=0.1)
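With no momentum or weight decay, one SGD step subtracts lr times the gradient from each parameter. The following toy sketch (a two-element parameter and a quadratic loss, both invented for illustration) shows a single update with lr=0.1:

```python
import torch

w = torch.tensor([1.0, -2.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

toy_loss = (w ** 2).sum()  # toy loss; its gradient is 2 * w
optimizer.zero_grad()      # clear any stale gradients
toy_loss.backward()        # w.grad is now [2.0, -4.0]
optimizer.step()           # w <- w - lr * w.grad

print(w.detach())  # approximately [0.8, -1.6]
```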

Train the model

num_epochs = 5
d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size, None, None, optimizer)
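The post does not show what d2l.train_ch3 does internally. As a rough idea of the kind of loop such a helper runs, here is a generic sketch on synthetic data; it is not the d2lzh implementation, and the random inputs, random labels, and full-batch updates are assumptions made so that it runs without downloading Fashion-MNIST:

```python
import torch
from torch import nn

torch.manual_seed(0)

# Synthetic stand-in for one batch of images and labels (not real Fashion-MNIST data).
X = torch.randn(256, 1, 28, 28)
y = torch.randint(0, 10, (256,))

net = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))
loss = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)

losses = []
for epoch in range(5):
    l = loss(net(X), y)    # forward pass and mean loss over the batch
    optimizer.zero_grad()  # reset gradients from the previous step
    l.backward()           # backpropagate
    optimizer.step()       # SGD update
    losses.append(l.item())

# Fraction of samples whose highest-scoring class matches the label.
train_acc = (net(X).argmax(dim=1) == y).float().mean().item()
print(losses[0], losses[-1], train_acc)
```

A real training run would iterate over train_iter in mini-batches each epoch and evaluate accuracy on test_iter as well.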

Full code

# -*- coding: utf-8 -*-
# @Time    : 2021/2/11 17:29
# @Author  : JokerTong
# @File    : 图像分类数据集(三) softmax回归的简洁实现.py
import torch
from torch import nn
from torch.nn import init
import d2lzh as d2l
from collections import OrderedDict

# Load the dataset
batch_size = 256
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)
# Define and initialize the model

num_inputs = 784
num_outputs = 10


class FlattenLayer(nn.Module):
    def __init__(self):
        super(FlattenLayer, self).__init__()

    # x shape: (batch, *, *, ...)
    def forward(self, x):
        return x.view(x.shape[0], -1)


net = nn.Sequential(
    OrderedDict([
        ('flatten', FlattenLayer()),
        ('linear', nn.Linear(num_inputs, num_outputs))
    ])
)

# Initialize the weight parameters
init.normal_(net.linear.weight, mean=0, std=0.01)
init.constant_(net.linear.bias, val=0)
# Loss function
loss = nn.CrossEntropyLoss()
# Optimization algorithm
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)
# Train the model
num_epochs = 5
d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size, None, None, optimizer)

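References

This post is based on Dr. Wu Zhenyu's GitHub project, which reorganizes the code from the Chinese edition of Dive into Deep Learning and implements it in PyTorch.
[Deep Learning] The PyTorch implementation of Mu Li's Dive into Deep Learning is complete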
