Mu Li's Dive into Deep Learning: Multilayer Perceptron

T_FLY1999

Last modified 2023-02-04 16:28:11


First published 2023-02-04 16:25:25

Article link: https://blog.csdn.net/qq_52410284/article/details/128882666


The Perceptron

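The perceptron is defined by the formula below. It amounts to wrapping a linear regression model in one more function, which turns it into a binary classifier:

$o = \sigma(\langle \mathbf{w}, \mathbf{x} \rangle + b), \quad \sigma(z) = \begin{cases} 1 & \text{if } z > 0 \\ -1 & \text{otherwise} \end{cases}$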

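During training, the perceptron uses a different loss from the squared-error loss of linear regression. It is defined as:

$\ell(y, \mathbf{x}, \mathbf{w}) = \max(0, -y(\langle \mathbf{w}, \mathbf{x} \rangle + b))$

The perceptron predicts class 1 when the output $\langle \mathbf{w}, \mathbf{x} \rangle + b$ is greater than 0 and class -1 when it is less than 0. For a correct prediction, the product of the output and the true label is therefore positive and the loss is 0. For a wrong prediction, the product is negative and the loss is the negative of that product.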

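Applying gradient descent to this loss gives the update rule. For a misclassified sample, where $y(\langle \mathbf{w}, \mathbf{x} \rangle + b) \le 0$, the gradient of the loss is $-y\mathbf{x}$ with respect to $\mathbf{w}$ and $-y$ with respect to $b$, so with a learning rate written here as $\eta$:

$\mathbf{w} \leftarrow \mathbf{w} + \eta y \mathbf{x}, \quad b \leftarrow b + \eta y$

Correctly classified samples have zero loss and leave the parameters unchanged.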
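To make the loss and the update concrete, here is a minimal sketch in plain Python (no PyTorch needed). The toy data set (logical AND with labels in {-1, +1}), the learning rate `lr`, the epoch cap `max_epochs`, and all function names are assumptions chosen for illustration; they do not come from the course code.

```python
# Minimal perceptron sketch in plain Python (labels are +1 / -1).
# The toy data, learning rate, and epoch cap are illustrative assumptions.

def predict(w, b, x):
    """Return the raw output <w, x> + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def perceptron_loss(w, b, x, y):
    """max(0, -y * (<w, x> + b)): zero when the sample is classified correctly."""
    return max(0.0, -y * predict(w, b, x))

def train_perceptron(data, lr=1.0, max_epochs=100):
    """Apply w += lr * y * x and b += lr * y whenever y * (<w, x> + b) <= 0."""
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in data:
            if y * predict(w, b, x) <= 0:  # misclassified or on the boundary
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
                mistakes += 1
        if mistakes == 0:  # every sample is on the correct side
            break
    return w, b

# Linearly separable toy data: logical AND with labels in {-1, +1}.
and_data = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
w, b = train_perceptron(and_data)
and_preds = [1 if predict(w, b, x) > 0 else -1 for x, _ in and_data]
```

Because AND is linearly separable, the loop stops once a full pass makes no mistakes, and `and_preds` then matches the labels.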

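However, because the perceptron is a linear model, it cannot solve the XOR problem, and this contributed to a period of stagnation in the field. Multilayer perceptrons and activation functions came later; they turn the perceptron into a nonlinear model and resolve this limitation.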
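As a small illustration of why the hidden layer and the activation matter, the sketch below builds a two-unit hidden layer with ReLU that reproduces XOR exactly. The weights are set by hand as an assumption for illustration; they are not learned and are not taken from the course.

```python
# A hand-built two-layer network that computes XOR.
# The weights are hand-picked for illustration, not learned.

def relu(z):
    return max(0.0, z)

def xor_mlp(x1, x2):
    h1 = relu(x1 + x2)        # hidden unit 1
    h2 = relu(x1 + x2 - 1.0)  # hidden unit 2
    return h1 - 2.0 * h2      # linear output layer

def xor_without_activation(x1, x2):
    # Same weights with ReLU removed: the two layers collapse
    # into the single linear function 2 - x1 - x2.
    h1 = x1 + x2
    h2 = x1 + x2 - 1.0
    return h1 - 2.0 * h2

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_outputs = [xor_mlp(a, b) for a, b in inputs]
linear_outputs = [xor_without_activation(a, b) for a, b in inputs]
```

With ReLU the outputs follow the XOR pattern. Without it the network is just a linear function of the inputs, which no choice of weights can fit to XOR.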

Code Implementation

The code is much the same as for softmax regression. The difference is the added hidden layer and activation function.

import torch
from torch import nn
from d2l import torch as d2l

# Load the Fashion-MNIST data
batch_size = 256
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)

# Build the model and initialize its parameters
num_input = 784
num_hid = 128
num_output = 10
net = nn.Sequential(nn.Flatten(), nn.Linear(num_input, num_hid), nn.ReLU(), nn.Linear(num_hid, num_output))

# Use nn.init to initialize the weight of every linear layer
def init_weight(m):
    if isinstance(m, nn.Linear):
        nn.init.normal_(m.weight, std=0.01)
        
net.apply(init_weight)

loss = nn.CrossEntropyLoss()

trainer = torch.optim.SGD(net.parameters(), lr = 0.1)

num_epochs = 10

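for epoch in range(num_epochs):
    num_samples = 0
    num_correct = 0
    for X, y in train_iter:
        y_hat = net(X)
        # nn.CrossEntropyLoss() defaults to reduction='mean', so l is already a scalar
        l = loss(y_hat, y)
        trainer.zero_grad()
        l.backward()
        trainer.step()
        # d2l.accuracy returns the number of correct predictions in the batch
        num_correct += d2l.accuracy(y_hat, y)
        num_samples += len(y)
    print(f'epoch {epoch + 1}, train acc {num_correct / num_samples:.3f}')
# Accuracy on the test set after training
print(f'test acc {d2l.evaluate_accuracy(net, test_iter):.3f}')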
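
As a quick sanity check on the model size, the sketch below counts the trainable parameters of the two linear layers by hand. It assumes each `nn.Linear` keeps its default bias term; `linear_param_count` is a hypothetical helper introduced here for illustration.

```python
# Count the trainable parameters of the 784 -> 128 -> 10 network by hand.
# linear_param_count is a hypothetical helper, not part of PyTorch or d2l.

def linear_param_count(in_features, out_features):
    """Weights (in * out) plus one bias per output unit."""
    return in_features * out_features + out_features

num_input, num_hid, num_output = 784, 128, 10
total_params = (linear_param_count(num_input, num_hid)
                + linear_param_count(num_hid, num_output))
print(total_params)
```

`nn.Flatten` and `nn.ReLU` have no parameters, so this total should agree with `sum(p.numel() for p in net.parameters())` on the network defined above.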