Implementing a simple fully connected neural network with plain NumPy, plain PyTorch, and an object-oriented approach

Latest recommended article published on 2023-12-24 00:20:42

@秋野

Category column: Classic Models. Article tags: pytorch, neural networks, deep learning

Article link: https://blog.csdn.net/BGMcat/article/details/120578177

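Overview: two datasets, several implementations, and a side-by-side comparison.

The classification task uses the handwritten digit dataset with mini-batch gradient descent. The fully connected network has 784 input neurons, 100 hidden neurons, and 10 output neurons. The loss is the cross-entropy cost function and the hidden activation is the sigmoid function.
The regression task uses a self-generated random dataset. The fully connected network has 1000 input neurons, 100 hidden neurons, and 10 output neurons. The loss is the squared error (summed over all outputs in the code below). The hidden layer uses ReLU, and the output layer uses the identity function y = x.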
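
As a quick sanity check on the two architectures described above, the sketch below counts their trainable parameters. The helper `mlp_param_count` is a hypothetical name introduced only for this illustration. It assumes the classification network has a bias per layer and the regression network has weights only, which is how the code in this article defines them.

```python
def mlp_param_count(layer_sizes, bias=True):
    """Count the trainable parameters of a fully connected network."""
    total = 0
    for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += fan_in * fan_out      # weight matrix between two layers
        if bias:
            total += fan_out           # one bias per output neuron
    return total

clf_params = mlp_param_count([784, 100, 10])               # classification net, with biases
reg_params = mlp_param_count([1000, 100, 10], bias=False)  # regression net, weights only
print(clf_params, reg_params)  # 79510 101000
```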

1. Regression task on a self-generated random dataset

NumPy implementation

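import numpy as np
import torch  # not needed here; used by the PyTorch version further down

x = np.random.randn(64, 1000)   # inputs drawn from a standard normal distribution
y = np.random.randn(10)         # one target vector, broadcast across all 64 samples
w1 = np.random.randn(1000, 100)
w2 = np.random.randn(100, 10)

lr = 0.000001
epochs = 500
for i in range(epochs):
    # forward pass
    h = x.dot(w1)
    h_relu = np.maximum(h, 0)   # ReLU activation
    y_pred = h_relu.dot(w2)

    # compute the loss (sum of squared errors)
    loss = np.square(y_pred - y).sum()
    print(i, loss)

    # backward pass
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.T.dot(grad_y_pred)  # gradient with respect to w2
    grad_h_relu = grad_y_pred.dot(w2.T)
    grad_h = grad_h_relu.copy()
    grad_h[h < 0] = 0                    # ReLU passes no gradient where h < 0
    grad_w1 = x.T.dot(grad_h)            # gradient with respect to w1

    # update the parameters
    w1 -= lr * grad_w1
    w2 -= lr * grad_w2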

Output:
...
494 2.309283325916261e-05
495 2.2157265382087743e-05
496 2.1259504455147672e-05
497 2.0398317193778056e-05
498 1.9572198422014806e-05
499 1.8779967141503274e-05
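
A hand-written backward pass is easy to get subtly wrong, so it can help to compare it against finite differences. The sketch below is an illustration rather than part of the original experiment: it uses a much smaller network (20 inputs, 10 hidden units, 5 outputs) so the check runs quickly, and the helpers `loss_fn`, `analytic_grads`, and `numeric_grad` are names made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 20))
y = rng.standard_normal(5)            # broadcast over the batch, as in the article
w1 = rng.standard_normal((20, 10))
w2 = rng.standard_normal((10, 5))

def loss_fn(w1, w2):
    h_relu = np.maximum(x.dot(w1), 0)
    return np.square(h_relu.dot(w2) - y).sum()

def analytic_grads(w1, w2):
    # same formulas as the manual backward pass above
    h = x.dot(w1)
    h_relu = np.maximum(h, 0)
    grad_y_pred = 2.0 * (h_relu.dot(w2) - y)
    grad_w2 = h_relu.T.dot(grad_y_pred)
    grad_h = grad_y_pred.dot(w2.T)
    grad_h[h < 0] = 0
    grad_w1 = x.T.dot(grad_h)
    return grad_w1, grad_w2

def numeric_grad(f, w, eps=1e-6):
    # central differences, perturbing one entry of w at a time (in place)
    grad = np.zeros_like(w)
    for idx in np.ndindex(*w.shape):
        old = w[idx]
        w[idx] = old + eps
        f_plus = f()
        w[idx] = old - eps
        f_minus = f()
        w[idx] = old
        grad[idx] = (f_plus - f_minus) / (2 * eps)
    return grad

grad_w1, grad_w2 = analytic_grads(w1, w2)
num_w1 = numeric_grad(lambda: loss_fn(w1, w2), w1)
num_w2 = numeric_grad(lambda: loss_fn(w1, w2), w2)
w1_ok = np.allclose(grad_w1, num_w1, rtol=1e-4, atol=1e-4)
w2_ok = np.allclose(grad_w2, num_w2, rtol=1e-4, atol=1e-4)
print(w1_ok, w2_ok)
```

If either flag is False, the analytic formula and the loss definition disagree somewhere.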

PyTorch implementation
PyTorch tensors deliberately offer an interface close to that of NumPy arrays, so many attributes and methods look similar.

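import torch

x = torch.randn(64, 1000)
y = torch.randn(10)   # one target vector, broadcast across all 64 samples
w1 = torch.randn(1000, 100, requires_grad=True)
w2 = torch.randn(100, 10, requires_grad=True)

lr = 0.000001
for i in range(500):
    # forward pass
    h = x.mm(w1)
    h_relu = h.clamp(min=0)  # ReLU activation
    y_pred = h_relu.mm(w2)

    # compute the loss
    loss = (y_pred - y).pow(2).sum()
    print(i, loss)
# Option 1: manual differentiation and update
# (to use this variant, create w1 and w2 without requires_grad=True)
#     # backward pass
#     grad_y_pred = 2.0 * (y_pred - y)
#     grad_w2 = h_relu.t().mm(grad_y_pred)  # gradient with respect to w2
#     grad_h_relu = grad_y_pred.mm(w2.t())
#     grad_h = grad_h_relu.clone()
#     grad_h[h<0] = 0
#     grad_w1 = x.t().mm(grad_h)  # gradient with respect to w1

#     # update the parameters
#     w1 -= lr * grad_w1
#     w2 -= lr * grad_w2
# Option 2: automatic differentiation and update
    # backward pass
    loss.backward()

    # update the parameters without recording the update in the autograd graph
    with torch.no_grad():
        w1 -= lr * w1.grad
        w2 -= lr * w2.grad
        w1.grad.zero_()
        w2.grad.zero_()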

Output:
...
492 tensor(1.7772e-05, grad_fn=<SumBackward0>)
493 tensor(1.7666e-05, grad_fn=<SumBackward0>)
494 tensor(1.7485e-05, grad_fn=<SumBackward0>)
495 tensor(1.7217e-05, grad_fn=<SumBackward0>)
496 tensor(1.7055e-05, grad_fn=<SumBackward0>)
497 tensor(1.6938e-05, grad_fn=<SumBackward0>)
498 tensor(1.6785e-05, grad_fn=<SumBackward0>)
499 tensor(1.6566e-05, grad_fn=<SumBackward0>)
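
The manual formulas and autograd should produce the same gradients. The following sketch checks this on a small network, under assumptions of my own: the sizes (20, 10, 5) and the use of float64 are chosen only to keep the comparison fast and numerically tight.

```python
import torch

torch.manual_seed(0)
x = torch.randn(8, 20, dtype=torch.float64)
y = torch.randn(5, dtype=torch.float64)
w1 = torch.randn(20, 10, dtype=torch.float64, requires_grad=True)
w2 = torch.randn(10, 5, dtype=torch.float64, requires_grad=True)

# forward pass and autograd backward pass
h = x.mm(w1)
h_relu = h.clamp(min=0)
y_pred = h_relu.mm(w2)
loss = (y_pred - y).pow(2).sum()
loss.backward()

# the same gradients written out by hand
with torch.no_grad():
    grad_y_pred = 2.0 * (y_pred - y)
    manual_w2 = h_relu.t().mm(grad_y_pred)
    grad_h = grad_y_pred.mm(w2.t())
    grad_h[h < 0] = 0
    manual_w1 = x.t().mm(grad_h)

w1_match = torch.allclose(w1.grad, manual_w1)
w2_match = torch.allclose(w2.grad, manual_w2)
print(w1_match, w2_match)
```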

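2. Classification task on the handwritten digit dataset (mini-batch gradient descent)

Refactoring with nn.Module
Compared with the previous version, the whole model and its training loop are wrapped in a class.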

import torch
import numpy as np

class Mnist_Logistic(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.weights1 = torch.nn.Parameter(torch.randn(784, 100))  # first-layer weights, standard normal
        self.weights2 = torch.nn.Parameter(torch.randn(100, 10))
        self.bias1 = torch.nn.Parameter(torch.randn(100))          # first-layer bias
        self.bias2 = torch.nn.Parameter(torch.randn(10))

    def forward(self, x):    # forward pass; nn.Module calls this method, so keep the name
        x = x @ self.weights1 + self.bias1   # @ is matrix multiplication
        x = torch.sigmoid(x)                 # activation function
        return x @ self.weights2 + self.bias2

    def fit(self, x, y, lr, epochs, bs):
        for i in range(epochs):
            start = 0
            end = bs
            while end <= x.shape[0]:   # a trailing partial batch is skipped
                xb = x[start:end]
                yb = y[start:end]
                start = end
                end += bs
                pred = self(xb)
                loss = torch.nn.functional.cross_entropy(pred, yb)  # classification task: cross-entropy loss
                loss.backward()
                with torch.no_grad():
                    for p in self.parameters():  # update every parameter
                        p -= p.grad * lr
                    self.zero_grad()
        return loss   # loss of the last mini-batch

# Shared setup; not repeated in the next section
lr = 0.0001   # learning rate
bs = 64       # mini-batch size
epoxhs = 30   # number of epochs
train_X, test_X, train_y, test_y = np.load('./mnist.npy', allow_pickle=True)
x_train = train_X.reshape(60000, 28*28).astype(np.float32)
x_test = test_X.reshape(10000, 28*28).astype(np.float32)
x_train, y_train, x_test, y_test = map(torch.tensor, (x_train, train_y, x_test, test_y))  # convert ndarray -> tensor
x, y = x_train, y_train

# instantiate and train the model
model = Mnist_Logistic()
model.fit(x, y, lr, epoxhs, bs)

Output:
tensor(5.3542, grad_fn=<NllLossBackward>)

Refactoring with nn.Linear
PyTorch's nn.Linear class builds a linear layer, which replaces manually defining and initializing self.weights and self.bias and computing xb @ self.weights + self.bias.

import torch
from torch import nn

class Mnist_Logistic(torch.nn.Module):
    def __init__(self, in_features, hid_features, out_features):
        super().__init__()
        self.layer1 = nn.Linear(in_features, hid_features)   # first linear layer
        self.layer2 = nn.Linear(hid_features, out_features)

    def forward(self, x):  # forward pass
        x = self.layer1(x)
        x = torch.sigmoid(x)
        return self.layer2(x)

    # identical to the previous version
    def fit(self, x, y, lr, epochs, bs):
        for i in range(epochs):
            start = 0
            end = bs
            while end <= x.shape[0]:
                xb = x[start:end]
                yb = y[start:end]
                start = end
                end += bs
                pred = self(xb)
                loss = torch.nn.functional.cross_entropy(pred, yb)  # classification task
                loss.backward()
                with torch.no_grad():
                    for p in self.parameters():
                        p -= p.grad * lr

                    self.zero_grad()
        return loss

model = Mnist_Logistic(784, 100, 10)
model.fit(x, y, lr, epoxhs, bs)

Output:
tensor(0.8863, grad_fn=<NllLossBackward>)
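
One plausible contributor to the gap between the two final losses (5.3542 with hand-made parameters, 0.8863 with nn.Linear) is the initialization scale, although the article does not investigate this. torch.randn gives weights with standard deviation 1, whereas nn.Linear by default draws them from a uniform distribution bounded by 1/sqrt(in_features). The sketch below compares the resulting pre-activations. The input `xb` is random stand-in data in [0, 1), not MNIST, and `saturated_fraction` is a helper made up for this example.

```python
import torch
from torch import nn

torch.manual_seed(0)
xb = torch.rand(64, 784)   # stand-in inputs in [0, 1); an assumption, not MNIST

# hand-made parameters, as in the nn.Module version
w_randn = torch.randn(784, 100)
b_randn = torch.randn(100)
pre_randn = xb @ w_randn + b_randn

# default nn.Linear initialization
layer = nn.Linear(784, 100)
with torch.no_grad():
    pre_linear = layer(xb)

def saturated_fraction(pre):
    """Share of sigmoid outputs that sit almost exactly at 0 or 1."""
    s = torch.sigmoid(pre)
    return ((s < 0.01) | (s > 0.99)).float().mean().item()

print(pre_randn.std().item(), saturated_fraction(pre_randn))
print(pre_linear.std().item(), saturated_fraction(pre_linear))
```

Saturated sigmoid units have gradients close to zero, so a network that starts out saturated tends to learn slowly at a small learning rate.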

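Refactoring with optim (complete version with cross-entropy, L2 regularization, early stopping, and test-set prediction)
Instead of updating the parameters by hand, we can call the optimizer's step method to perform the update.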
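
To make the equivalence concrete, the sketch below applies one manual update and one optimizer update to two copies of the same small layer and compares the results. The layer sizes, the learning rate of 0.1, and the random batch are assumptions chosen for illustration only.

```python
import torch
from torch import nn

torch.manual_seed(0)
lr = 0.1
model_a = nn.Linear(4, 3)
model_b = nn.Linear(4, 3)
model_b.load_state_dict(model_a.state_dict())   # identical starting weights

xb = torch.randn(8, 4)
yb = torch.randint(0, 3, (8,))
loss_func = nn.functional.cross_entropy

# manual update, as in the earlier fit methods
loss_func(model_a(xb), yb).backward()
with torch.no_grad():
    for p in model_a.parameters():
        p -= p.grad * lr
model_a.zero_grad()

# the same update through the optimizer
opt = torch.optim.SGD(model_b.parameters(), lr=lr)
loss_func(model_b(xb), yb).backward()
opt.step()
opt.zero_grad()

same_update = all(
    torch.allclose(pa, pb)
    for pa, pb in zip(model_a.parameters(), model_b.parameters())
)
print(same_update)
```

This holds for plain SGD without momentum or weight decay; other optimizers apply different update rules behind the same step call.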

import numpy as np
import torch
from torch import nn
from sklearn.metrics import accuracy_score

class Mnist_Logistic(torch.nn.Module):
    def __init__(self, in_features, hid_features, out_features):
        super().__init__()
        self.layer1 = nn.Linear(in_features, hid_features)    # first layer
        self.layer2 = nn.Linear(hid_features, out_features)   # second layer
        self.patience = 20  # patience: maximum number of consecutive validations without improvement

    def forward(self, x):  # forward pass
        x = self.layer1(x)
        x = torch.sigmoid(x)
        return self.layer2(x)

    def fit(self, x, y, lr, epochs, bs):
        opt = torch.optim.SGD(self.parameters(), lr=lr)
        x_val, y_val = x[0:500], y[0:500]      # validation set
        train_x, train_y = x[500:], y[500:]    # training set
        worse_times = 0   # consecutive validations without improvement
        # snapshot of the best parameters so far (copies, not live references)
        best_para = [p.detach().clone() for p in self.parameters()]
        best_step = [0, 0]        # [epoch, batch index] of the best validation loss
        min_loss = float('inf')   # smallest validation loss so far
        for i in range(epochs):
            start = 0
            end = bs
            while end <= train_x.shape[0]:
                xb = train_x[start:end]
                yb = train_y[start:end]
                start = end
                end += bs
                # predictions on the training batch
                pred = self(xb)
                # L2 term: the L2 norm of all parameters (not squared, no coefficient)
                sum_1 = sum((p ** 2).sum() for p in self.parameters())
                L2 = torch.pow(sum_1, 0.5)
                loss = torch.nn.functional.cross_entropy(pred, yb) + L2  # classification task, cross-entropy
                loss.backward()
                # update the parameters
                opt.step()
                opt.zero_grad()
                # validate once every 10 batches
                if end % (bs * 10) == 0:
                    with torch.no_grad():
                        pred_val = self(x_val)
                        loss_val = torch.nn.functional.cross_entropy(pred_val, y_val) + L2
                    # if the validation loss improved, store a new snapshot
                    if loss_val < min_loss:
                        best_para = [p.detach().clone() for p in self.parameters()]
                        worse_times = 0
                        best_step = [i, start // bs]
                        min_loss = loss_val
                    # otherwise keep the old snapshot and count one more failure
                    else:
                        worse_times += 1
                # patience exhausted: stop early
                if worse_times == self.patience:
                    return best_para, best_step, min_loss
        # return the best parameters (w, b), the best step, and the smallest loss
        return best_para, best_step, min_loss


lr = 0.0001    # learning rate
bs = 64        # mini-batch size
epochs = 100   # number of epochs
# load the data
train_X, test_X, train_y, test_y = np.load('./mnist.npy', allow_pickle=True)
x_train = train_X.reshape(60000, 28*28).astype(np.float32)
x_test = test_X.reshape(10000, 28*28).astype(np.float32)
# convert to tensors
x_train, y_train, x_test, y_test = map(torch.tensor, (x_train, train_y, x_test, test_y))
x, y = x_train, y_train
# instantiate and train the model
model = Mnist_Logistic(784, 100, 10)
best_para, best_step, min_loss = model.fit(x, y, lr, epochs, bs)

para_list = list(best_para)  # stored w and b
# predict on the test set; nn.Linear stores weights as (out_features, in_features), hence .T
w1, b1, w2, b2 = para_list[0], para_list[1], para_list[2], para_list[3]
pred_test_y = torch.softmax(torch.sigmoid(x_test @ w1.T + b1) @ w2.T + b2, dim=1)  # class probabilities

print(y_test)
pred_test_y = torch.argmax(pred_test_y, dim=1)  # predicted classes
print(pred_test_y)
print(accuracy_score(y_test, pred_test_y))  # accuracy

Output:
tensor([7, 2, 1, …, 4, 5, 6]) #test
tensor([7, 2, 1, …, 4, 5, 6]) #predict
0.9124 #accuracy
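
The prediction step can also be written without unpacking the weights by hand. The sketch below is one possible variant, not the author's code: `accuracy` is a hypothetical helper, and the model and data are random stand-ins so the snippet runs on its own. It also illustrates that softmax does not change which class has the largest score, so argmax can be applied to the raw outputs.

```python
import torch
from torch import nn

def accuracy(model, x, y):
    """Fraction of samples whose highest-scoring class equals the label."""
    with torch.no_grad():              # no gradients needed for evaluation
        logits = model(x)
    return (logits.argmax(dim=1) == y).float().mean().item()

torch.manual_seed(0)
# untrained stand-in model and random stand-in data (assumptions for the demo)
demo_model = nn.Sequential(nn.Linear(784, 100), nn.Sigmoid(), nn.Linear(100, 10))
demo_x = torch.rand(32, 784)
demo_y = torch.randint(0, 10, (32,))

with torch.no_grad():
    logits = demo_model(demo_x)
same_classes = torch.equal(
    logits.argmax(dim=1),
    torch.softmax(logits, dim=1).argmax(dim=1),
)
acc = accuracy(demo_model, demo_x, demo_y)
print(same_classes, acc)
```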

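Refactoring with Dataset
PyTorch has an abstract Dataset class. A dataset can be anything that has a __len__ function (called by Python's standard len function) and a __getitem__ function as a way of indexing into it. PyTorch's TensorDataset is a Dataset that wraps tensors. By defining a length and a way of indexing, it also lets us iterate, index, and slice along the first dimension of the tensors. This makes it easier to access the independent and dependent variables on the same line during training.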

from torch.utils.data import TensorDataset
train_ds = TensorDataset(x_train, y_train)  # wrap x and y in one TensorDataset so both are sliced together
xb, yb = train_ds[i*bs : i*bs+bs]           # i is the index of the current mini-batch
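
A small self-contained illustration of that slicing behavior, using made-up tensors (`x_demo`, `y_demo`) in place of the MNIST data:

```python
import torch
from torch.utils.data import TensorDataset

x_demo = torch.arange(20, dtype=torch.float32).reshape(10, 2)  # 10 samples, 2 features
y_demo = torch.arange(10)                                      # 10 labels
demo_ds = TensorDataset(x_demo, y_demo)

bs = 4
i = 1                                   # second mini-batch
xb, yb = demo_ds[i * bs : i * bs + bs]  # slices both tensors with the same index range
print(len(demo_ds), xb.shape, yb)
```

Indexing with a single integer instead of a slice returns one (sample, label) pair.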

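Refactoring with DataLoader
PyTorch's DataLoader is responsible for managing batches. You can create a DataLoader from any Dataset. A DataLoader makes it easier to iterate over batches: instead of slicing with train_ds[i*bs : i*bs+bs], the DataLoader hands us each mini-batch automatically.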

from torch.utils.data import DataLoader, TensorDataset

train_ds = TensorDataset(x_train, y_train)       # combine x and y
train_dl = DataLoader(train_ds, batch_size=bs)   # iterates over mini-batches automatically

for xb, yb in train_dl:
    pred = model(xb)
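
One difference from the hand-written while loop is worth noting: by default a DataLoader also yields the final, smaller batch, whereas the earlier loop stops as soon as a full batch no longer fits. The sketch below shows this on 100 random stand-in samples (an assumption for the demo) and also shows the shuffle option.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

demo_ds = TensorDataset(torch.randn(100, 784), torch.randint(0, 10, (100,)))

# default: the last batch holds the 36 leftover samples
sizes_default = [len(yb) for _, yb in DataLoader(demo_ds, batch_size=64)]
# drop_last=True mimics the while loop, which skips an incomplete batch
sizes_drop = [len(yb) for _, yb in DataLoader(demo_ds, batch_size=64, drop_last=True)]
# shuffle=True reorders the samples at the start of every epoch
shuffled_dl = DataLoader(demo_ds, batch_size=64, shuffle=True)
sizes_shuffled = [len(yb) for _, yb in shuffled_dl]

print(sizes_default, sizes_drop, sizes_shuffled)
```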

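Final simplified version (no regularization term, no early stopping):

loss_func = torch.nn.functional.cross_entropy   # classification loss
opt = torch.optim.SGD(model.parameters(), lr=lr)
for epoch in range(epochs):   # epochs: number of passes over the training set
    for xb, yb in train_dl:
        pred = model(xb)
        loss = loss_func(pred, yb)

        loss.backward()
        opt.step()
        opt.zero_grad()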
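
For readers without the mnist.npy file, the loop can be tried end to end on synthetic data. Everything about the data below is an assumption made for this sketch: 512 random samples, labels produced by a random linear rule, a learning rate of 0.1, and 20 epochs. The model uses nn.Sequential as a compact equivalent of the two-layer class above.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
# synthetic stand-in for MNIST: 512 samples, 784 features, 10 classes
x_demo = torch.randn(512, 784)
true_w = torch.randn(784, 10)
y_demo = (x_demo @ true_w).argmax(dim=1)   # labels that a network can learn

model = nn.Sequential(nn.Linear(784, 100), nn.Sigmoid(), nn.Linear(100, 10))
loss_func = nn.functional.cross_entropy
opt = torch.optim.SGD(model.parameters(), lr=0.1)
train_dl = DataLoader(TensorDataset(x_demo, y_demo), batch_size=64, shuffle=True)

with torch.no_grad():
    loss_before = loss_func(model(x_demo), y_demo).item()

for epoch in range(20):
    for xb, yb in train_dl:
        pred = model(xb)
        loss = loss_func(pred, yb)

        loss.backward()
        opt.step()
        opt.zero_grad()

with torch.no_grad():
    loss_after = loss_func(model(x_demo), y_demo).item()
print(loss_before, loss_after)
```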

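As the steps above show, each refactoring simplifies the previous one. In practice, one therefore usually builds the model from nn.Linear layers and lets the optimizer's step method update the parameters. To improve the model further, one can add regularization terms (L1, L2), early stopping, dropout, data augmentation, and so on.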
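
Two of these options are available as built-in PyTorch features, sketched below under assumed hyperparameters (dropout probability 0.5, weight_decay 1e-4). Note that weight_decay in torch.optim.SGD adds weight_decay times each parameter to its gradient, which corresponds to a squared L2 penalty and so differs from the unsquared norm used in the optim section of this article.

```python
import torch
from torch import nn

torch.manual_seed(0)
model = nn.Sequential(
    nn.Linear(784, 100),
    nn.Sigmoid(),
    nn.Dropout(p=0.5),     # randomly zeroes hidden activations during training
    nn.Linear(100, 10),
)
# weight_decay applies L2-style shrinkage inside the optimizer step
opt = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=1e-4)

xb = torch.rand(16, 784)   # random stand-in batch

model.train()              # dropout active: two passes use different masks
out_train_1, out_train_2 = model(xb), model(xb)

model.eval()               # dropout disabled: output is deterministic
with torch.no_grad():
    out_eval_1, out_eval_2 = model(xb), model(xb)

print(torch.equal(out_train_1, out_train_2), torch.equal(out_eval_1, out_eval_2))
```

Calling model.eval() before validation or testing matters once dropout is part of the model, since otherwise predictions stay random.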
