YZzzz...

Tags: python, deep learning, machine learning

Article link: https://blog.csdn.net/weixin_50713798/article/details/125600834

Liu Er PyTorch Study Notes

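1. Prerequisites
- 1.1 Using pyplot
2. Linear model
3. Gradient descent (divide and conquer)
4. Backpropagation
5. Linear regression with PyTorch
6. Logistic regression
7. Handling multi-dimensional feature input
8. Loading datasets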

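1. Prerequisites

1.1 Using pyplot

Prerequisite: basic usage of pyplot.

2. Linear model

Fit y = w * x by exhaustive search: try 41 values of w from 0.0 to 4.0 in steps of 0.1. The loss reaches its minimum, 0, at w = 2.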

import numpy as np
import matplotlib.pyplot as plt

# Training data generated from y = 2 * x
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

def forward(x):  # prediction y' for the current candidate w (the global set by the loop below)
    return x * w

def loss(x, y):  # squared error between y and the prediction y'
    y_pred = forward(x)
    return (y - y_pred) ** 2

# Exhaustive search
w_list = []
mse_list = []
for w in np.arange(0.0, 4.1, 0.1):  # candidate values of w, spaced 0.1 apart
    print("w=", w)
    l_sum = 0
    for x_val, y_val in zip(x_data, y_data):
        y_pred_val = forward(x_val)
        loss_val = loss(x_val, y_val)
        l_sum += loss_val
        print('\t', x_val, y_val, y_pred_val, loss_val)
    print('MSE=', l_sum / 3)
    w_list.append(w)
    mse_list.append(l_sum / 3)
plt.plot(w_list, mse_list)
plt.ylabel('Loss')
plt.xlabel('w')
plt.show()

For the homework, the model y = x * w + b has two parameters, so the loss should be drawn as a 3D surface over w and b.

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D  # registers the 3D projection on older Matplotlib versions

# Training data generated from y = x * 2.5 - 1
x_data = [1.0, 2.0, 3.0]
y_data = [1.5, 4.0, 6.5]
W, B = np.arange(0.0, 4.1, 0.1), np.arange(-2.0, 2.1, 0.1)  # search ranges for w and b
w, b = np.meshgrid(W, B, indexing='ij')  # coordinate matrices: w[i, j] = W[i], b[i, j] = B[j]


def forward(x):
    return x * w + b  # returns a matrix: one prediction per (w, b) grid point


def loss(y_pred, y):
    return (y_pred - y) * (y_pred - y)


# Make data.
mse_lst = []
l_sum = 0.
for x_val, y_val in zip(x_data, y_data):
    y_pred_val = forward(x_val)
    loss_val = loss(y_pred_val, y_val)
    l_sum += loss_val
mse_lst.append(l_sum / 3)

# Create the figure
fig = plt.figure(figsize=(10, 10), dpi=300)
# Add 3D axes; Axes3D(fig) no longer attaches itself to the figure in recent Matplotlib releases
ax = fig.add_subplot(projection='3d')
# Draw the surface; rstride: step between rows, cstride: step between columns
surf = ax.plot_surface(w, b, np.array(mse_lst[0]), rstride=1, cstride=1, cmap=cm.coolwarm, linewidth=0,
                       antialiased=False)
# Customize the z axis.
ax.set_zlim(0, 40)
# Set the axis labels
ax.set_xlabel("w")
ax.set_ylabel("b")
ax.set_zlabel("loss")
ax.text(0.2, 2, 43, "Cost Value", color='black')
# Add a color bar which maps values to colors.
fig.colorbar(surf, shrink=0.5, aspect=5)
plt.show()

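The result basically matches the reference answer.
What I learned:
numpy.meshgrid(*xi, copy=True, sparse=False, indexing='xy'):

This function builds the coordinate matrices; here its only job is to generate enough (w, b) grid points to locate the best w and b.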
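To make the role of meshgrid concrete, here is a minimal sketch. The small arrays W and B are made-up values chosen only so the shapes are easy to read, and the second half repeats the grid search from the 3D example with np.linspace in place of np.arange (an assumption made to get clean grid values).

```python
import numpy as np

# Hypothetical small grids, chosen only to make the shapes easy to read.
W = np.array([0.0, 1.0, 2.0])   # 3 candidate values for w
B = np.array([10.0, 20.0])      # 2 candidate values for b

w_ij, b_ij = np.meshgrid(W, B, indexing='ij')  # shape (len(W), len(B))
w_xy, b_xy = np.meshgrid(W, B, indexing='xy')  # shape (len(B), len(W))
print(w_ij.shape, w_xy.shape)

# Grid search for data generated from y = 2.5 * x - 1.
x_data = [1.0, 2.0, 3.0]
y_data = [1.5, 4.0, 6.5]
W_grid = np.linspace(0.0, 4.0, 41)
B_grid = np.linspace(-2.0, 2.0, 41)
w, b = np.meshgrid(W_grid, B_grid, indexing='ij')

# Broadcasting evaluates the MSE at every (w, b) grid point at once.
mse = sum((x * w + b - y) ** 2 for x, y in zip(x_data, y_data)) / len(x_data)
i, j = np.unravel_index(np.argmin(mse), mse.shape)
best_w, best_b = W_grid[i], B_grid[j]
print(best_w, best_b)
```

With indexing='ij' the first axis follows W and the second follows B, which is why the row and column of the smallest MSE map directly back to W_grid[i] and B_grid[j].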

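3. Gradient descent (divide and conquer)

Gradient descent is only guaranteed to reach the global minimum for convex functions.
Keep the learning rate α small so that w never changes too much in a single update and keeps moving in the direction of decreasing cost. (Loss functions in deep learning are generally non-convex, but in practice bad local minima are rare; the thing to guard against is saddle points, where the gradient is zero and updates stall.)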
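The effect of the learning rate can be checked with a minimal sketch on the y = 2 * x data used in these notes. The helper name train and the learning rate 0.3 are assumptions made for illustration; 0.01 is the value used in the notes.

```python
def train(lr, epochs, w=1.0):
    """Run gradient descent on y = w * x with an MSE cost and return the final w."""
    xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
    for _ in range(epochs):
        # Mean gradient of (x * w - y) ** 2 with respect to w.
        grad = sum(2 * x * (x * w - y) for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w

w_small = train(lr=0.01, epochs=100)  # small step: w approaches the optimum 2
w_large = train(lr=0.3, epochs=20)    # step too large: each update overshoots further
print(w_small, w_large)
```

With the small learning rate the distance to w = 2 shrinks at every update, while with the large one it grows, so the cost diverges instead of decreasing.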

The gradient descent code below implements the update rule w = w - α * ∂cost/∂w:

import matplotlib.pyplot as plt


x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

w = 1.0  # initial weight

def forward(x):  # prediction
    return x * w

def cost(xs, ys):  # mean squared error over all (x, y) pairs
    total = 0
    for x, y in zip(xs, ys):
        y_pred = forward(x)
        total += (y_pred - y) ** 2
    return total / len(xs)


def gradient(xs, ys):  # mean gradient of the cost with respect to w
    grad = 0
    for x, y in zip(xs, ys):
        grad += 2 * x * (x * w - y)
    return grad / len(xs)

e_list = []
cost_list = []
for e in range(100):
    cost_val = cost(x_data, y_data)
    grad_val = gradient(x_data, y_data)
    w -= 0.01 * grad_val  # learning rate 0.01
    cost_list.append(cost_val)
    # print(e)
    e_list.append(e)
plt.plot(e_list, cost_list)
plt.xlabel('e')
plt.ylabel('cost')
plt.show()

The cost falls steadily as the number of epochs grows, which means w is converging toward its optimal value.
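The expression inside gradient() follows from differentiating the mean squared cost. The derivation below is reconstructed from the code rather than taken from the original figure; N is the number of samples and α the learning rate (0.01 in the code).

```latex
\begin{aligned}
cost(w) &= \frac{1}{N}\sum_{n=1}^{N}\left(x_n w - y_n\right)^2 \\
\frac{\partial\, cost}{\partial w} &= \frac{1}{N}\sum_{n=1}^{N} 2\,x_n\left(x_n w - y_n\right) \\
w &\leftarrow w - \alpha\,\frac{\partial\, cost}{\partial w}
\end{aligned}
```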

4. Backpropagation

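import torch
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

w = torch.Tensor([1.0])  # the weight, stored as a one-element float tensor
# print(w)
w.requires_grad = True  # track gradients with respect to w

def forward(x):
    return x * w
def loss(x, y):  # loss function
    y_pred = forward(x)
    return (y_pred - y) ** 2

for e in range(100):
    for x, y in zip(x_data, y_data):
        l = loss(x, y)
        l.backward()  # compute gradients for every tensor with requires_grad=True; here that is dl/dw
        print("\t grad:", x, y, w.grad.item())
        w.data = w.data - 0.01 * w.grad.data  # use the gradient to update the weight; going through .data keeps the update out of the graph
        w.grad.data.zero_()  # reset the gradient to 0, otherwise the next backward() adds to it

    print("process:", e, l.item())

print("predict (after training)", 4, forward(4).item())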

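Syntax notes are in the comments above. The key idea is that backward() stores the gradient in w.grad, where it is then read to update the weight.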
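A minimal sketch can confirm that the stored gradient matches the hand-derived one and that it accumulates when it is not zeroed. The sample point x = 2, y = 4 is an arbitrary choice, and torch.tensor(..., requires_grad=True) is used here as a shorthand for the two-step setup above.

```python
import torch

w = torch.tensor([1.0], requires_grad=True)
x, y = 2.0, 4.0

l = (x * w - y) ** 2   # forward pass builds the computation graph
l.backward()           # backward pass stores dl/dw in w.grad
first_grad = w.grad.item()

manual_grad = 2 * x * (x * w.item() - y)  # analytic derivative 2x(xw - y)
print(first_grad, manual_grad)

# Without zeroing, a second backward pass adds to the stored gradient.
l2 = (x * w - y) ** 2
l2.backward()
accumulated = w.grad.item()
print(accumulated)
```

The accumulated value is twice the first gradient, which is why the training loop resets w.grad after every update.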

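5. Linear regression with PyTorch

PyTorch's built-in tools give a cleaner implementation of the linear model (advantage: it is easier to extend than the earlier hand-written version).
The four stages are:
1. Prepare the dataset
2. Design the model using a class
3. Construct the loss and optimizer
4. Training cycle (forward, backward, update)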

import torch

# prepare dataset
# x and y are 3x1 matrices: 3 samples in total, each with 1 feature
x_data = torch.tensor([[1.0], [2.0], [3.0]])
y_data = torch.tensor([[2.0], [4.0], [6.0]])

# design model using class
"""
Our model class should inherit from nn.Module, which is the base class for all neural network modules.
The member methods __init__() and forward() have to be implemented.
The class nn.Linear contains two member tensors: weight and bias.
nn.Module implements the magic method __call__(), which lets an instance of the class
be called like a function; calling it normally invokes forward().
"""


class LinearModel(torch.nn.Module):
    def __init__(self):
        super(LinearModel, self).__init__()
        # (1, 1) gives the feature dimensions of input x and output y; both are 1-dimensional in this dataset
        # The parameters this layer learns are w and b, accessible as linear.weight and linear.bias
        self.linear = torch.nn.Linear(1, 1)  # creates and initializes w and b

    def forward(self, x):
        y_pred = self.linear(x)  # compute the prediction
        return y_pred


model = LinearModel()

# construct loss and optimizer
# size_average=False is deprecated; reduction='sum' is the equivalent setting
criterion = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # model.parameters() hands all learnable parameters (w and b) to the optimizer

# training cycle forward, backward, update
for epoch in range(20):  # number of epochs; more epochs bring w and b closer to the optimum
    y_pred = model(x_data)  # forward: predict
    loss = criterion(y_pred, y_data)  # forward: loss
    print(epoch, loss.item())

    optimizer.zero_grad()  # gradients computed by .backward() accumulate, so zero them before each backward pass
    loss.backward()  # backward: autograd computes the gradients
    optimizer.step()  # update: adjust w and b

print('w = ', model.linear.weight.item())
print('b = ', model.linear.bias.item())

x_test = torch.tensor([[4.0]])
y_test = model(x_test)
print('y_pred = ', y_test.data)  # approaches the true value 8 once training has converged

See the comments in the code above.
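The reduction argument of the loss decides whether the per-element squared errors are summed or averaged. A minimal sketch with made-up predictions (not outputs of the trained model) shows the difference:

```python
import torch

# Made-up predictions and targets; the squared errors are 1, 4 and 9.
y_pred = torch.tensor([[1.0], [2.0], [3.0]])
y_true = torch.tensor([[2.0], [4.0], [6.0]])

loss_sum = torch.nn.MSELoss(reduction='sum')(y_pred, y_true)    # adds the squared errors
loss_mean = torch.nn.MSELoss(reduction='mean')(y_pred, y_true)  # divides the sum by the 3 elements
print(loss_sum.item(), loss_mean.item())
```

Because the two settings differ by a constant factor, they lead to the same optimum, but the summed loss produces larger gradients, so the effective step size for a given lr is larger.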

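6. Logistic regression

The loss function changes: logistic regression is trained with binary cross-entropy (BCE) instead of MSE.
Prerequisite: BCELoss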

class torch.nn.BCELoss(weight: Optional[torch.Tensor] = None, size_average=None, reduce=None, reduction: str = 'mean')
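As a rough illustration of what this loss computes, the sketch below writes out the binary cross-entropy formula in plain Python with mean reduction, the default of torch.nn.BCELoss. The function name bce_mean and the sample probabilities are assumptions made for this example.

```python
import math

def bce_mean(y_pred, y_true):
    """Mean binary cross-entropy: -(y*log(p) + (1-y)*log(1-p)) averaged over samples."""
    total = 0.0
    for p, y in zip(y_pred, y_true):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_pred)

# Labels are 1 and 0; the first call predicts them well, the second predicts the opposite.
confident_right = bce_mean([0.9, 0.1], [1, 0])
confident_wrong = bce_mean([0.1, 0.9], [1, 0])
print(confident_right, confident_wrong)
```

The loss stays small when the predicted probability agrees with the label and grows quickly when a confident prediction is wrong.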


import torch

# import torch.nn.functional as F

# prepare dataset
x_data = torch.Tensor([[1.0], [2.0], [3.0]])
y_data = torch.Tensor([[0], [0], [1]])  # class labels (0 or 1)


# design model using class
class LogisticRegressionModel(torch.nn.Module):
    def __init__(self):
        super(LogisticRegressionModel, self).__init__()
        self.linear = torch.nn.Linear(1, 1)

    def forward(self, x):
        # y_pred = F.sigmoid(self.linear(x))
        y_pred = torch.sigmoid(self.linear(x))  # maps the linear output monotonically into the range (0, 1)
        return y_pred


model = LogisticRegressionModel()

# construct loss and optimizer
# By default the loss is averaged over elements; reduction='sum' adds it up instead
# (size_average=False is the deprecated spelling of the same setting).
criterion = torch.nn.BCELoss(reduction='sum')  # create the BCELoss object used to compute the loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# training cycle forward, backward, update
for epoch in range(10000):
    y_pred = model(x_data)
    loss = criterion(y_pred, y_data)
    print(epoch, loss.item())

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print('w = ', model.linear.weight.item())
print('b = ', model.linear.bias.item())

x_test = torch.Tensor([[4.0]])
y_test = model(x_test)
print('y_pred = ', y_test.data)

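See the code comments above.

7. Handling multi-dimensional feature input

The network has 3 layers: the first is a nonlinear transformation from 8 dimensions to 6, the second from 6 to 4, and the third from 4 to 1.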
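The dimension changes described above can be traced with a minimal sketch. It uses a random stand-in batch of 5 samples rather than the real dataset, and torch.nn.Sequential instead of a custom class, purely to keep the example short.

```python
import torch

# Random stand-in batch: 5 samples with 8 features each (not the real diabetes data).
x = torch.randn(5, 8)

layers = torch.nn.Sequential(
    torch.nn.Linear(8, 6), torch.nn.Sigmoid(),
    torch.nn.Linear(6, 4), torch.nn.Sigmoid(),
    torch.nn.Linear(4, 1), torch.nn.Sigmoid(),
)

shapes = [tuple(x.shape)]
out = x
for layer in layers:
    out = layer(out)
    if isinstance(layer, torch.nn.Linear):
        shapes.append(tuple(out.shape))  # record the shape after each linear layer
print(shapes)
```

Only the last dimension changes from layer to layer; the batch dimension passes through untouched, which is why the same model works for any number of samples.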

import numpy as np
import torch
import matplotlib.pyplot as plt

# prepare dataset
xy = np.loadtxt('diabetes.csv', delimiter=',', dtype=np.float32)
x_data = torch.from_numpy(xy[:, :-1])  # the first ':' selects all rows; ':-1' selects every column except the last
y_data = torch.from_numpy(xy[:, [-1]])  # indexing with [-1] keeps the result a matrix (one column) rather than a vector


# design model using class


class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.linear1 = torch.nn.Linear(8, 6)  # each input sample x has 8 features
        self.linear2 = torch.nn.Linear(6, 4)
        self.linear3 = torch.nn.Linear(4, 1)
        self.sigmoid = torch.nn.Sigmoid()

    def forward(self, x):
        x = self.sigmoid(self.linear1(x))
        x = self.sigmoid(self.linear2(x))
        x = self.sigmoid(self.linear3(x))  # y hat
        return x


model = Model()

criterion = torch.nn.BCELoss(reduction='mean')
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

epoch_list = []
loss_list = []
# training cycle forward, backward, update
for epoch in range(10000):
    y_pred = model(x_data)
    loss = criterion(y_pred, y_data)
    print(epoch, loss.item())
    epoch_list.append(epoch)
    loss_list.append(loss.item())

    optimizer.zero_grad()
    loss.backward()

    optimizer.step()

plt.plot(epoch_list, loss_list)
plt.ylabel('loss')
plt.xlabel('epoch')
plt.show()

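The code is largely unchanged from the previous sections.

8. Loading datasets

[Image: code listing for this section]
This section adds DataLoader and Dataset.
Dataset: a custom dataset class must inherit from Dataset and implement two methods:
__len__, which lets the built-in len() return the number of items;
__getitem__, which lets items be fetched by index, e.g. dataset[i] returns the i-th item.
DataLoader: in PyTorch, torch.utils.data.DataLoader provides all of the following:
1. Batching the data
2. Shuffling the data
3. Loading data in parallel using multiprocessing worker processes
Its parameters mean:
1. dataset: an instance of the dataset defined beforehand;
2. batch_size: the number of samples per batch, commonly 32, 64, 128, or 256;
3. shuffle: bool; whether to reshuffle the data at every epoch;
4. num_workers: the number of worker subprocesses used to load data;
5. drop_last: bool; if True, the last batch is dropped when it has fewer samples than batch_size.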
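The two required methods and the effect of drop_last can be sketched as follows. ToyDataset and its synthetic y = 2 * x data are hypothetical stand-ins for a dataset loaded from a file, and num_workers=0 (load in the main process) is chosen so the sketch runs anywhere.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ToyDataset(Dataset):  # hypothetical stand-in for a dataset loaded from a file
    def __init__(self, n_samples=10):
        self.x = torch.arange(n_samples, dtype=torch.float32).reshape(-1, 1)
        self.y = 2 * self.x

    def __getitem__(self, index):  # enables dataset[i]
        return self.x[index], self.y[index]

    def __len__(self):  # enables len(dataset)
        return self.x.shape[0]


dataset = ToyDataset()
loader_keep = DataLoader(dataset, batch_size=4, shuffle=True, num_workers=0)
loader_drop = DataLoader(dataset, batch_size=4, shuffle=True, num_workers=0, drop_last=True)

# 10 samples with batch_size=4 leave a final, smaller batch of 2.
batch_sizes_keep = [len(xb) for xb, yb in loader_keep]
batch_sizes_drop = [len(xb) for xb, yb in loader_drop]
print(batch_sizes_keep, batch_sizes_drop)
```

In a training loop, the inner loop would iterate over the loader and run the forward, backward, and update steps once per mini-batch instead of once per epoch.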