Natural Language Processing (1): PyTorch Basics

GeniusAng丶

Article link: https://blog.csdn.net/weixin_45707277/article/details/120556104

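I. Installing PyTorch

1. Introduction to PyTorch

PyTorch is a deep learning framework released by Facebook. Its ease of use and approachable design have made it popular with a wide range of users.

2. PyTorch versions

For details, see the official website.

The installation command for your platform is shown on the official website.

After installing, open IPython: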

In [1]: import torch
In [2]: torch.__version__
Out[2]: '1.9.0+cu111'

Note: in code, the package name is always torch.

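II. Tensors

1. Tensor

"Tensor" is an umbrella term that covers several kinds of objects:

0th-order tensor: a scalar or constant, 0-D Tensor
1st-order tensor: a vector, 1-D Tensor
2nd-order tensor: a matrix, 2-D Tensor
3rd-order tensor
…
Nth-order tensor

2. Creating tensors in PyTorch

1) Creating a tensor from existing data

From a list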

torch.tensor([[1., -1.], [1., -1.]])

tensor([[ 1., -1.],
        [ 1., -1.]])

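From a NumPy array (the tensor inherits the array's dtype; the int32 shown below is platform-dependent)

import numpy as np
a = np.array([[1,2,3],[4,5,6]])
torch.tensor(a)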

tensor([[1, 2, 3],
        [4, 5, 6]], dtype=torch.int32)
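
One detail worth knowing when converting from NumPy: torch.tensor copies the data, whereas torch.from_numpy (a function not used in the text above) returns a tensor that shares memory with the array. A minimal sketch of the difference:

```python
import numpy as np
import torch

a = np.array([[1, 2, 3], [4, 5, 6]])

copied = torch.tensor(a)      # copies the data into a new tensor
shared = torch.from_numpy(a)  # shares memory with the NumPy array

a[0, 0] = 100                 # modify the NumPy array in place

print(copied[0, 0].item())    # 1: the copy is unaffected
print(shared[0, 0].item())    # 100: the shared tensor sees the change
```

Copying is the safer default; sharing avoids the copy when the array is large and will not be modified elsewhere.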

2) Creating tensors filled with a fixed value

torch.ones(3,4) and torch.zeros(3,4) create a tensor with 3 rows and 4 columns filled with ones or zeros

torch.ones(3,4)

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

torch.ones_like(tensor) and torch.zeros_like(tensor) create a tensor of ones or zeros with the same shape and dtype as tensor

tensor = torch.tensor([[1,-1],[1,-1]])
torch.ones_like(tensor)

tensor([[1, 1],
        [1, 1]])

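torch.empty(3,4) creates a tensor with 3 rows and 4 columns without initializing it, so its contents are arbitrary leftover values (the zeros shown below are incidental); fill it explicitly with tensor.fill_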

empty = torch.empty([3,4])
empty

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

empty.fill_(3)

tensor([[3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.]])

3) Creating sequence tensors over a range

torch.arange(start, end, step) generates a sequence from start up to, but not including, end, in increments of step

torch.arange(1,6,2)

tensor([1, 3, 5])

torch.range(1,6)

tensor([1., 2., 3., 4., 5., 6.])

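The result of torch.range(start=1, end=6) includes end,
whereas the result of torch.arange(start=1, end=6) does not.
The two also produce tensors of different dtypes:
arange gives torch.int64 (for integer arguments)
range gives torch.float32

Prefer arange: torch.range is deprecated and will be removed in a future PyTorch release

torch.linspace(start, end, steps) generates steps evenly spaced numbers from start to end, inclusive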

torch.linspace(1,50,5)

tensor([ 1.0000, 13.2500, 25.5000, 37.7500, 50.0000])

torch.logspace(start, end, steps, base=10) generates steps numbers in geometric progression from $base^{start}$ to $base^{end}$

torch.logspace(1,3,10,base=10)

tensor([  10.0000,   16.6810,   27.8256,   46.4159,   77.4264,  129.1550,
         215.4435,  359.3814,  599.4843, 1000.0000])
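
The arithmetic behind linspace and logspace can be sketched in plain Python. The helper names below (linspace_values, logspace_values) are hypothetical and written only for this illustration; they are not PyTorch's implementation:

```python
def linspace_values(start, end, steps):
    """Return `steps` evenly spaced values from start to end, inclusive."""
    if steps == 1:
        return [float(start)]
    step = (end - start) / (steps - 1)  # spacing between neighbours
    return [start + i * step for i in range(steps)]


def logspace_values(start, end, steps, base=10.0):
    """Raise `base` to evenly spaced exponents between start and end."""
    return [base ** e for e in linspace_values(start, end, steps)]


lin = linspace_values(1, 50, 5)
log = logspace_values(1, 3, 10)

print(lin)                          # [1.0, 13.25, 25.5, 37.75, 50.0]
print([round(v, 4) for v in log])   # starts at 10, ends near 1000
```

In other words, logspace is linspace applied to the exponents, which is why its outputs are evenly spaced on a log scale.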

4) Creating random tensors (the most commonly used)

torch.rand(3,4) creates a tensor with 3 rows and 4 columns of random values drawn uniformly from [0, 1)

torch.rand(3,4)

tensor([[0.6929, 0.9682, 0.8529, 0.6874],
        [0.1254, 0.7655, 0.2407, 0.5344],
        [0.9862, 0.6067, 0.0258, 0.6338]])

torch.randint(low=0,high=10,size=[3,4]) creates a tensor with 3 rows and 4 columns of random integers drawn from [low, high)

torch.randint(0,10,[3,4])

tensor([[5, 8, 1, 6],
        [9, 9, 0, 6],
        [1, 0, 0, 7]])

torch.randn(3,4) creates a tensor with 3 rows and 4 columns of random numbers drawn from the standard normal distribution (mean 0, variance 1)

torch.randn(3,4)

tensor([[-1.2575,  0.3230,  0.4924, -0.1722],
        [-0.5662,  0.1221, -1.3786,  1.0081],
        [-0.2797, -1.0588,  0.6110,  0.2682]])

3. Tensor attributes in PyTorch

1) Getting the data out of a tensor

tensor.item() returns the value as a Python number when the tensor holds exactly one element

a = torch.tensor(np.arange(1))
a

tensor([0], dtype=torch.int32)

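b = a.item()
b

0

Tip: item() only works on a tensor with exactly one element; otherwise it fails with an error such as "only one element tensors can be converted to Python scalars".

Converting to a NumPy array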

a = torch.tensor([[-2.2],[7.3],[-1.5]])
a

tensor([[-2.2000],
        [ 7.3000],
        [-1.5000]])

b = a.numpy()
b

array([[-2.2],
       [ 7.3],
       [-1.5]], dtype=float32)

2) Getting the shape: tensor.size() or tensor.shape

a.size()

torch.Size([3, 1])

a.shape

torch.Size([3, 1])

3) Getting the data type: tensor.dtype

a.dtype

torch.float32

4) Getting the number of dimensions (the order): tensor.dim()

a.dim()

2

4. Modifying tensors

tensor.view(2,3) and tensor.reshape(2,3) change the shape

x = torch.tensor([[1,2],[3,4],[5,6]])
x

tensor([[1, 2],
        [3, 4],
        [5, 6]])

x.shape

torch.Size([3, 2])

x.reshape(2,3)

tensor([[1, 2, 3],
        [4, 5, 6]])

x.view(2,3)

tensor([[1, 2, 3],
        [4, 5, 6]])

tensor.t() or tensor.transpose(dim0, dim1): transpose

tensor([[1, 2],
        [3, 4],
        [5, 6]])

x.t()  # x.T also works

tensor([[1, 3, 5],
        [2, 4, 6]])

x.transpose(0,1)

tensor([[1, 3, 5],
        [2, 4, 6]])

tensor.permute reorders the axes of a tensor (a multi-axis transpose)
Goal: turn shape [4,2,3] into [3,4,2]. With transpose this takes two calls: [4,2,3] -> [4,3,2] -> [3,4,2]

x = torch.tensor([[[1., 2., 3.],
                   [4., 5., 6.]],

                  [[2., 2., 3.],
                   [4., 5., 6.]],

                  [[3., 2., 3.],
                   [4., 5., 6.]],

                  [[4., 2., 3.],
                   [4., 5., 6.]]])

x.shape

torch.Size([4, 2, 3])

y = x.transpose(1,2)
y.shape

torch.Size([4, 3, 2])

z = y.transpose(0,1)
z.shape

torch.Size([3, 4, 2])

With permute, going from [4,2,3] to [3,4,2] takes a single call

x.shape

torch.Size([4, 2, 3])

x.permute(2,0,1).shape

torch.Size([3, 4, 2])

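tensor.unsqueeze(dim) and tensor.squeeze() add or remove dimensions
tensor.squeeze() removes every dimension of size 1 by default
You can also pass a dimension index to remove only that dimension
tensor.unsqueeze(dim) inserts a dimension of size 1 at position dim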

a = torch.tensor([[[1],
                   [2],
                   [3]]])
a.size()

torch.Size([1, 3, 1])

b = torch.squeeze(a)
b

tensor([1, 2, 3])

b.size()

torch.Size([3])

c = b.unsqueeze(0)
c

tensor([[1, 2, 3]])

c.size()

torch.Size([1, 3])

d = b.unsqueeze(1)
d

tensor([[1],
        [2],
        [3]])

d.size()

torch.Size([3, 1])
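
The text above mentions that squeeze also accepts a dimension index but shows no example. A small sketch, assuming a tensor of shape [1, 3, 1] like the one used above:

```python
import torch

a = torch.zeros(1, 3, 1)

first = a.squeeze(0)        # remove dim 0 only
last = a.squeeze(2)         # remove dim 2 only
unchanged = a.squeeze(1)    # dim 1 has size 3, so nothing is removed
expanded = a.unsqueeze(-1)  # negative indices count from the end

print(first.shape)      # torch.Size([3, 1])
print(last.shape)       # torch.Size([1, 3])
print(unchanged.shape)  # torch.Size([1, 3, 1])
print(expanded.shape)   # torch.Size([1, 3, 1, 1])
```

Passing the index explicitly is usually safer than a bare squeeze(), which would also drop a batch dimension that happens to have size 1.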

Specifying or changing the dtype

Specify the dtype when creating the data

a = torch.ones(2,3,dtype = torch.int32)
a.dtype

torch.int32

Changing the dtype of an existing tensor

In [17]: a
Out[17]: tensor([1, 2], dtype=torch.int32)

In [18]: a.type(torch.float) #a.float()
Out[18]: tensor([1., 2.])

In [19]: a.double()
Out[19]: tensor([1., 2.], dtype=torch.float64)

Tensor slicing

In [101]: x
Out[101]:
tensor([[1.6437, 1.9439, 1.5393],
        [1.3491, 1.9575, 1.0552],
        [1.5106, 1.0123, 1.0961],
        [1.4382, 1.5939, 1.5012],
        [1.5267, 1.4858, 1.4007]])

In [102]: x[:,1]
Out[102]: tensor([1.9439, 1.9575, 1.0123, 1.5939, 1.4858])

Slice assignment

In [12]: x[:, 1]
Out[12]: tensor([1.9439, 1.9575, 1.0123, 1.5939, 1.4858])

In [13]: x[:, 1] = 1

In [14]: x[:, 1]
Out[14]: tensor([1., 1., 1., 1., 1.])

Note: a slice is generally not contiguous in memory

In [87]: a = torch.randn(2,3,4)

In [88]: a
Out[88]:
tensor([[[ 0.6204,  0.9294,  0.6449, -2.0183],
         [-1.1809,  0.4071, -1.0827,  1.7154],
         [ 0.0431,  0.6646,  2.0386,  0.0777]],

        [[ 0.0052, -0.1531, -0.7470, -0.8283],
         [-0.1547,  0.3123, -0.6279, -0.0132],
         [-0.0527, -1.2305,  0.7089, -0.4231]]])

In [89]: a[:,:1,:2]
Out[89]:
tensor([[[ 0.6204,  0.9294]],

        [[ 0.0052, -0.1531]]])

In [90]: a[:,:1,:2].view(1,4)	# view does not work here
---------------------------------------------------------------------------
RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

In [91]: a[:,:1,:2].reshape(1,4)
Out[91]: tensor([[ 0.6204,  0.9294,  0.0052, -0.1531]])
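
Why view fails here can be seen from how the elements are laid out in memory. The sketch below uses plain Python and two hypothetical helpers (contiguous_strides and flat_offsets, written only for this illustration) to model row-major strides for the a[:, :1, :2] slice above:

```python
def contiguous_strides(shape):
    """Strides (in elements) of a row-major, contiguous tensor of this shape."""
    strides = []
    step = 1
    for size in reversed(shape):
        strides.append(step)
        step *= size
    return tuple(reversed(strides))


def flat_offsets(shape, strides):
    """Memory offsets of every element, visited in row-major index order."""
    offsets = [0]
    for size, stride in zip(shape, strides):
        offsets = [base + i * stride for base in offsets for i in range(size)]
    return offsets


parent_strides = contiguous_strides((2, 3, 4))  # layout of the full tensor a
slice_shape = (2, 1, 2)                         # shape of a[:, :1, :2]

# The slice keeps the parent's strides, so its elements are scattered.
slice_offsets = flat_offsets(slice_shape, parent_strides)
# A freshly allocated tensor of the same shape would be densely packed.
packed_offsets = flat_offsets(slice_shape, contiguous_strides(slice_shape))

print(parent_strides)   # (12, 4, 1)
print(slice_offsets)    # [0, 1, 12, 13]
print(packed_offsets)   # [0, 1, 2, 3]
```

The slice's four elements sit at offsets 0, 1, 12 and 13 of the parent's storage rather than 0 to 3, so view cannot reinterpret them as a [1, 4] block without copying; reshape makes that copy when it has to.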

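5. CUDA Tensor

1. What is CUDA?

CUDA (Compute Unified Device Architecture): CUDA™ is a general-purpose parallel computing architecture introduced by NVIDIA that lets the GPU solve complex computational problems. (Without a framework such as CUDA, a GPU, also known as a graphics card, would be limited to graphics rendering.)

2. How do you let PyTorch use CUDA (that is, run deep learning computation on the GPU)?

The machine needs an NVIDIA GPU
The machine needs a compatible GPU driver installed
The machine needs a CUDA version compatible with that GPU installed
The Python environment needs the GPU build of PyTorch installed

3. How do you check whether PyTorch in the current environment can use CUDA?

The torch.cuda module adds support for CUDA tensors, so the same methods can be used to operate on tensors on both the CPU and the GPU
torch.cuda.is_available()

4. How do you convert a CPU tensor to a CUDA tensor?

The .to method moves a tensor to another device (for example from the CPU to the GPU)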

#device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
x = torch.rand(1)                          # example input (assumed; the original does not show how x was created)
if torch.cuda.is_available():
    device = torch.device("cuda")          # a CUDA device object
    y = torch.ones_like(x, device=device)  # create a tensor directly on the GPU
    x = x.to(device)                       # move x to the GPU with .to
    z = x + y
    print(z)
    print(z.to("cpu", torch.double))       # .to can also change the dtype at the same time
    
>>tensor([1.9806], device='cuda:0')
>>tensor([1.9806], dtype=torch.float64)
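
The snippet above only runs its body when a GPU is present. A common alternative, sketched here under the assumption that the same code should also run on CPU-only machines, is to pick the device once and use it everywhere:

```python
import torch

# Fall back to the CPU when CUDA is not available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.rand(1).to(device)           # move an existing tensor to the device
y = torch.ones_like(x, device=device)  # create a tensor directly on the device
z = x + y                              # both operands live on the same device

result = z.to("cpu", torch.double)     # move back to the CPU and change dtype

print(z.device.type)   # "cuda" or "cpu", depending on the machine
print(result.dtype)    # torch.float64
```

Operands of an operation must be on the same device, which is why x and y are both placed on device before the addition.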

6. Common mathematical operations on tensors

tensor.add tensor.sub tensor.abs tensor.mm

In [204]: a = torch.tensor([1,2,3])

In [205]: b = torch.tensor(1)

In [206]: a.add(b)
Out[206]: tensor([2, 3, 4])

In [207]: a.sub(b)
Out[207]: tensor([0, 1, 2])

In [212]: c = torch.randn(3,)

In [213]: c
Out[213]: tensor([ 0.5161, -0.1732,  1.0162])
In [214]: c.abs()
Out[214]: tensor([0.5161, 0.1732, 1.0162])

In [215]: c
Out[215]: tensor([ 0.5161, -0.1732,  1.0162])

In [254]: a = torch.randn([3,4])

In [255]: b = torch.randn([4, 5])

In [256]: a.mm(b)
Out[256]:
tensor([[ 0.6888,  0.4304, -0.5489,  0.3615, -1.1690],
        [ 1.0890, -1.0391, -0.3717, -0.4045,  3.4404],
        [ 0.9885,  0.1720, -0.2117, -0.1694, -0.5460]])

Note: element-wise math between tensors also follows broadcasting rules.

In [145]: a = torch.tensor([[1,2], [3,4]])

In [146]: b = torch.tensor([1,2])

In [147]: a + b
Out[147]:
tensor([[2, 4],
        [4, 6]])

In [148]: c = torch.tensor([[1,],[2]])

In [149]: a + c
Out[149]:
tensor([[2, 3],
        [5, 6]])
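
The broadcasting rule used in the two additions above can be written out in a few lines of plain Python. The function name broadcast_shape is hypothetical and exists only for this sketch: shapes are aligned from the right, and each pair of sizes must either match or contain a 1.

```python
def broadcast_shape(shape_a, shape_b):
    """Return the result shape of an element-wise op, or raise if incompatible."""
    n = max(len(shape_a), len(shape_b))
    # Pad the shorter shape with leading 1s so both have the same length.
    a = (1,) * (n - len(shape_a)) + tuple(shape_a)
    b = (1,) * (n - len(shape_b)) + tuple(shape_b)
    result = []
    for size_a, size_b in zip(a, b):
        if size_a == size_b or size_b == 1:
            result.append(size_a)
        elif size_a == 1:
            result.append(size_b)
        else:
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
    return tuple(result)


print(broadcast_shape((2, 2), (2,)))    # (2, 2): b is reused for every row
print(broadcast_shape((2, 2), (2, 1)))  # (2, 2): c is reused for every column
```

A pair such as (2, 3) and (2,) raises an error, because the trailing sizes 3 and 2 neither match nor contain a 1.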

Simple function operations: torch.exp, torch.sin, torch.cos

In [109]: torch.exp(torch.tensor([0, np.log(2)]))
Out[109]: tensor([1., 2.])

In [110]: torch.tensor([0, np.log(2)]).exp()
Out[110]: tensor([1., 2.])

In [111]: torch.sin(torch.tensor([0, np.pi]))
Out[111]: tensor([ 0.0000e+00, -8.7423e-08])

In [112]: torch.cos(torch.tensor([0, np.pi]))
Out[112]: tensor([ 1., -1.])

In-place operations (the trailing underscore modifies the tensor itself): tensor.add_, tensor.sub_, tensor.abs_

In [224]: a
Out[224]: tensor([1, 2, 3])

In [225]: b
Out[225]: tensor(1)

In [226]: a.add(b)
Out[226]: tensor([2, 3, 4])

In [227]: a
Out[227]: tensor([1, 2, 3])

In [228]: a.add_(b)
Out[228]: tensor([2, 3, 4])

In [229]: a
Out[229]: tensor([2, 3, 4])

In [236]: c.abs()
Out[236]: tensor([0.5161, 0.1732, 1.0162])

In [237]: c
Out[237]: tensor([ 0.5161, -0.1732,  1.0162])

In [238]: c.abs_()
Out[238]: tensor([0.5161, 0.1732, 1.0162])

In [239]: c
Out[239]: tensor([0.5161, 0.1732, 1.0162])

In [240]: c.zero_()
Out[240]: tensor([0., 0., 0.])

In [241]: c
Out[241]: tensor([0., 0., 0.])

Statistical operations: tensor.max, tensor.min, tensor.mean, tensor.median, tensor.argmax

In [242]: a
Out[242]: tensor([ 0.5161, -0.1732,  1.0162])

In [243]: a.max()
Out[243]: tensor(1.0162)

In [246]: a
Out[246]:
tensor([[ 0.3337, -0.5011, -1.4319, -0.6633],
        [ 0.6620,  1.3154, -0.9129,  0.4685],
        [ 0.3203, -1.6496,  1.1967, -0.3174]])

In [247]: a.max()
Out[247]: tensor(1.3154)

In [248]: a.max(dim=0) # dim=0: maximum of each column; dim=1: maximum of each row
Out[248]:
torch.return_types.max(
values=tensor([0.6620, 1.3154, 1.1967, 0.4685]),
indices=tensor([1, 1, 2, 1]))

In [249]: a.max(dim=0)[0]
Out[249]: tensor([0.6620, 1.3154, 1.1967, 0.4685])

In [250]: a.max(dim=0)[1]
Out[250]: tensor([1, 1, 2, 1])
    
In [251]: a.argmax()
Out[251]: tensor(5)

In [252]: a.argmax(dim=0)
Out[252]: tensor([1, 1, 2, 1])
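
The meaning of dim is easy to mix up, so here is a small sketch with hand-picked values (not taken from the session above) where the results can be checked by eye:

```python
import torch

a = torch.tensor([[1., 5., 2.],
                  [4., 3., 6.]])

col_max, col_idx = a.max(dim=0)  # collapse the rows: one result per column
row_max, row_idx = a.max(dim=1)  # collapse the columns: one result per row
flat_idx = a.argmax()            # no dim: index into the flattened tensor

print(col_max, col_idx)  # tensor([4., 5., 6.]) tensor([1, 0, 1])
print(row_max, row_idx)  # tensor([5., 6.]) tensor([1, 2])
print(flat_idx)          # tensor(5)
```

The dimension passed as dim is the one that disappears from the result shape: a has shape [2, 3], so dim=0 leaves 3 values and dim=1 leaves 2.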

As the sections above show, most torch operations closely mirror their NumPy counterparts

For more tensor operations, see the official documentation

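III. Gradient descent and backpropagation

1. Gradient descent and derivatives

Gradient descent is fairly simple, so it is not repeated here; see this article for reference

2. The backpropagation algorithm

3.1 Computation graphs and backpropagation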

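Computation graph: a graph that describes how a function is computed

Take the following formula
(formula image not available)
Drawn as a computation graph, it can be represented as:

Once it is drawn as a computation graph, the forward computation is easy to follow

Next, taking the partial derivative at each node gives
(image not available)
Backpropagation is then a right-to-left pass over the graph above: the partial derivative with respect to each of the variables $a, b, c$ is the product of the gradients along the edges connecting it to the output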
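
Because the original formula image is not available, the sketch below uses an assumed function, J = 3(a + b*c), purely for illustration. It runs the forward pass node by node and then multiplies local derivatives from right to left, as described above:

```python
# Assumed example function (not taken from the missing figure): J = 3 * (a + b * c)
a, b, c = 5.0, 3.0, 2.0

# Forward pass, one graph node at a time
u = b * c   # 6.0
v = a + u   # 11.0
J = 3 * v   # 33.0

# Local derivative on each edge of the graph
dJ_dv = 3.0
dv_da = 1.0
dv_du = 1.0
du_db = c
du_dc = b

# Backward pass: multiply the edge gradients along the path to each input
dJ_da = dJ_dv * dv_da            # 3.0
dJ_db = dJ_dv * dv_du * du_db    # 6.0
dJ_dc = dJ_dv * dv_du * du_dc    # 9.0

print(J, dJ_da, dJ_db, dJ_dc)    # 33.0 3.0 6.0 9.0
```

The intermediate values u and v from the forward pass are reused in the backward pass, which is the reason frameworks record the graph while computing forward.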

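3.2 Backpropagation in neural networks

See this article or the "flower book" (Deep Learning by Goodfellow, Bengio and Courville); a conceptual understanding is enough, because PyTorch computes gradients automatically.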

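IV. Automatic differentiation in PyTorch

1. Forward computation

If a tensor in PyTorch has its .requires_grad attribute set to True, PyTorch tracks every operation performed on it. Put another way, the tensor is a parameter: its gradient will be computed later and used to update it.

1.1 The computation

Assume the following setup (the factor 1/4 takes the mean, because x has 4 elements), and use torch to carry out the forward computation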

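$out = \frac{1}{4}\sum_i z_i$, where $z_i = 3(x_i + 2)^2$

If x is a parameter, its gradient must be computed so that it can be updated.

So when x is first initialized, its requires_grad attribute must be set to True; the default is False.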

import torch
x = torch.ones(2, 2, requires_grad=True)  # initialize the parameter x with requires_grad=True so its computation history is tracked
print(x)
#tensor([[1., 1.],
#        [1., 1.]], requires_grad=True)

y = x+2
print(y)
#tensor([[3., 3.],
#        [3., 3.]], grad_fn=<AddBackward0>)

z = y*y*3  # square, then multiply by 3
print(z)
#tensor([[27., 27.],
#        [27., 27.]], grad_fn=<MulBackward0>) 

out = z.mean() # take the mean
print(out)
#tensor(27., grad_fn=<MeanBackward0>)

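From the code above:

x has requires_grad set to True
Each subsequent computation produces a result whose grad_fn attribute records the operation that created it
Chaining these functions through grad_fn forms a computation graph like the one in the previous section

1.2 requires_grad and grad_fn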

a = torch.randn(2, 2)
a = ((a * 3) / (a - 1))
print(a.requires_grad)  # False
a.requires_grad_(True)  # modify in place
print(a.requires_grad)  # True
b = (a * a).sum()
print(b.grad_fn) # <SumBackward0 object at 0x4e2b14345d21>
with torch.no_grad():
    c = (a * a).sum()  # tensor(151.6830); c has no grad_fn here
    
print(c.requires_grad) # False

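Note:
To avoid tracking history (and using the memory it requires), wrap the code block in with torch.no_grad():. This is especially useful when evaluating a model: the model may have trainable parameters with requires_grad = True, but no gradients need to be computed for them during evaluation.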
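
A short side-by-side sketch of the same computation with and without history tracking. The tensor w stands in for a trainable parameter; the names are illustrative only:

```python
import torch

w = torch.randn(3, requires_grad=True)  # stands in for a trainable parameter
x = torch.randn(3)

tracked = (w * x).sum()        # built normally: history is recorded
with torch.no_grad():
    untracked = (w * x).sum()  # same arithmetic, no history recorded

print(tracked.requires_grad, tracked.grad_fn is not None)  # True True
print(untracked.requires_grad, untracked.grad_fn is None)  # False True
```

Both results hold the same value; only the tracked one can be used as the starting point for backward().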

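2. Gradient computation

For out from section 1.1, we can call the backward method to run backpropagation and compute the gradient

out.backward() computes the derivative $\frac{d\,out}{dx}$; afterwards, x.grad holds its value

This gives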

tensor([[4.5000, 4.5000],
        [4.5000, 4.5000]])

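Because:
$\frac{\partial\, out}{\partial x_i} = \frac{1}{4} \cdot 6(x_i + 2) = \frac{3}{2}(x_i + 2)$
which equals 4.5 when $x_i = 1$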
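
The backward call described above is not shown as code, so here is a minimal sketch that repeats the forward computation from section 1.1 and compares the autograd result with the hand-derived value of 1.5 * (x + 2):

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
out = (3 * (x + 2) ** 2).mean()  # same computation as y, z, out in section 1.1

out.backward()                   # fills x.grad with d(out)/dx

analytic = 1.5 * (x.detach() + 2)  # derivative worked out by hand

print(x.grad)
# tensor([[4.5000, 4.5000],
#         [4.5000, 4.5000]])
print(torch.allclose(x.grad, analytic))  # True
```

The two agree, which is a quick way to sanity-check a derivation against autograd.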

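Note: when the output is a scalar, we can call backward() on the output tensor directly; when the output is a vector, backward() must be given an additional argument.

In most cases the loss function is a scalar, so the vector case is not covered here.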

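loss.backward() computes the gradient of the loss with respect to every parameter (every tensor with requires_grad=True) and accumulates it into that parameter's .grad, for example x.grad; the parameter values themselves have not been updated at this point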
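
Because gradients are accumulated rather than overwritten, calling backward twice without resetting doubles the stored value. A minimal sketch with a single illustrative parameter w:

```python
import torch

w = torch.tensor([1.0], requires_grad=True)

(w * 2).sum().backward()
first = w.grad.clone()     # tensor([2.])

(w * 2).sum().backward()   # no zeroing: the new gradient is added on top
second = w.grad.clone()    # tensor([4.])

w.grad.zero_()             # reset before the next backward pass
(w * 2).sum().backward()
third = w.grad.clone()     # tensor([2.])

print(first, second, third)
```

This is why training loops zero the gradients once per iteration, before calling backward.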

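Notes:
1. tensor.data:

When the tensor has requires_grad=False, tensor.data is equivalent to the tensor itself
When requires_grad=True, tensor.data returns only the tensor's data, without its gradient-tracking history

2. tensor.numpy():

A tensor with requires_grad=True cannot be converted directly; use tensor.detach().numpy()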
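
A small sketch of the second note, showing the failure and the detach() workaround on an illustrative tensor t:

```python
import torch

t = torch.ones(2, requires_grad=True)

direct_failed = False
try:
    t.numpy()                 # not allowed while t requires grad
except RuntimeError as err:
    direct_failed = True
    print("numpy() failed:", err)

arr = t.detach().numpy()      # detach first, then convert
print(arr)                    # [1. 1.]
```

detach() returns a tensor that shares the same data but is cut off from the computation graph, so no copy is made for the conversion.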

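V. Linear regression with PyTorch

1. Implementing linear regression

Next, we use torch to implement a simple linear regression on a self-made dataset

Assume the base model is y = wx+b, where w and b are both parameters. We construct the data x, y from y = 3x+0.8, so the trained model should end up with w and b close to 3 and 0.8 respectively

Prepare the data
Compute the predictions
Compute the loss, zero the parameter gradients, and backpropagate
Update the parameters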

import torch
import numpy as np
from matplotlib import pyplot as plt


#1. Prepare the data y = 3x+0.8 and the parameters
x = torch.rand([50])
y = 3*x + 0.8

w = torch.rand(1,requires_grad=True)
b = torch.rand(1,requires_grad=True)

def loss_fn(y,y_predict):
    loss = (y_predict-y).pow(2).mean()
    for i in [w,b]:
        # zero the gradients before every backward pass
        if i.grad is not None:
            i.grad.data.zero_()
    # [i.grad.data.zero_() for i in [w,b] if i.grad is not None]
    loss.backward()
    return loss.data

def optimize(learning_rate):
    # print(w.grad.data,w.data,b.data)
    w.data -= learning_rate* w.grad.data
    b.data -= learning_rate* b.grad.data

for i in range(3000):
    #2. Compute the predictions
    y_predict = x*w + b

    #3. Compute the loss, zero the parameter gradients, and backpropagate
    loss = loss_fn(y,y_predict)
    
    if i%500 == 0:
        print(i,loss)
    #4. Update the parameters w and b
    optimize(0.01)

# Plot the predictions against the true values after training
predict =  x*w + b  # predictions from the trained w and b

plt.scatter(x.data.numpy(), y.data.numpy(),c = "r")
plt.plot(x.data.numpy(), predict.data.numpy())
plt.show()

print("w",w)
print("b",b)

Results:
0 tensor(3.8616)
500 tensor(0.0551)
1000 tensor(0.0185)
1500 tensor(0.0062)
2000 tensor(0.0021)
2500 tensor(0.0007)

w tensor([2.9436], requires_grad=True)
b tensor([0.8340], requires_grad=True)
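
To make explicit what loss.backward() computes in the loop above, the same training can be sketched in plain Python with the gradients of the mean squared error written out by hand. The data grid, starting values and learning rate below are assumptions chosen so the run is deterministic; they differ from the random setup in the text:

```python
# Deterministic data on the same line as in the text: y = 3x + 0.8
xs = [i / 50 for i in range(50)]
ys = [3 * x + 0.8 for x in xs]
n = len(xs)

w, b = 0.0, 0.0      # assumed starting values
learning_rate = 0.1  # assumed; larger than the 0.01 used above


def mse(w, b):
    """Mean squared error of the line w*x + b on the data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n


initial_loss = mse(w, b)

for step in range(2000):
    errors = [w * x + b - y for x, y in zip(xs, ys)]
    # d(loss)/dw = mean(2 * error * x), d(loss)/db = mean(2 * error)
    grad_w = sum(2 * e * x for e, x in zip(errors, xs)) / n
    grad_b = sum(2 * e for e in errors) / n
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

final_loss = mse(w, b)
print(w, b)  # close to 3 and 0.8
```

With autograd, the two grad_ lines are what backward() works out automatically, which matters once the model is too large to differentiate by hand.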

VI. Linear regression with PyTorch model components

import torch
from torch import nn
from torch import optim
import numpy as np
import matplotlib.pyplot as plt


x = torch.rand(50,1)
y = x*3+0.8

class Lr(nn.Module):
    def __init__(self):
        super(Lr,self).__init__()
        self.linear = nn.Sequential(
            nn.Linear(1,8),
            nn.Linear(8,1)
        )
        
        
    def forward(self,x):
        out = self.linear(x)
        return out


model = Lr()
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=1e-3)

epoch = 5000
for i in range(epoch):
    out = model(x)
    loss = criterion(out, y)  # MSELoss takes (prediction, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if (i+1) % 100 == 0:
        print("Epoch {}/{}, Loss: {:.6f}".format(i+1, epoch, loss.item()))

model.eval()
predict = model(x)
output = predict.data.numpy()
plt.scatter(x.data.numpy(), y.data.numpy())
plt.plot(x.data.numpy(), output, c='r')
plt.show()

Result:
(the output plot, showing the data points and the fitted line in red, is not available)

Using the GPU:

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# then move x, y and model to the device with .to(device)
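
Spelled out, that might look like the sketch below. It assumes a single nn.Linear layer and a shorter training run than the example above, purely to keep the illustration small, and it falls back to the CPU when no GPU is present:

```python
import torch
from torch import nn

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Move the data and the model to the chosen device.
x = torch.rand(50, 1).to(device)
y = x * 3 + 0.8                      # computed from x, so already on the device
model = nn.Linear(1, 1).to(device)   # simplified model (assumption)

criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for _ in range(200):
    loss = criterion(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Bring predictions back to the CPU before converting to NumPy for plotting.
with torch.no_grad():
    predict = model(x).cpu().numpy()

print(predict.shape)  # (50, 1)
```

The .cpu() call matters on GPU machines: a CUDA tensor cannot be converted to a NumPy array directly.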
