[PyTorch Study Notes] Day 01 - Basic PyTorch Operations

Clown Piece

Last modified: 2023-04-03 16:05:00

First published: 2023-02-21 11:20:57

Article link: https://blog.csdn.net/oo00Z00oo/article/details/128987216

1. Creating Tensors

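In PyTorch, torch.Tensor is the main tool for storing and transforming data. A Tensor is similar to a NumPy multidimensional array, but it also supports GPU computation and automatic differentiation, which makes it better suited to deep learning.

A tensor can be viewed as a multidimensional array: a scalar is a 0-dimensional tensor, a vector is a 1-dimensional tensor, and a matrix is a 2-dimensional tensor.

There are many functions for creating Tensors; the most common ones are worth remembering.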

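Function	Description
Tensor(*sizes)	basic constructor
tensor(data)	constructor similar to np.array
ones(*sizes)	Tensor filled with 1
zeros(*sizes)	Tensor filled with 0
eye(*sizes)	1 on the diagonal, 0 elsewhere
arange(s, e, step)	from s up to (but not including) e, with step size step
linspace(s, e, steps)	from s to e, steps evenly spaced points
rand/randn(*sizes)	uniform on [0, 1) / standard normal distribution
normal(mean, std)/uniform(from, to)	normal distribution / uniform distribution
randperm(m)	random permutation of the integers 0 to m-1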
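
As a quick check of the table above, the following minimal sketch calls a few of the listed functions. The argument values are arbitrary and chosen only for illustration.

```python
import torch

a = torch.arange(0, 10, 2)         # start at 0, stop before 10, step 2
b = torch.linspace(0, 1, steps=5)  # 5 evenly spaced points, both ends included
c = torch.eye(3)                   # 3 x 3 identity matrix
d = torch.randperm(5)              # random permutation of 0..4

print(a)                   # tensor([0, 2, 4, 6, 8])
print(b)                   # tensor([0.0000, 0.2500, 0.5000, 0.7500, 1.0000])
print(c)
print(sorted(d.tolist()))  # [0, 1, 2, 3, 4]
```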

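After importing PyTorch with import torch, create an uninitialized 4 x 3 Tensor with torch.empty:

import torch

x = torch.empty(4, 3)
print(x)

torch.empty allocates memory without initializing it, so the printed values are whatever happened to be in that memory; in this run they are all zeros.
Output: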

tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])

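To get a Tensor that is guaranteed to be all zeros, use torch.zeros instead. For example, x = torch.zeros(4, 3, dtype=torch.long) creates a 4 x 3 all-zero Tensor of type long.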

A Tensor can also be created directly from data:

x = torch.tensor([4.5,3])
print(x)

Output:

tensor([4.5000, 3.0000])

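A Tensor can also be created from an existing Tensor. By default the new Tensor reuses some properties of the input, such as the data type, unless they are overridden.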

x = x.new_ones(4, 3, dtype=torch.float64) # by default the returned tensor has the same torch.dtype and torch.device
print(x)

x = torch.randn_like(x, dtype=torch.float) # override the data type
print(x)

Output:

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]], dtype=torch.float64)
tensor([[-1.1401,  1.2186, -1.2832],
        [ 0.1070,  0.2736, -0.2655],
        [-1.7234,  0.2947,  1.6515],
        [ 0.2684, -0.4540, -0.7197]])

The shape of a Tensor can be obtained with shape or size():

print(x.size())
print(x.shape)

Output:

torch.Size([4, 3])
torch.Size([4, 3])

P.S.: All of the creation methods above accept a data type (dtype) and a device (CPU/GPU) at creation time.
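
A minimal sketch of passing dtype and device at creation time. The fallback to the CPU when no GPU is present is an assumption added here so that the snippet runs on any machine.

```python
import torch

# Use the GPU when one is available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

x = torch.zeros(4, 3, dtype=torch.long, device=device)
y = torch.ones(2, 2, dtype=torch.float64, device=device)

print(x.dtype, x.device)
print(y.dtype, y.device)
```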

2. Data operations

Tensors support a wide range of operations.

2.1 Arithmetic operations

In PyTorch, the same operation can often be written in several ways. Take addition as an example.
Form 1:

x = torch.ones(4,3)
y = torch.rand(4,3)
print(x)
print(y)
print(x + y)

Form 2:

print(torch.add(x,y))

or

result = torch.empty(4,3)
torch.add(x,y,out=result)
print(result)

Form 3:

y.add_(x)
print(y)

P.S.: The in-place version of a PyTorch operation has a trailing underscore, for example x.copy_(y) and x.t_().
All three forms produce the same output:

tensor([[1.3187, 1.1892, 1.9779],
        [1.3361, 1.4684, 1.8732],
        [1.8074, 1.2719, 1.8999],
        [1.3116, 1.2101, 1.1632]])
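
To confirm that the forms agree, the sketch below computes the sum each way and compares the results with torch.equal. The variable names r1, r2 and r3 are introduced here for illustration only.

```python
import torch

x = torch.ones(4, 3)
y = torch.rand(4, 3)

r1 = x + y                 # form 1: operator
r2 = torch.add(x, y)       # form 2: function call
r3 = torch.empty(4, 3)
torch.add(x, y, out=r3)    # form 2 with a preallocated output tensor
y.add_(x)                  # form 3: in place, y now holds the sum

print(torch.equal(r1, r2), torch.equal(r1, r3), torch.equal(r1, y))
```

Note that form 3 overwrites y, so it has to come last when the forms are compared in one script.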

2.2 Indexing

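We can access part of a Tensor with NumPy-style indexing. Note that the indexed result shares memory with the original data, so modifying the result also modifies the original.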

x = torch.rand(4,3)
print(x)
y = x[0,:]   # take the first row, all columns
y += 1
print(y)
print(x[0,:]) # the source tensor has been modified too

Output:

tensor([[0.6372, 0.0263, 0.7210],
        [0.7333, 0.4041, 0.9346],
        [0.2979, 0.7893, 0.6025],
        [0.7531, 0.8830, 0.6843]])
tensor([1.6372, 1.0263, 1.7210])
tensor([1.6372, 1.0263, 1.7210])

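Besides ordinary indexing, PyTorch also provides some more advanced selection functions:

Function	Description
index_select(input, dim, index)	selects along dimension dim, e.g. certain rows or certain columns
masked_select(input, mask)	same as a[a>0]; selects with a boolean mask (a ByteTensor in older versions)
nonzero(input)	indices of the non-zero elements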
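
The following sketch applies the three functions to a small tensor. The tensor a and its values are made up for illustration.

```python
import torch

a = torch.tensor([[1, -2, 3],
                  [-4, 5, -6]])

rows = torch.index_select(a, 0, torch.tensor([1]))     # row 1
cols = torch.index_select(a, 1, torch.tensor([0, 2]))  # columns 0 and 2
pos = torch.masked_select(a, a > 0)                    # same as a[a > 0]
idx = torch.nonzero(a > 0)                             # coordinates of the positive entries

print(rows)  # tensor([[-4,  5, -6]])
print(cols)  # tensor([[ 1,  3], [-4, -6]])
print(pos)   # tensor([1, 3, 5])
print(idx)   # tensor([[0, 0], [0, 2], [1, 1]])
```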

2.3 Changing the shape

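(1) Use view(shape) to change the shape of a Tensor, giving the original Tensor different dimensions. The function returns a Tensor with the same data but a different shape. Put simply, it changes the dimensions of the matrix, much like reshape() in NumPy or TensorFlow.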

x = torch.randn(4, 4)
print(x.size())
y = x.view(16)
print(y.size())
z = x.view(-1, 8)  # -1 lets PyTorch infer the number of rows: 16 elements in 8 columns gives 2 rows
print(z.size())
m = x.view(2, 2, 4) # more dimensions are also possible
print(m.size())

Output:

torch.Size([4, 4])
torch.Size([16])
torch.Size([2, 8])
torch.Size([2, 2, 4])

Note that the product of the dimensions passed to view must equal the total number of elements in the Tensor; otherwise an error is raised.
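
A minimal sketch of this constraint: a shape whose product matches the element count works, while a mismatched shape raises a RuntimeError. The flag name failed is introduced here for illustration.

```python
import torch

x = torch.randn(4, 4)   # 16 elements
ok = x.view(2, 8)       # 2 * 8 == 16, so this works

try:
    x.view(3, 5)        # 3 * 5 == 15, which does not match 16
    failed = False
except RuntimeError as err:
    failed = True
    print("view failed:", err)
```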

(2) view(-1)
To flatten a Tensor to one dimension, simply pass -1 as the argument.

a = torch.Tensor([[1, 2, 3], [4, 5, 6]]) # define a 2 x 3 Tensor
a = a.view(-1)
print(a)

Output:

tensor([1., 2., 3., 4., 5., 6.])

The result shows that the rows have been concatenated into a single row.
In practice, you will often see this idiom:

data.contiguous().view(-1)

contiguous() ensures that the Tensor is stored contiguously in memory, which view() requires.
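
The sketch below shows one common way a Tensor becomes non-contiguous, namely a transpose, and how contiguous() makes view() usable again. The transpose is an example chosen here; the original text does not say which operation produced the non-contiguous data.

```python
import torch

x = torch.arange(6).view(2, 3)
t = x.t()                       # transpose: same storage, different strides

print(t.is_contiguous())        # False

try:
    t.view(-1)                  # view cannot flatten a non-contiguous tensor
    view_ok = True
except RuntimeError:
    view_ok = False

flat = t.contiguous().view(-1)  # copy into contiguous memory, then flatten
print(view_ok)                  # False
print(flat)                     # tensor([0, 3, 1, 4, 2, 5])
```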

(3) view_as(other)
Returns the original Tensor viewed with the same size as the given Tensor. Equivalent to self.view(other.size()).

a = torch.arange(0, 6)
a = a.view(2, 3)
print(a)
b = torch.arange(1,7)
print(b)
b = b.view_as(a)
print(b)

Output:

tensor([[0, 1, 2],
        [3, 4, 5]])
tensor([1, 2, 3, 4, 5, 6])
tensor([[1, 2, 3],
        [4, 5, 6]])

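(4) Shared data
The Tensor returned by view() may have a different shape from the source Tensor, but the two share data: changing the data of one also changes the other. view only changes the way the tensor is observed; the underlying data stays the same. To get an independent copy instead, first create one with clone() and then call view().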

x = torch.rand(2,4)
print(x)
x_cp = x.clone().view(8)
x -= 1
print(x)
print(x_cp)

Output:

tensor([[0.2856, 0.0738, 0.7755, 0.4160],
        [0.8703, 0.2018, 0.7837, 0.4459]])
tensor([[-0.7144, -0.9262, -0.2245, -0.5840],
        [-0.1297, -0.7982, -0.2163, -0.5541]])
tensor([0.2856, 0.0738, 0.7755, 0.4160, 0.8703, 0.2018, 0.7837, 0.4459])

x has changed, but its clone x_cp has not, and x_cp has the new shape produced by view.
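
For contrast, here is a small sketch that puts a plain view next to a cloned view. The names v and c are introduced here for illustration.

```python
import torch

x = torch.zeros(2, 4)
v = x.view(8)           # shares data with x
c = x.clone().view(8)   # independent copy

x += 1                  # in-place update of x

print(v)  # tensor([1., 1., 1., 1., 1., 1., 1., 1.])
print(c)  # tensor([0., 0., 0., 0., 0., 0., 0., 0.])
```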

2.4 Converting between Tensors, NumPy arrays and scalars

(1) item()
Another commonly used function is item(), which converts a single-element Tensor into a Python number:

x = torch.randn(1)
print(x)
print(x.item())

Output:

tensor([-1.2288])
-1.2288298606872559

(2) Tensor to NumPy
The numpy() method converts a Tensor directly:

a = torch.ones(5)
b = a.numpy()
print(type(a),type(b))  # a is a Tensor, b is a NumPy array
print(a, b)

Output:

<class 'torch.Tensor'> <class 'numpy.ndarray'>
tensor([1., 1., 1., 1., 1.]) [1. 1. 1. 1. 1.]

(3) NumPy to Tensor
Another way to create a Tensor is to convert a NumPy array:

a = torch.ones(5)
b = a.numpy()
c = torch.from_numpy(b)
print(a, b, c)
print(type(a), type(b), type(c))

Output:

tensor([1., 1., 1., 1., 1.]) [1. 1. 1. 1. 1.] tensor([1., 1., 1., 1., 1.])
<class 'torch.Tensor'> <class 'numpy.ndarray'> <class 'torch.Tensor'>

As you can see, c is a Tensor again.
Note also that the arrays and Tensors produced by numpy() and from_numpy() share the same memory, so changing one changes the others as well!

a = torch.ones(5)
b = a.numpy()
c = torch.from_numpy(b)
print(a, b, c)
print(type(a), type(b), type(c))
a += 1
print(a, b, c)

Output:

tensor([1., 1., 1., 1., 1.]) [1. 1. 1. 1. 1.] tensor([1., 1., 1., 1., 1.])
<class 'torch.Tensor'> <class 'numpy.ndarray'> <class 'torch.Tensor'>
tensor([2., 2., 2., 2., 2.]) [2. 2. 2. 2. 2.] tensor([2., 2., 2., 2., 2.])

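The other way to convert a NumPy array into a Tensor behaves differently: torch.tensor() copies the data, so the returned Tensor no longer shares memory with the original data: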

a = torch.ones(5)
b = a.numpy()
c = torch.tensor(b)  # copies the data, then builds a new Tensor from the copy
print(a, b, c)
print(type(a), type(b), type(c))
a += 1
print(a, b, c) # a and b share memory, but c no longer shares it with them

Output:

tensor([1., 1., 1., 1., 1.]) [1. 1. 1. 1. 1.] tensor([1., 1., 1., 1., 1.])
<class 'torch.Tensor'> <class 'numpy.ndarray'> <class 'torch.Tensor'>
tensor([2., 2., 2., 2., 2.]) [2. 2. 2. 2. 2.] tensor([1., 1., 1., 1., 1.])

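2.5 Linear algebra functions

PyTorch provides a number of linear algebra functions that can be used directly; see the official documentation for details.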

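Function	Description
trace	sum of the diagonal elements (the trace of the matrix)
diag	diagonal elements
triu/tril	upper/lower triangle of the matrix, with an optional offset
mm/bmm	matrix multiplication / batched matrix multiplication
addmm/addbmm/addmv/addr/baddbmm…	matrix operations
t	transpose
dot/cross	dot product / cross product
inverse	matrix inverse
svd	singular value decomposition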

PyTorch offers far more operations than those listed here; see the official documentation.
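
A short sketch of a few functions from the table, applied to a hand-picked 2 x 2 matrix. The matrix A and the vectors v and w are illustrative values, not from the original notes.

```python
import torch

A = torch.tensor([[1., 2.],
                  [3., 4.]])
v = torch.tensor([1., 2.])
w = torch.tensor([3., 4.])

print(torch.trace(A))   # tensor(5.)
print(torch.diag(A))    # tensor([1., 4.])
print(torch.triu(A))    # upper triangle: [[1., 2.], [0., 4.]]
print(A.t())            # transpose: [[1., 3.], [2., 4.]]
print(torch.mm(A, A))   # matrix product: [[7., 10.], [15., 22.]]
print(torch.dot(v, w))  # tensor(11.)

# A times its inverse should be close to the identity matrix.
A_inv = torch.inverse(A)
print(torch.allclose(torch.mm(A, A_inv), torch.eye(2), atol=1e-5))
```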

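3. Tensor broadcasting

When two Tensors of the same shape are combined, the operation is applied element by element. When an elementwise operation is applied to two Tensors of different shapes, broadcasting may be triggered: elements are first replicated as needed so that the two Tensors have the same shape, and the operation is then applied elementwise. For example: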

x = torch.arange(1, 3).view(1, 2)
print(x)
y = torch.arange(1, 4).view(3, 1)
print(y)
print(x + y)

Output:

tensor([[1, 2]])
tensor([[1],
        [2],
        [3]])
tensor([[2, 3],
        [3, 4],
        [4, 5]])

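As you can see, the single row of x is replicated into a second and a third row, and the single column of y is replicated into a second column. The computation proceeds like this: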

x = [1, 2] -> [1, 2]       y = [1],    ->   [1, 1]
              [1, 2]           [2],         [2, 2]
              [1, 2]           [3].         [3, 3]

x + y = [1 + 1 , 2 + 1],               [2, 3],
        [1 + 2 , 2 + 2],         =     [3, 4],
        [1 + 3 , 2 + 3].               [4, 5].
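
The expanded operands in the diagram above can be inspected with torch.broadcast_tensors, as in the sketch below. The names xb and yb are introduced here for illustration.

```python
import torch

x = torch.arange(1, 3).view(1, 2)   # shape (1, 2)
y = torch.arange(1, 4).view(3, 1)   # shape (3, 1)

# Expand both operands to the common broadcast shape (3, 2).
xb, yb = torch.broadcast_tensors(x, y)

print(xb)  # tensor([[1, 2], [1, 2], [1, 2]])
print(yb)  # tensor([[1, 1], [2, 2], [3, 3]])
print(torch.equal(x + y, xb + yb))  # True
```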

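4. Memory overhead of operations

Indexing does not allocate new memory, whereas y = x + y does: it allocates new memory and rebinds y to it. To demonstrate this, we can use Python's built-in id function. If two instances have the same id, they refer to the same object in memory; otherwise they do not.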

x = torch.tensor([1, 2])
y = torch.tensor([3, 4])
id_before = id(y)
y = y + x
print(id(y) == id_before) # check whether the id of y is the same as before

Output:

False

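To write the result into the memory of the original y, we can use the indexing introduced earlier. In the example below, the result of x + y is written into the memory of y through the [:] index, which selects all elements.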

x = torch.tensor([[1, 2],[3, 4]])
y = torch.tensor([[3, 4],[5, 6]])
id_before = id(y)
y[:] = y + x
print(y)
print(id(y) == id_before) # check whether the id of y is the same as before

Output:

tensor([[ 4,  6],
        [ 8, 10]])
True

The same effect can be achieved with the out argument of the function form or with the in-place operator += (that is, add_()), for example torch.add(x, y, out=y), y += x, or y.add_(x).

x = torch.tensor([[1, 2],[3, 4]])
y = torch.tensor([[3, 4],[5, 6]])
id_before = id(y)
torch.add(x, y, out=y) # same as y += x or y.add_(x); out= writes the result into y
print(y)
print(id(y) == id_before)

Output:

tensor([[ 4,  6],
        [ 8, 10]])
True

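Note that although the Tensor returned by view shares data with the source Tensor, it is still a new Tensor object that carries other attributes besides the data, so the two do not have the same id.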
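
One way to see the difference is to compare id(), which identifies the Python object, with data_ptr(), which returns the address of the first element of the Tensor's data. This is a sketch for illustration; the original notes use only id().

```python
import torch

x = torch.rand(2, 4)
v = x.view(8)

print(id(x) == id(v))                # False: two distinct Python objects
print(x.data_ptr() == v.data_ptr())  # True: both point at the same data
```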

5. Moving Tensors between the CPU and the GPU

The to() method moves a Tensor between the CPU and the GPU.

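x = torch.ones(2, 3)                       # example input (any existing Tensor works)
if torch.cuda.is_available():
    device = torch.device("cuda")          # GPU
    y = torch.ones_like(x, device=device)  # create a Tensor directly on the GPU
    x = x.to(device)                       # equivalent to .to("cuda")
    z = x + y
    print(z)
    print(z.to("cpu", torch.double))       # to() can also change the data type at the same time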