Creating PyTorch Tensors

jiang_huixin

Last modified 2023-08-02 16:40:32

First published 2023-05-30 22:33:59

Article link: https://blog.csdn.net/jiang_huixin/article/details/130958620

From existing data

torch.tensor(data)

>>> import torch
>>> import numpy as np

# Scalar
>>> torch.tensor(0)
tensor(0)

# List / tuple
>>> torch.tensor([[1.0]])
tensor([[1.]])

# ndarray
>>> n = np.arange(3)
>>> torch.tensor(n)
tensor([0, 1, 2])

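You can also specify the data type and the device. By default the dtype is inferred from the data itself: Python integers become torch.int64, Python floats become torch.float32 (the default floating-point dtype), and an ndarray keeps its own dtype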

# Specify the dtype and the device
>>> torch.tensor([1.0, 2.0], dtype=torch.float16, device='cuda')
tensor([1., 2.], device='cuda:0', dtype=torch.float16)
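
To make the inference rules concrete, here is a minimal sketch that runs on the CPU only. The variable names are just for illustration, and the float case assumes the global default dtype has not been changed.

```python
import numpy as np
import torch

# Python integers are inferred as torch.int64.
ints = torch.tensor([1, 2, 3])

# Python floats take the global default dtype (torch.float32 unless changed).
floats = torch.tensor([1.0, 2.0])

# A NumPy array keeps its own dtype: np.float64 becomes torch.float64.
from_numpy = torch.tensor(np.array([1.0, 2.0]))

# An explicit dtype overrides the inference.
half = torch.tensor([1.0, 2.0], dtype=torch.float16)

print(ints.dtype, floats.dtype, from_numpy.dtype, half.dtype)
```

The NumPy case is the one that tends to surprise: data that came through NumPy is usually float64, so pass dtype=torch.float32 explicitly if that is what the model expects.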

torch.tensor(data) always makes a full copy; the new Tensor never shares underlying memory with data

# No memory is shared with the ndarray
>>> n = np.array([1, 2])
>>> t = torch.tensor(n)
# Modify the ndarray
>>> n[0] = 0
>>> n
array([0, 2])
# The Tensor is unaffected
>>> t
tensor([1, 2])

torch.as_tensor(data)

torch.as_tensor(data) shares memory with data whenever it can. Sharing is only possible when data is an np.ndarray or a torch.Tensor; a list or tuple is always copied

# Share the ndarray's memory
>>> n = np.array([1, 2])
# Calls torch.from_numpy(n) under the hood
>>> t = torch.as_tensor(n)
>>> n[0] = 0
>>> n
array([0, 2])
# The Tensor changes along with it
>>> t
tensor([0, 2])

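The dtype and device can be specified during the conversion as well. If the requested type differs from that of data, the result is a copy and no memory is shared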

>>> old = torch.arange(1)
# Change the dtype during the conversion
>>> t = torch.as_tensor(old, dtype=torch.int8)
>>> old[0] = 1
>>> old
tensor([1])
# The converted data is unaffected
>>> t
tensor([0], dtype=torch.int8)
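
Instead of mutating the source and watching for changes, the sharing behaviour can also be checked directly. The sketch below is one way to do it on the CPU, using np.shares_memory together with Tensor.numpy() (which returns a view of a CPU tensor's memory); the variable names are only illustrative.

```python
import numpy as np
import torch

n = np.array([1, 2])

copied = torch.tensor(n)                              # always copies
shared = torch.as_tensor(n)                           # reuses the ndarray's buffer
converted = torch.as_tensor(n, dtype=torch.float32)   # dtype change forces a copy

# Tensor.numpy() is a view of the tensor's memory, so np.shares_memory
# tells us whether the tensor still points at n's buffer.
print(np.shares_memory(n, copied.numpy()))     # False
print(np.shares_memory(n, shared.numpy()))     # True
print(np.shares_memory(n, converted.numpy()))  # False
```

So torch.as_tensor is the cheaper choice when the data is already an ndarray of the right dtype, while torch.tensor is the safer one when the source may be modified later.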

torch.Tensor(sequence)

This instantiates a torch.Tensor object directly. The result has dtype torch.float32 by default and is stored on the CPU. The data must be a sequence; scalars are not supported

# Scalars are not supported
>>> torch.Tensor(1.0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: new(): data must be a sequence (got float)
>>> n = np.array([0, 1])
# Integers are converted to torch.float32 automatically
# To avoid the dtype conversion, use torch.LongTensor(n)
>>> torch.Tensor(n)
tensor([0., 1.])

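Besides torch.Tensor there are other tensor types that can be instantiated in the same way: CPU tensor types such as torch.IntTensor and torch.FloatTensor, and GPU tensor types such as torch.cuda.IntTensor and torch.cuda.FloatTensor

A tensor type combines a data type with a device type (CPU or GPU). torch.Tensor points to torch.FloatTensor by default (this can be changed globally), so its data type is torch.float32 and its device type is CPU

# Instantiate a CPU tensor type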
>>> torch.DoubleTensor([1, 2])
tensor([1., 2.], dtype=torch.float64)

# Instantiate a GPU tensor type
>>> torch.cuda.IntTensor([1, 2])
tensor([1, 2], device='cuda:0', dtype=torch.int32)
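
The same tensors can also be built with torch.tensor plus the dtype and device keywords, which is the spelling more commonly seen in recent code. A small sketch comparing the two, with the GPU part guarded so it also runs on a CPU-only machine:

```python
import torch

# Typed constructor vs. the dtype keyword on torch.tensor.
legacy = torch.DoubleTensor([1, 2])
modern = torch.tensor([1, 2], dtype=torch.float64)

# Tensor.type() with no arguments returns the tensor type name as a string.
print(legacy.type())
print(modern.type())
print(torch.equal(legacy, modern))

# Only touch the GPU when one is present.
if torch.cuda.is_available():
    gpu = torch.tensor([1, 2], dtype=torch.int32, device="cuda")
    print(gpu.type())
```

Both spellings produce a tensor of the same tensor type; the keyword form keeps the data type and the device as two separate choices.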

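Uninitialized data

No initial data is provided; only storage is allocated, so the values are whatever stale data happens to be in that memory

torch.empty(*sizes)
torch.Tensor(*sizes)
- Note the difference from torch.Tensor(sequence): here *sizes are positional integer arguments that give the size of each dimension
- torch.Tensor can be replaced here by another tensor type, e.g. torch.IntTensor(*sizes)

# 1 row, 2 columns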
>>> torch.empty(1, 2)
tensor([[2.9386e+29, 7.1104e-04]])

# 2 rows, 3 columns
>>> torch.cuda.IntTensor(2, 3)
tensor([[0, 0, 0],
        [0, 0, 0]], device='cuda:0', dtype=torch.int32)
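
Because torch.Tensor accepts both sizes and a sequence, the two call forms are easy to mix up. A short sketch of the difference (only shapes are printed for the uninitialized tensors, since their contents are arbitrary):

```python
import torch

by_size = torch.Tensor(2, 3)    # two integers: an uninitialized 2 x 3 tensor
by_data = torch.Tensor([2, 3])  # one sequence: a 1-D tensor holding 2.0 and 3.0
empty = torch.empty(2, 3)       # same shape, and explicit about not initializing

print(by_size.shape, by_data.shape, empty.shape)
print(by_data)
```

torch.empty states the intent more clearly than torch.Tensor(2, 3), which is one reason to prefer it when only the shape is known.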

Special tensors

Tensors filled with 1, 0, or a given value

The following functions create tensors whose elements are all 1, all 0, or all fill_value:

torch.ones(*size)
torch.zeros(*size)
torch.full(size, fill_value)

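Diagonal matrices

torch.eye(n, m=None)

A matrix with n rows and m columns (m defaults to n), with 1 on the main diagonal and 0 everywhere else

torch.diag(tensor)

An n x n diagonal matrix, where n is the length of the 1-D tensor; the main diagonal holds the elements of tensor and all other elements are 0

Code examples:

# The default dtype is torch.float32 and the default device is CPU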
>>> torch.zeros(2)
tensor([0., 0.])

# 2 rows, 1 column
>>> torch.ones(2, 1, dtype=torch.int32)
tensor([[1],
        [1]], dtype=torch.int32)

# Specify the fill value
>>> torch.full([2, 2], 3)
tensor([[3, 3],
        [3, 3]])

# 1 on the main diagonal, 0 elsewhere
>>> torch.eye(2, 3)
tensor([[1., 0., 0.],
        [0., 1., 0.]])

>>> torch.diag(torch.tensor([1, 3, 5]))
tensor([[1, 0, 0],
        [0, 3, 0],
        [0, 0, 5]])
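
A few details of these functions are worth a quick check: torch.full takes its dtype from fill_value, torch.eye is square when m is omitted, and torch.diag works in the other direction when given a 2-D tensor. The sketch below assumes a reasonably recent PyTorch release, where an integer fill_value yields an integer tensor as in the output above.

```python
import torch

# torch.full infers the dtype from fill_value.
int_full = torch.full((2, 2), 3)      # integer fill value
float_full = torch.full((2, 2), 3.0)  # float fill value

# torch.eye(n) without m gives a square n x n identity matrix.
identity = torch.eye(3)

# torch.diag builds a matrix from a 1-D tensor, and extracts the
# diagonal when it is given a 2-D tensor.
matrix = torch.diag(torch.tensor([1, 3, 5]))
diagonal = torch.diag(matrix)

print(int_full.dtype, float_full.dtype)
print(identity.shape)
print(diagonal)
```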

Sequences

torch.arange(start=0, end, step=1)

Similar to Python's range; end is excluded

torch.linspace(start, end, steps)

Picks steps evenly spaced values from start to end, with both endpoints included

$(\rm{start,start+\frac{end-start}{steps-1}, \cdots,start+(steps-2)*\frac{end-start}{steps-1},end})$

torch.logspace(start, end, steps, base=10.0)

$\rm(base^{start},base^{(start+\frac{end-start}{steps-1})},\cdots,base^{(start+(steps-2)*\frac{end-start}{steps-1})},base^{end})$

Code examples:

>>> torch.arange(3)
tensor([0, 1, 2])
>>> torch.arange(1, 3.1, 1.0)
tensor([1., 2., 3.])

>>> torch.linspace(-2, 2, 5)
tensor([-2., -1.,  0.,  1.,  2.])

# 2^(-2), 2^(-1), 2^(0), ... 2^2
>>> torch.logspace(-2, 2, steps=5, base=2)
tensor([0.2500, 0.5000, 1.0000, 2.0000, 4.0000])

As the output shows, for the same start, end and steps arguments, logspace = base ^ linspace
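
The relation between logspace and linspace can be verified numerically. This is a small sketch that reuses the arguments from the example above and also contrasts the endpoint handling of arange and linspace:

```python
import torch

start, end, steps, base = -2, 2, 5, 2.0

lin = torch.linspace(start, end, steps)
log = torch.logspace(start, end, steps, base=base)

# Raise base to each linspace value and compare element-wise.
print(torch.allclose(log, torch.pow(base, lin)))

# arange excludes end, linspace includes it.
print(torch.arange(0, 3).tolist())
print(torch.linspace(0, 3, 4).tolist())
```

torch.allclose is used rather than exact equality because the two sides may differ by floating-point rounding.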

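Random generation

Normal distribution

# Standard normal distribution
torch.randn(*size)

# Given mean and standard deviation
torch.normal(mean, std, size)

Example

# Set the random seed so that the random numbers are reproducible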
>>> _ = torch.manual_seed(2022)
>>> torch.randn(2, 3)
tensor([[ 0.1915,  0.3306,  0.2306],
        [ 0.8936, -0.2044, -0.9081]])
>>> torch.normal(mean=1.0, std=0.1, size=[2, 3])
tensor([[0.7689, 1.1635, 1.2061],
        [0.9746, 0.8488, 0.8720]])
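# Without size, the output shape is inferred from the shapes of mean and std
# The two samples are drawn from normal distributions with means 1.0 and 2.0 and standard deviation 0.1
# As expected, they land close to 1.0 and 2.0 (the standard deviation was deliberately chosen small)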
>>> torch.normal(mean=torch.Tensor([1.0, 2.0]), std=0.1)
tensor([1.0111, 2.0205])
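
With only a handful of samples it is hard to see that mean and std behave as claimed. The sketch below draws a larger sample and compares torch.normal with the equivalent shift-and-scale of torch.randn; the seed and the sample count are arbitrary choices made for this illustration.

```python
import torch

_ = torch.manual_seed(0)

mean, std, n = 1.0, 0.1, 100_000

direct = torch.normal(mean=mean, std=std, size=(n,))
# Equivalent in distribution: shift and scale standard normal samples.
scaled = mean + std * torch.randn(n)

# Both sample means should be close to 1.0 and both sample stds close to 0.1.
print(direct.mean().item(), direct.std().item())
print(scaled.mean().item(), scaled.std().item())
```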

Uniform distribution

# Uniform distribution on [0, 1)
torch.rand(*size)

Example

# Uniform distribution on [2, 4)
>>> 2 * torch.rand(2, 2) + 2
tensor([[2.4388, 2.5786],
        [3.3569, 2.9994]])
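
The scale-and-shift trick generalizes to any range from low to high. A minimal sketch, with low and high as illustrative names, alongside the in-place Tensor.uniform_ method as an alternative:

```python
import torch

low, high = 2.0, 4.0

# Scale and shift samples drawn from [0, 1).
scaled = low + (high - low) * torch.rand(1000)

# In-place alternative: fill an existing tensor from the same range.
filled = torch.empty(1000).uniform_(low, high)

print(scaled.min().item(), scaled.max().item())
print(filled.min().item(), filled.max().item())
```

The range is nominally half-open, although floating-point rounding in the scaling step could in principle produce a value equal to high.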

Random integers

# Random integers from low to high - 1, with both low and high - 1 included
torch.randint(low=0, high, size)

Example

>>> torch.randint(5, [2, 3])
tensor([[1, 0, 3],
        [1, 4, 2]])
>>> torch.randint(3, 6, [2, 2])
tensor([[5, 4],
        [4, 3]])
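
Since the exclusive upper bound is easy to get wrong, here is a quick sketch that draws many samples and inspects their range. The sample count is arbitrary.

```python
import torch

low, high = 3, 6
samples = torch.randint(low, high, (10_000,))

# high is exclusive: the largest possible value is high - 1.
print(samples.min().item(), samples.max().item())
print(samples.dtype)
```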

Random permutation

# A random permutation of 0, 1, 2, ..., n-1
torch.randperm(n)

Example

>>> _ = torch.manual_seed(2022)
>>> torch.randperm(6)
tensor([5, 1, 3, 2, 0, 4])
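
A common use of torch.randperm is shuffling the rows of a tensor. A sketch with made-up data (the data tensor and the seed are only for illustration):

```python
import torch

_ = torch.manual_seed(0)

data = torch.arange(12).reshape(6, 2)  # 6 rows of example data
perm = torch.randperm(data.shape[0])   # a random ordering of the row indices

# Indexing with the permutation reorders the rows without dropping any.
shuffled = data[perm]

print(perm)
print(shuffled)
```

Applying the same perm to a second tensor, such as the labels, keeps the two aligned after shuffling.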

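Inheriting the tensor type

t.new_*() creates a new tensor that has the same tensor type as t (both the data type and the device type)

For example

# Uninitialized
Tensor.new(*sizes)
Tensor.new_empty(size)

# All zeros
Tensor.new_zeros(size)

# All ones
Tensor.new_ones(size)

# Given initial value
Tensor.new_full(size, fill_value)

Example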

>>> t = torch.cuda.IntTensor([2])
>>> t
tensor([2], device='cuda:0', dtype=torch.int32)
# Both the data type and the device type are inherited
>>> t.new_full([1, 2], 1)
tensor([[1, 1]], device='cuda:0', dtype=torch.int32)
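
The same idea can be tried without a GPU. The sketch below picks the device at run time and also shows that keyword arguments override the inherited values:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
t = torch.tensor([2], dtype=torch.int32, device=device)

zeros = t.new_zeros((1, 2))
ones = t.new_ones((1, 2))
full = t.new_full((1, 2), 7)

# dtype and device come from t; the shape comes from the argument.
print(full)
print(zeros.dtype == t.dtype, zeros.device == t.device)

# Keyword arguments override the inherited values.
as_float = t.new_zeros((1, 2), dtype=torch.float32)
print(as_float.dtype)
```

This is handy inside functions that should work on whatever device their input lives on, without passing the device around.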

Inheriting the tensor type and shape

torch.*_like(other) creates a new tensor that has the same shape and tensor type as other

For example

# Uninitialized
torch.empty_like(other)

# All zeros
torch.zeros_like(other)

# All ones
torch.ones_like(other)

# Given initial value
torch.full_like(other, fill_value)

# Uniform distribution
torch.rand_like(other)

# Standard normal distribution
torch.randn_like(other)

# Random integers
torch.randint_like(other, low=0, high)

Example

>>> t = torch.tensor([[1, 2]], dtype=torch.int16, device='cuda')
>>> t
tensor([[1, 2]], device='cuda:0', dtype=torch.int16)
>>> f = torch.zeros_like(t)
# f inherits the shape and the tensor type of t
>>> f
tensor([[0, 0]], device='cuda:0', dtype=torch.int16)
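
A CPU-only variant of the example, which also shows that the inherited dtype can be overridden with a keyword argument. Here a floating-point dtype is passed to torch.rand_like explicitly, since uniform samples are floats and the source tensor holds integers.

```python
import torch

t = torch.tensor([[1, 2]], dtype=torch.int16)

zeros = torch.zeros_like(t)
full = torch.full_like(t, 9)

# Shape, dtype and device all follow t.
print(zeros.shape, zeros.dtype, zeros.device)
print(full)

# Keep the shape of t but request a floating-point dtype for the samples.
noise = torch.rand_like(t, dtype=torch.float32)
print(noise.shape, noise.dtype)
```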