PyTorch: Global Functions

Distribution and tensor-generation functions

Setting the seed

torch.manual_seed(0)
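
A minimal sketch of what seeding does, assuming the default CPU generator: re-seeding with the same value replays the same random sequence. The names `a` and `b` are only illustrative.

```python
import torch

torch.manual_seed(0)
a = torch.rand(3)   # first draw after seeding

torch.manual_seed(0)
b = torch.rand(3)   # same seed, so the same values are drawn again

print(torch.equal(a, b))  # True
```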

Generating an arithmetic-sequence tensor with torch.arange

x = torch.arange(0, 12, 2)

tensor([ 0,  2,  4,  6,  8, 10])
x = x.reshape(2, 3)

tensor([[ 0,  2,  4],
        [ 6,  8, 10]])

torch.randint

torch.randint(low=0, high, size, ...)

Returns a tensor filled with random integers generated uniformly between low (inclusive) and high (exclusive). The shape of the tensor is defined by the variable argument size.

Example: generate an integer tensor of size (2, 2) with values in [3, 9] (high=10 is exclusive)

torch.randint(3, 10, (2, 2))
tensor([[4, 5],
        [6, 7]])
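
To make the inclusive/exclusive bounds concrete, here is a small sketch that draws many samples and checks the observed range. The sample count of 1000 and the variable names are arbitrary choices for illustration.

```python
import torch

torch.manual_seed(0)
samples = torch.randint(3, 10, (1000,))  # low=3 is inclusive, high=10 is exclusive

smallest = samples.min().item()
largest = samples.max().item()
print(smallest >= 3, largest <= 9)  # True True
print(samples.dtype)                # torch.int64
```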

torch.randint_like(input, low=0, high, ...)

torch.randint_like(input, 0, 100)
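
As a quick sketch, torch.randint_like takes its shape and dtype from the input tensor. The template tensor `base` below is a hypothetical stand-in for `input`.

```python
import torch

base = torch.zeros(2, 3, dtype=torch.int64)  # hypothetical template tensor
r = torch.randint_like(base, 0, 100)         # random integers in [0, 100)

print(r.shape, r.dtype)  # same shape and dtype as base
```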

torch.randn

torch.randn(*size, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor

out_i ∼ N(0, 1)

Example

>>> torch.randn(4)
tensor([-2.1436,  0.9966,  2.3426, -0.6366])
>>> torch.randn(2, 3)
tensor([[ 1.5954,  2.8929, -1.0923],
        [ 1.1719, -0.4709, -0.1996]])

# Generate a 3-D tensor of shape (4, 3, 2)

input = torch.randn(4, 3, 2)

torch.normal

torch.normal(mean, std, *, generator=None, out=None) → Tensor

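torch.normal creates tensors of normally distributed data in four modes:

(1) mean is a tensor, std is a tensor

(2) mean is a scalar, std is a scalar (size must then be given)

(3) mean is a scalar, std is a tensor

(4) mean is a tensor, std is a scalar

[Deep Learning from Scratch, PyTorch Notes (3): Tensor Creation (Part 2)]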
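
The four modes can be sketched as follows. The concrete means and standard deviations are made-up values for illustration; only the call shapes matter.

```python
import torch

mean_t = torch.tensor([0.0, 10.0, 20.0])  # per-element means (illustrative)
std_t = torch.tensor([1.0, 2.0, 3.0])     # per-element standard deviations (illustrative)

a = torch.normal(mean_t, std_t)           # (1) tensor mean, tensor std
b = torch.normal(2.0, 3.0, size=(1, 4))   # (2) scalar mean, scalar std: size is required
c = torch.normal(0.0, std_t)              # (3) scalar mean, tensor std
d = torch.normal(mean_t, 1.0)             # (4) tensor mean, scalar std

print(a.shape, b.shape, c.shape, d.shape)
```

In modes (1), (3) and (4) the output shape follows the tensor argument, which is why only mode (2) needs an explicit size.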

torch.normal(mean, std, size, *, out=None) → Tensor
torch.normal(2, 3, size=(1, 4))
tensor([[-1.3987, -1.9544,  3.6048,  0.7909]])

[TORCH.NORMAL]

torch.zeros

The dtype can be specified.

torch.zeros((1, 0), dtype=torch.float32)

torch.ones

torch.ones(*size, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False)

torch.ones(2, 3)
tensor([[ 1.,  1.,  1.],
        [ 1.,  1.,  1.]])

torch.ones_like

torch.ones_like(input, *, dtype=None, layout=None, device=None, requires_grad=False, memory_format=torch.preserve_format)

Returns a tensor filled with the scalar value 1, with the same size as input. torch.ones_like(input) is equivalent to torch.ones(input.size(), dtype=input.dtype, layout=input.layout, device=input.device).

>>> input = torch.empty(2, 3)
>>> torch.ones_like(input)
tensor([[ 1.,  1.,  1.],
        [ 1.,  1.,  1.]])
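
The equivalence stated above can be checked with a short sketch. The float64 source tensor `src` is a hypothetical choice, used so that the inherited dtype is visible.

```python
import torch

src = torch.empty(2, 3, dtype=torch.float64)  # hypothetical source tensor
a = torch.ones_like(src)
b = torch.ones(src.size(), dtype=src.dtype, layout=src.layout, device=src.device)

print(torch.equal(a, b), a.dtype)  # True torch.float64
```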


torch.eye

torch.eye(n, m=None, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False)

Returns a 2-D tensor with ones on the diagonal and zeros elsewhere.

Parameters
n (int) – the number of rows

m (int, optional) – the number of columns with default being n

Example

torch.eye(3)
tensor([[ 1.,  0.,  0.],
        [ 0.,  1.,  0.],
        [ 0.,  0.,  1.]])
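
When m is given, the result is rectangular. A small sketch with n=2 and m=3 (values chosen only for illustration):

```python
import torch

rect = torch.eye(2, 3)  # n=2 rows, m=3 columns
print(rect)
# tensor([[1., 0., 0.],
#         [0., 1., 0.]])
```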

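Output-related functions

One-hot and multi-hot encoding

sklearn implementation

You can encode with sklearn first and then convert the result.

When using sklearn, define the encoder at init time and save it; otherwise the encoding used at train time and at predict time may not correspond, because the original labels are treated as discrete, unordered categories rather than as contiguous integers starting from 0.

[onehot - Scikit-learn: Preprocessing data]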
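
One possible way to follow this advice is sketched below. It assumes scikit-learn is installed and uses LabelBinarizer as the encoder, which is a choice made here rather than something the notes prescribe; the label values are hypothetical.

```python
import torch
from sklearn.preprocessing import LabelBinarizer

train_labels = [10, 30, 20, 10]               # hypothetical, non-contiguous labels
encoder = LabelBinarizer().fit(train_labels)  # fit once and keep this object

# Reuse the same fitted encoder at predict time so each column keeps its meaning.
encoded = torch.from_numpy(encoder.transform([30, 10]))

print(encoder.classes_)  # [10 20 30]
print(encoded)
```

Persisting the fitted encoder (for example alongside the model) is what keeps the column order stable between training and prediction.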

One-hot encoding

nn.functional.one_hot

torch.nn.functional.one_hot(tensor, num_classes=-1)

Parameters

tensor (LongTensor) – class values of any shape.

num_classes (int) – Total number of classes. If set to -1, the number of classes will be inferred as one greater than the largest class value in the input tensor.

Inferring the number of classes automatically

import torch.nn.functional as F
import torch

tensor =  torch.arange(0, 5) % 3  # tensor([0, 1, 2, 0, 1])
one_hot = F.one_hot(tensor)

# Output:
# tensor([[1, 0, 0],
#         [0, 1, 0],
#         [0, 0, 1],
#         [1, 0, 0],
#         [0, 1, 0]])

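F.one_hot infers the number of classes as the largest class value plus one (not the count of distinct values) and generates the corresponding one-hot encoding.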
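
The inference rule matters when class values are sparse. In this sketch the input has only two distinct labels, yet the encoding gets five columns; the label values are made up for illustration.

```python
import torch
import torch.nn.functional as F

labels = torch.tensor([0, 4])  # two distinct classes, but the largest value is 4
inferred = F.one_hot(labels)   # num_classes is inferred as 4 + 1 = 5

print(inferred.shape)  # torch.Size([2, 5])
```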

Specifying the number of classes

tensor =  torch.arange(0, 5) % 3  # tensor([0, 1, 2, 0, 1])
one_hot = F.one_hot(tensor, num_classes=5)

# Output:
# tensor([[1, 0, 0, 0, 0],
#         [0, 1, 0, 0, 0],
#         [0, 0, 1, 0, 0],
#         [1, 0, 0, 0, 0],
#         [0, 1, 0, 0, 0]])

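Example: generating a one-hot vector

Both methods work; only the resulting dtype differs (integer from F.one_hot, float from torch.eye).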

import torch
import torch.nn.functional as F

label_size = 5
target = 3
one_hot0 = F.one_hot(torch.tensor(target), label_size)
print(one_hot0)
one_hot1 = torch.eye(label_size)[target]
print(one_hot1)
# tensor([0, 0, 0, 1, 0])
# tensor([0., 0., 0., 1., 0.])

Multi-hot encoding

import torch.nn.functional as F
import torch

tensor = torch.tensor([[0, 2], [2, 2]])

Both methods below require first padding the label lists inside the tensor to the same length.

Direct implementation after padding

multi_hot = torch.zeros(2, 4).scatter_(1, tensor, 1)
print(multi_hot)
# tensor([[1., 0., 1., 0.],
#         [0., 0., 1., 0.]])

import torch
def get_multi_hot_label(batchs, label_size):
    # Pad first: fill every row to the longest length by repeating its first label
    batch_size = len(batchs)
    max_label_num = max(len(x) for x in batchs)
    doc_labels_extend = [[doc_label[0]] * max_label_num for doc_label in batchs]
    for i in range(batch_size):
        doc_labels_extend[i][0: len(batchs[i])] = batchs[i]
    y = torch.tensor(doc_labels_extend, dtype=torch.long)
    # Then scatter ones into the label positions to build the multi-hot tensor
    multihot_tensor = torch.zeros(batch_size, label_size).scatter_(1, y, 1)
    return multihot_tensor


print(get_multi_hot_label([[0, 1], [2]], label_size=3))
# tensor([[1., 1., 0.],
#         [0., 0., 1.]])

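Indirect implementation via one-hot after padding

# Summing counts a repeated (padded) label more than once, so clamp the result to 1
one_hot = F.one_hot(tensor, num_classes=4).sum(dim=-2).clamp(max=1)
print(one_hot)
# tensor([[1, 0, 1, 0],
#         [0, 0, 1, 0]])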

Generating multi-hot directly without padding

import torch
def get_multi_hot_label(batchs, label_size):
    multihot_list = [[1 if i in batch else 0 for i in range(label_size)] for batch in batchs]
    return torch.Tensor(multihot_list)

print(get_multi_hot_label([[0, 1], [2]], label_size=3))
# tensor([[1., 1., 0.],
#         [0., 0., 1.]])

 

torch.sigmoid

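Only one precision issue is noted here:

torch.sigmoid(torch.Tensor([10])) is displayed as tensor([1.0000]) at the default print precision (the actual value is about 0.99995), while torch.sigmoid(torch.Tensor([-89])) underflows in float32 to 0 or to a denormal value indistinguishable from 0.

For float32 inputs outside roughly [-89, 10], the result is therefore effectively one of the two extreme values, 0 or 1.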
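
A small sketch for inspecting these saturation points. It assumes float32 tensors on CPU; the exact tiny value returned for -89 may differ between builds, so no specific digits are claimed for it.

```python
import torch

high_out = torch.sigmoid(torch.tensor([10.0]))
low_out = torch.sigmoid(torch.tensor([-89.0]))

print(high_out)         # displayed rounded to 4 decimals
print(high_out.item())  # full value, slightly below 1
print(low_out.item())   # 0 or a tiny denormal number
```

Calling .item() bypasses the default 4-digit print precision, which is what makes the first value look like exactly 1.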

[TORCH.SIGMOID]

torch.topk

torch.topk(input, k, dim=None, largest=True, sorted=True, *, out=None)

Returns the k largest or k smallest values of a tensor along a given dim, together with their indices.

If dim is not given, the last dimension of the input is chosen.

Example

x = torch.arange(0, 12, 2).reshape(2, 3)
print(x)
print(torch.topk(x, 2, -1))

tensor([[ 0,  2,  4],
        [ 6,  8, 10]])
torch.return_types.topk(
values=tensor([[ 4,  2],
        [10,  8]]),
indices=tensor([[2, 1],
        [2, 1]]))
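
For the k smallest values, pass largest=False. A brief sketch on the same tensor:

```python
import torch

x = torch.arange(0, 12, 2).reshape(2, 3)
smallest = torch.topk(x, 2, dim=-1, largest=False)

print(smallest.values)   # tensor([[0, 2], [6, 8]])
print(smallest.indices)  # tensor([[0, 1], [0, 1]])
```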

torch.log

torch.log(input, *, out=None)

Returns a new tensor with the natural logarithm of the elements of input.

y_i = log_e(x_i)

torch.exp

torch.exp(input, *, out=None)

Returns a new tensor with the exponential of the elements of the input tensor input.

y_i = e^(x_i)
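
Since torch.log is the natural logarithm, it is the inverse of torch.exp for positive inputs. A minimal check, with arbitrary sample values:

```python
import torch

x = torch.tensor([1.0, 2.0, 10.0])
roundtrip = torch.exp(torch.log(x))  # e^(log_e(x)) should recover x

print(torch.allclose(roundtrip, x))    # True
print(torch.log(torch.tensor([1.0])))  # tensor([0.])
```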

Resource usage

Setting the PyTorch thread count to 1

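import os
import torch

def SetUpPytorch():
    # TORCH_THREAD_NUM defaults to 1; set it to -1 to keep PyTorch's own defaults
    torch_thread_num = int(os.getenv("TORCH_THREAD_NUM", "1"))
    if torch_thread_num != -1:
        torch.set_num_threads(torch_thread_num)
        # Must be called before any inter-op parallel work has started
        torch.set_num_interop_threads(torch_thread_num)

Note: PyTorch uses thread pools internally, so you may still observe several threads when inspecting the process.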
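
To confirm the intra-op setting took effect, torch.get_num_threads() can be read back. This sketch only touches the intra-op pool and leaves the inter-op setting alone.

```python
import torch

torch.set_num_threads(1)        # intra-op parallelism
print(torch.get_num_threads())  # 1
```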

Profiling the resource usage of PyTorch code

torch.profiler.profile(*, activities=None, schedule=None, on_trace_ready=None, record_shapes=False, profile_memory=False, with_stack=False, with_flops=False, with_modules=False, experimental_config=None, use_cuda=None)

with torch.profiler.profile(
    activities=[
        torch.profiler.ProfilerActivity.CPU,
        torch.profiler.ProfilerActivity.CUDA,
    ]
) as p:
    code_to_profile()
print(p.key_averages().table(
    sort_by="self_cuda_time_total", row_limit=-1))
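
On a machine without a GPU, a CPU-only variant may be more convenient. The following is a sketch under that assumption; `code_to_profile` here is a hypothetical workload and the row limit of 5 is arbitrary.

```python
import torch
from torch.profiler import profile, ProfilerActivity

def code_to_profile():
    # Hypothetical workload: a small matrix multiplication
    a = torch.randn(64, 64)
    return a @ a

with profile(activities=[ProfilerActivity.CPU]) as prof:
    code_to_profile()

# Sort by self CPU time since no CUDA activity is recorded
table = prof.key_averages().table(sort_by="self_cpu_time_total", row_limit=5)
print(table)
```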

Older API:

torch.autograd.profiler.profile(enabled=True, use_cuda=False, record_shapes=False)

x = torch.randn((1, 1), requires_grad=True)
with torch.autograd.profiler.profile() as prof:
    for _ in range(100):  # any normal python code, really!
        y = x ** 2
        y.backward()
# NOTE: some columns were removed for brevity
print(prof.key_averages().table(sort_by="self_cpu_time_total"))

[https://www.cnblogs.com/jiangkejie/p/13256094.html]

from: -柚子皮-
