Problems Encountered While Learning PyTorch Syntax

chagelo

Category: Machine Learning. Tags: broadcasting, pytorch

Article link: https://blog.csdn.net/UoweMee/article/details/119767587

Broadcasting

# -*- coding: utf-8 -*-
import torch
import math

# x has shape (2000,)
x = torch.linspace(-math.pi, math.pi, 2000)
y = torch.sin(x)

p = torch.tensor([1, 2, 3])
xx = x.unsqueeze(-1).pow(p)
# In the above code, x.unsqueeze(-1) has shape (2000, 1), and p has shape
# (3,), for this case, broadcasting semantics will apply to obtain a tensor
# of shape (2000, 3)

The following is an example of broadcasting:
[Figure: broadcasting example; image not preserved]
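
Since the figure is missing, here is a minimal sketch of the same idea in code. The tensors `col` and `row` are made up for illustration; the second half repeats the pow() call above on a smaller input range to confirm the (2000, 3) result shape.

```python
import torch

# Shapes are aligned from the trailing dimension; a size-1 dimension is stretched.
col = torch.tensor([[1.0], [2.0], [3.0]])   # shape (3, 1)
row = torch.tensor([10.0, 20.0])            # shape (2,), treated as (1, 2)

out = col + row                             # broadcast to shape (3, 2)
print(out.shape)  # torch.Size([3, 2])

# The same rule explains the pow() call: (2000, 1) with (3,) -> (2000, 3)
x = torch.linspace(-1.0, 1.0, 2000)
p = torch.tensor([1, 2, 3])
xx = x.unsqueeze(-1).pow(p)
print(xx.shape)   # torch.Size([2000, 3])

# torch.broadcast_shapes reports the result shape without allocating any data
print(torch.broadcast_shapes((2000, 1), (3,)))  # torch.Size([2000, 3])
```

Column j of xx holds x raised to the power p[j], which is why this trick is handy for building polynomial features.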

Functions

Loss functions

The loss functions take largely the same parameters.

For example, nn.MSELoss has a reduction parameter:

import torch
import torch.nn as nn

a = torch.tensor([[1, 2], [3, 4]], dtype=torch.float)
b = torch.tensor([[3, 5], [8, 6]], dtype=torch.float)

# reduction='none': keep the element-wise squared errors
loss_fn1 = nn.MSELoss(reduction='none')
loss1 = loss_fn1(a, b)
print(loss1)   # Output: tensor([[ 4.,  9.],
               #                 [25.,  4.]])

# reduction='sum': add up all squared errors
loss_fn2 = nn.MSELoss(reduction='sum')
loss2 = loss_fn2(a, b)
print(loss2)   # Output: tensor(42.)

# reduction='mean' (the default): average over all elements
loss_fn3 = nn.MSELoss(reduction='mean')
loss3 = loss_fn3(a, b)
print(loss3)   # Output: tensor(10.5000)

Why Dataset and DataLoader exist

Code for processing data samples can get messy and hard to maintain; we ideally want our dataset code to be decoupled from our model training code for better readability and modularity. PyTorch provides two data primitives: torch.utils.data.DataLoader and torch.utils.data.Dataset that allow you to use pre-loaded datasets as well as your own data. Dataset stores the samples and their corresponding labels, and DataLoader wraps an iterable around the Dataset to enable easy access to the samples.

When iterating over a DataLoader, we usually use enumerate:

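import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy data so the loops below run: 8 samples with 3 features and integer labels
train_DataLoader = DataLoader(TensorDataset(torch.randn(8, 3), torch.arange(8)), batch_size=4)

# x is the batch index; y is commonly a list of several tensors
for x, y in enumerate(train_DataLoader):
    print(x, y)
# x is the batch index; y and z are commonly two tensors (features and labels)
for x, (y, z) in enumerate(train_DataLoader):
    print(x, y, z)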

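A custom dataset must implement __getitem__, which returns a feature and its label; the data itself is loaded from a file path beforehand. The full feature tensor is usually 2-D, but the full label tensor must be 1-D. Otherwise training raises an error: indexing a 2-D label tensor still yields labels with an extra dimension, so a batch of targets has shape (batch, 1), while a classification loss such as nn.CrossEntropyLoss expects targets of shape (batch,).
For a classification problem such as digit image recognition, the loss function takes an input and a target. If the images fall into 10 classes (the raw labels are not necessarily 0-9; they could be 10-19), the labels must be mapped into the range [0, 9].
The labels in the dataset must be of type LongTensor.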
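
The three label requirements above can be sketched as follows. This assumes nn.CrossEntropyLoss as the loss (the post does not name one), and the raw label values 10..19 and the random logits are hypothetical stand-ins for real data and model output.

```python
import torch
import torch.nn as nn

num_classes = 10
logits = torch.randn(4, num_classes)       # stand-in for model output: (batch, classes)

# Hypothetical raw labels in 10..19, stored as a column of shape (4, 1)
raw = torch.tensor([[10], [13], [19], [11]])

# Shift into [0, num_classes - 1], flatten to 1-D, and cast to int64 (LongTensor)
target = (raw - 10).reshape(-1).long()

loss = nn.CrossEntropyLoss()(logits, target)
print(target.shape, target.dtype, loss.item())
```

Doing this conversion once inside the dataset keeps the training loop free of label bookkeeping.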

Custom dataset

import torch
from scipy.io import loadmat
from torch.utils.data import Dataset, DataLoader

class ImagDataset(Dataset):
    def __init__(self):
        path = '/content/drive/MyDrive/colab/data/NUMIMG/ex4data1.mat'
        data = loadmat(path)
        # Features as float32
        self.data_x = torch.from_numpy(data['X']).float()
        # Labels as a 1-D int64 tensor, shifted down by 1 so they start at 0
        self.data_y = torch.from_numpy(data['y']).long().reshape(-1) - 1
        self.sample_num = len(data['X'])

    def __getitem__(self, index):
        return self.data_x[index], self.data_y[index]

    def __len__(self):
        return self.sample_num

dataset = ImagDataset()
dataloader = DataLoader(dataset=dataset, batch_size=32, shuffle=True)

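DataLoader's shuffle

In practice, shuffle=True shuffles the samples both within and across batches: the whole dataset is permuted before it is split into batches. A fresh shuffle happens at every epoch.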
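
A small experiment along these lines, using a toy dataset of the integers 0..9 (an assumption for illustration, not the data from this post):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
toy_dataset = TensorDataset(torch.arange(10))
loader = DataLoader(toy_dataset, batch_size=4, shuffle=True)

epochs = []
for epoch in range(2):
    # Each pass over the loader draws a fresh permutation of all sample indices
    order = torch.cat([batch[0] for batch in loader]).tolist()
    epochs.append(order)
    print(epoch, order)
```

Every epoch visits each sample exactly once, and the order printed for the two epochs will usually differ.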

Automatic differentiation

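import torch

# backward() needs a floating-point tensor with requires_grad=True
x = torch.arange(4.0, requires_grad=True)
y = x * x
# detach() makes u a constant with respect to x
u = y.detach()
z = u * x
z.sum().backward()
print(x.grad == u)
# tensor([True, True, True, True])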
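
To see what detach() changes, the sketch below compares the gradient with and without it. The names `grad_detached` and `grad_full` are introduced here for illustration only.

```python
import torch

x = torch.arange(4.0, requires_grad=True)

# With detach(): u is a constant, so d(u * x)/dx = u = x^2
u = (x * x).detach()
(u * x).sum().backward()
grad_detached = x.grad.clone()

# Without detach(): z = x^3, so dz/dx = 3 * x^2
x.grad.zero_()
(x * x * x).sum().backward()
grad_full = x.grad.clone()

print(grad_detached)  # tensor([0., 1., 4., 9.])
print(grad_full)      # tensor([ 0.,  3., 12., 27.])
```

Note the call to x.grad.zero_() between the two backward passes: gradients accumulate in x.grad otherwise.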

torch.flatten()&nn.Flatten()

torch.flatten(input, start_dim=0, end_dim=-1) → Tensor

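Flattens input over the dimensions from start_dim through end_dim (inclusive). If start_dim equals end_dim, nothing is flattened.

torch.nn.Flatten(start_dim=1, end_dim=-1): note that start_dim defaults to 1 here, so the batch dimension is kept.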
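
A quick shape check makes the difference between the two defaults concrete. The (2, 3, 4, 5) tensor is an arbitrary example.

```python
import torch
import torch.nn as nn

t = torch.zeros(2, 3, 4, 5)

all_flat = torch.flatten(t)                          # defaults: start_dim=0, end_dim=-1
middle = torch.flatten(t, start_dim=1, end_dim=2)    # merges dims 1 and 2
same = torch.flatten(t, start_dim=2, end_dim=2)      # start_dim == end_dim
module_flat = nn.Flatten()(t)                        # defaults: start_dim=1, end_dim=-1

print(all_flat.shape)     # torch.Size([120])
print(middle.shape)       # torch.Size([2, 12, 5])
print(same.shape)         # torch.Size([2, 3, 4, 5])
print(module_flat.shape)  # torch.Size([2, 60])
```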

Upsampling and downsampling

1. torch.nn.Upsample(size=None, scale_factor=None, mode='nearest', align_corners=None)
2. torch.nn.UpsamplingNearest2d(size=None, scale_factor=None)
3. torch.nn.functional.interpolate(input, size=None, scale_factor=None, mode='nearest', align_corners=None, recompute_scale_factor=None)
Here size is the size of the output.
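
As a rough illustration, the sketch below upsamples a tiny 2x2 input with both the module and the functional form, and downsamples it with a scale factor below 1. The input values are arbitrary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

img = torch.arange(4.0).reshape(1, 1, 2, 2)   # (batch, channels, H, W)

# Give either scale_factor or size, not both
up_module = nn.Upsample(scale_factor=2, mode='nearest')(img)
up_func = F.interpolate(img, size=(4, 4), mode='nearest')

# A scale factor below 1 downsamples
down = F.interpolate(img, scale_factor=0.5, mode='nearest')

print(up_module[0, 0])
print(down.shape)   # torch.Size([1, 1, 1, 1])
```

With mode='nearest', every input pixel is simply repeated in a 2x2 block of the upsampled output.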

ref:https://www.cnblogs.com/wanghui-garcia/p/11399053.html

interpolate function

transpose&permute&view&contiguous

ref :https://blog.csdn.net/xinjieyuan/article/details/105232802
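
A minimal sketch of how these four operations interact, on an arbitrary 2x3 tensor: transpose and permute only change strides, so the result is no longer contiguous, and view() then fails until contiguous() makes a copy.

```python
import torch

a = torch.arange(6).reshape(2, 3)

t = a.transpose(0, 1)          # swaps two dims; shares storage with a
p = a.permute(1, 0)            # reorders any number of dims; same result here
print(t.is_contiguous())       # False: only the strides changed

try:
    t.view(6)                  # view() needs strides compatible with the new shape
except RuntimeError as err:
    print("view failed:", err)

flat = t.contiguous().view(6)  # contiguous() copies into row-major layout first
print(flat)                    # tensor([0, 3, 1, 4, 2, 5])
```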

torch.clamp

torch.clamp(input, min, max, out=None) → Tensor

For input, every value below min becomes min and every value above max becomes max.
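
For example, with arbitrary sample values:

```python
import torch

t = torch.tensor([-2.0, -0.5, 0.0, 0.5, 2.0])

clamped = torch.clamp(t, min=-1.0, max=1.0)
print(clamped.tolist())      # [-1.0, -0.5, 0.0, 0.5, 1.0]

# Either bound can be left out to clamp on one side only
lower_only = torch.clamp(t, min=0.0)
print(lower_only.tolist())   # [0.0, 0.0, 0.0, 0.5, 2.0]
```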

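nn.Embedding

torch.nn.Embedding(num_embeddings, embedding_dim, padding_idx=None, max_norm=None, norm_type=2.0, scale_grad_by_freq=False, sparse=False, _weight=None, device=None, dtype=None)

num_embeddings is the size of the vocabulary, and every value in the input tensor must be smaller than it, i.e. lie in [0, num_embeddings - 1]. embedding_dim is the dimension of the vector for each word. The input holds the indices of words in the vocabulary.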

ref:https://discuss.pytorch.org/t/how-does-nn-embedding-work/88518/3

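nn.Embedding is essentially a variant of nn.Linear: looking up index i gives the same result as multiplying a one-hot vector for i by the weight matrix, i.e. a linear layer without bias.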
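
The equivalence can be checked with a short sketch. The vocabulary size of 5 and embedding dimension of 3 are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
emb = nn.Embedding(num_embeddings=5, embedding_dim=3)

idx = torch.tensor([[0, 2], [4, 1]])     # indices must lie in [0, 4]
out = emb(idx)
print(out.shape)                         # torch.Size([2, 2, 3])

# Same result as one-hot vectors times the weight matrix (a bias-free linear map)
one_hot = F.one_hot(idx, num_classes=5).float()
out_linear = one_hot @ emb.weight
print(torch.allclose(out, out_linear))   # True
```

The lookup avoids building the one-hot matrix, which is why nn.Embedding is the practical choice for large vocabularies.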
