PyTorch Study Notes 5: Basic PyTorch Operations

jeffery0628

Article link: https://blog.csdn.net/code_fighter/article/details/91613400

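Basic operations

add/minus/multiply/divide

matmul

Note: torch.mm only works for 2D matrix multiplication and is not recommended; use matmul for matrix multiplication instead.

* : element-wise multiplication of entries at the same position
matmul (@): matrix multiplication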

import torch

a = torch.rand(4,3,28,64)
b = torch.rand(4,3,64,32)
torch.matmul(a,b).shape # only the last two dims are matrix-multiplied
:torch.Size([4, 3, 28, 32])

a = torch.rand(4,3,28,64)
b = torch.rand(4,1,64,32)
torch.matmul(a,b).shape # the leading (batch) dims are broadcast
:torch.Size([4, 3, 28, 32])
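
To make the difference between `*` and `@` concrete, here is a minimal sketch. The small 2x2 values are made up for illustration and are not from the original notes.

```python
import torch

# Small tensors chosen only for illustration.
a = torch.tensor([[1., 2.], [3., 4.]])
b = torch.tensor([[5., 6.], [7., 8.]])

elementwise = a * b   # multiplies entries at the same position
matrix = a @ b        # same as torch.matmul(a, b)

# For plain 2D inputs, torch.mm and torch.matmul give the same result.
same_as_mm = torch.equal(torch.mm(a, b), matrix)

# Batched case: leading dims (4, 3) and (4, 1) broadcast to (4, 3);
# the last two dims are multiplied as (28, 64) @ (64, 32).
x = torch.rand(4, 3, 28, 64)
y = torch.rand(4, 1, 64, 32)
batched_shape = torch.matmul(x, y).shape

print(elementwise)    # values: [[5, 12], [21, 32]]
print(matrix)         # values: [[19, 22], [43, 50]]
print(same_as_mm)     # True
print(batched_shape)  # torch.Size([4, 3, 28, 32])
```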

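Basic functions

pow: a.pow(num) is equivalent to a ** num
sqrt: equivalent to pow(a,1/2)
rsqrt: the reciprocal of the square root, i.e. 1/sqrt(a)
exp()
log()
tensor.floor(): round down
tensor.ceil(): round up
tensor.trunc(): keep the integer part of a float
tensor.frac(): keep the fractional part of a float
tensor.round(): round to the nearest integer (ties go to the nearest even integer)
tensor.clamp(): clip values to a range
1. tensor.clamp(min): every value smaller than min is set to min
2. tensor.clamp(min,max): every value smaller than min is set to min, and every value larger than max is set to max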
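
The functions listed above can be tried on a few hand-picked values. This is only a sketch; the input numbers (and the name `grad` for the clamped tensor) are assumptions chosen so the results are easy to check by hand.

```python
import torch

a = torch.tensor([4.0, 9.0])
squared = a.pow(2)     # same as a ** 2      -> [16, 81]
root = a.sqrt()        # same as a.pow(0.5)  -> [2, 3]
inv_root = a.rsqrt()   # 1 / sqrt(a)         -> [0.5, 0.3333]

v = torch.tensor([3.7, -3.7])
floored = v.floor()    # [ 3, -4]
ceiled = v.ceil()      # [ 4, -3]
truncated = v.trunc()  # [ 3, -3]
fraction = v.frac()    # approximately [0.7, -0.7]
rounded = v.round()    # [ 4, -4]

# Ties are rounded to the nearest even integer.
ties = torch.tensor([0.5, 1.5, 2.5]).round()  # [0, 2, 2]

# Hypothetical gradient values, clipped from below and from both sides.
grad = torch.tensor([-2.0, 0.5, 3.0, 12.0])
clipped_low = grad.clamp(min=0)   # [0, 0.5, 3, 12]
clipped_both = grad.clamp(0, 10)  # [0, 0.5, 3, 10]
```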

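Statistics

norm

norm is not the same as normalize or batch_norm: norm computes a norm (a measure of magnitude), while normalize and batch_norm perform normalization.
Matrix norms and vector norms are also different things.

a = torch.full([8],1.) # float fill value: recent PyTorch infers an integer dtype from 1, and norm needs a floating-point tensor
b = a.view(2,4)
c = a.view(2,2,2)
a.norm(1) # 1-norm of a
:tensor(8.)
b.norm(1)
:tensor(8.)
c.norm(1)
:tensor(8.)

b.norm(2) # 2-norm of b
:tensor(2.8284)
b.norm(1,dim=1) # 1-norm of each row
:tensor([4., 4.])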
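
As a sanity check on what `norm` computes, the sketch below compares it with the definitions written out by hand. The tensor values are made up for illustration.

```python
import torch

b = torch.tensor([[3.0, -4.0],
                  [6.0, 8.0]])

l1 = b.norm(1)             # |3| + |-4| + |6| + |8| = 21
l2 = b.norm(2)             # sqrt(9 + 16 + 36 + 64) = sqrt(125)
row_l2 = b.norm(2, dim=1)  # per-row 2-norm: [5, 10]

# The same quantities spelled out from the definitions.
manual_l1 = b.abs().sum()
manual_l2 = b.pow(2).sum().sqrt()
manual_row_l2 = b.pow(2).sum(dim=1).sqrt()
```

Without `dim`, the tensor is treated as one long vector, which is why `a`, `b` and `c` in the notes all have the same 1-norm.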

mean, sum, min, max, prod

argmin and argmax: if no dim is given, the tensor is flattened to 1D first, and the index of the minimum or maximum in the flattened tensor is returned.

a = torch.arange(8).view(2,4).float()
:tensor([[0., 1., 2., 3.],
        [4., 5., 6., 7.]])
a.min(),a.max(),a.mean(),a.prod(),a.sum(),a.argmin(),a.argmax()
:(tensor(0.), tensor(7.), tensor(3.5000), tensor(0.), tensor(28.), tensor(0), tensor(7))
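
Because `argmax()` without `dim` indexes into the flattened tensor, the result sometimes has to be mapped back to a row and column. A minimal sketch, assuming a 2D tensor with made-up values:

```python
import torch

a = torch.tensor([[0., 1., 2., 3.],
                  [4., 9., 6., 7.]])

flat_idx = a.argmax().item()             # index into a.flatten(): 5
row, col = divmod(flat_idx, a.shape[1])  # (1, 1) for a row-major 2D tensor

per_row = a.argmax(dim=1)                # argmax within each row: [3, 1]
```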

argmin,argmax

a = torch.rand(4,10)
: tensor([[0.4992, 0.4095, 0.5239, 0.8184, 0.3184, 0.6433, 0.2028, 0.1133, 0.6991,0.3260],
        [0.1473, 0.2765, 0.1476, 0.2192, 0.8490, 0.7610, 0.0072, 0.6767, 0.1496, 0.2772],
        [0.0691, 0.4229, 0.6794, 0.9665, 0.3935, 0.9259, 0.3509, 0.6875, 0.8682, 0.0592],
        [0.2496, 0.3506, 0.8447, 0.2141, 0.4849, 0.2772, 0.3786, 0.6603, 0.8913, 0.1118]])
     
a.max(dim=1)
:(tensor([0.8184, 0.8490, 0.9665, 0.8913]), tensor([3, 4, 3, 8]))

a.argmax(dim=1)
:tensor([3, 4, 3, 8])

a.max(dim=1,keepdim=True) # keep the result's number of dimensions the same as a
:(tensor([[0.8184],
         [0.8490],
         [0.9665],
         [0.8913]]), tensor([[3],
         [4],
         [3],
         [8]]))
         
a.argmax(dim=1,keepdim=True)
:tensor([[3],
        [4],
        [3],
        [8]])

kthvalue,topk

kthvalue returns the k-th smallest value

a = torch.rand(4,10)
: tensor([[0.4992, 0.4095, 0.5239, 0.8184, 0.3184, 0.6433, 0.2028, 0.1133, 0.6991,0.3260],
        [0.1473, 0.2765, 0.1476, 0.2192, 0.8490, 0.7610, 0.0072, 0.6767, 0.1496, 0.2772],
        [0.0691, 0.4229, 0.6794, 0.9665, 0.3935, 0.9259, 0.3509, 0.6875, 0.8682, 0.0592],
        [0.2496, 0.3506, 0.8447, 0.2141, 0.4849, 0.2772, 0.3786, 0.6603, 0.8913, 0.1118]])

a.topk(3,dim=1)
:(tensor([[0.8184, 0.6991, 0.6433],
         [0.8490, 0.7610, 0.6767],
         [0.9665, 0.9259, 0.8682],
         [0.8913, 0.8447, 0.6603]]), tensor([[3, 8, 5],
         [4, 5, 7],
         [3, 5, 8],
         [8, 2, 7]]))

a.topk(3,dim=1,largest=False) # the 3 smallest values in each row (e.g. the lowest probabilities)
:(tensor([[0.1133, 0.2028, 0.3184],
         [0.0072, 0.1473, 0.1476],
         [0.0592, 0.0691, 0.3509],
         [0.1118, 0.2141, 0.2496]]), tensor([[7, 6, 4],
         [6, 0, 2],
         [9, 0, 6],
         [9, 3, 0]]))
         
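a.kthvalue(8,dim=1) # the 8th smallest value; with 10 values per row that is the 3rd largest
:(tensor([0.6433, 0.6767, 0.8682, 0.6603]), tensor([5, 7, 8, 7]))

a.kthvalue(8) # dim defaults to the last dimension, so the result is the same
:(tensor([0.6433, 0.6767, 0.8682, 0.6603]), tensor([5, 7, 8, 7]))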
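
The relationship between `kthvalue` and `topk` noted above (the 8th smallest of 10 is the 3rd largest) can be checked directly. This is a sketch; the seed is arbitrary and only there to make the run repeatable.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, an assumption for reproducibility
a = torch.rand(4, 10)
n = a.shape[1]
k = 3

# topk returns values sorted in descending order, so the last column
# of the top-k values is the k-th largest value in each row.
top_values, top_idx = a.topk(k, dim=1)
kth_largest = top_values[:, -1]

# The k-th largest of n values is the (n - k + 1)-th smallest.
via_kthvalue = a.kthvalue(n - k + 1, dim=1).values

agree = torch.equal(kth_largest, via_kthvalue)
```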

>,>=,<,<=,!=,==

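a>0 returns a boolean mask with the same shape as a
a>0 is equivalent to torch.gt(a,0) (torch.ge(a,0) corresponds to a>=0)
torch.eq(a,b) compares a and b element by element and returns a mask
torch.equal(a,b) compares the two tensors as a whole and returns a single True or False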
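
A short sketch of the comparison operators, using made-up values, to show the difference between an element-wise mask and a single boolean:

```python
import torch

a = torch.tensor([-1.0, 0.0, 2.0])
b = torch.tensor([-1.0, 1.0, 2.0])

mask_gt = a > 0                # same as torch.gt(a, 0): [False, False, True]
mask_ge = a >= 0               # same as torch.ge(a, 0): [False, True, True]
eq_mask = torch.eq(a, b)       # element-wise: [True, False, True]
all_equal = torch.equal(a, b)  # one Python bool for the whole tensor: False

positives = a[mask_gt]         # a mask can be used to select elements: [2.]
```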

Advanced operations (GPU)

where

torch.where(condition,a,b) -> tensor c: each element of c is taken from a where condition is True and from b otherwise

cond = torch.rand(2,2)
:tensor([[0.9019, 0.2225],
        [0.4002, 0.4745]])
        
a = torch.zeros(2,2)
:tensor([[0., 0.],
        [0., 0.]])

b = torch.ones(2,2)
:tensor([[1., 1.],
        [1., 1.]])
        
torch.where(cond>0.5,a,b) # avoids nested for loops and can run on the GPU
:tensor([[0., 1.],
        [1., 1.]])
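
To show what "avoids nested for loops" means, the sketch below writes the same selection both ways, reusing the `cond` values printed above. The loop version is only there for comparison.

```python
import torch

cond = torch.tensor([[0.9019, 0.2225],
                     [0.4002, 0.4745]])
a = torch.zeros(2, 2)
b = torch.ones(2, 2)

# Vectorized selection: take a where cond > 0.5, otherwise b.
vectorized = torch.where(cond > 0.5, a, b)

# Equivalent element-by-element loop.
looped = torch.empty(2, 2)
for i in range(2):
    for j in range(2):
        looped[i, j] = a[i, j] if cond[i, j] > 0.5 else b[i, j]

print(torch.equal(vectorized, looped))  # True
```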

Gather

torch.gather(input,dim,index,out=None) -> Tensor
input: the table to look up
dim: the dimension of input along which to look up
index: the indices to look up

prob = torch.randn(4,10)
idx = prob.topk(dim=1,k=3)
:(tensor([[1.3720, 1.0751, 1.0114],
         [2.3205, 0.9811, 0.5586],
         [1.1462, 0.9951, 0.9102],
         [1.9489, 0.9159, 0.7970]]), tensor([[9, 3, 2],
         [9, 4, 5],
         [0, 2, 9],
         [3, 8, 6]]))
 
label = torch.arange(10)+100
:tensor([100, 101, 102, 103, 104, 105, 106, 107, 108, 109])

idx = idx[1]
:tensor([[9, 3, 2],
        [9, 4, 5],
        [0, 2, 9],
        [3, 8, 6]])

torch.gather(label.expand(4,10),dim=1,index=idx.long())
:tensor([[109, 103, 102],
        [109, 104, 105],
        [100, 102, 109],
        [103, 108, 106]])
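
For `dim=1`, gather follows the rule `out[i][j] = input[i][index[i][j]]`. The sketch below checks that rule against an explicit loop, reusing the `idx` values printed above.

```python
import torch

label = torch.arange(10) + 100
idx = torch.tensor([[9, 3, 2],
                    [9, 4, 5],
                    [0, 2, 9],
                    [3, 8, 6]])

table = label.expand(4, 10)  # one copy of the lookup table per row
gathered = torch.gather(table, dim=1, index=idx)

# The same lookup written as a loop: out[i][j] = table[i][idx[i][j]].
manual = torch.empty_like(idx)
for i in range(idx.shape[0]):
    for j in range(idx.shape[1]):
        manual[i, j] = table[i, idx[i, j]]

print(torch.equal(gathered, manual))  # True
```

Since every row of `table` is the same here, plain indexing with `label[idx]` gives the same result; gather becomes necessary when each row has its own table.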

Gradient

The gradient is a vector.

How to search for minima?
$\theta_{t+1} = \theta_t - \alpha_t \nabla f(\theta_t)$
Function:
$J(\theta_1,\theta_2)=\theta_1^2+\theta_2^2$
Objective:
$\min_{\theta_1,\theta_2}J(\theta_1,\theta_2)$
Update rules:
$\theta_1 := \theta_1 - \alpha \frac{\partial}{\partial \theta_1}J(\theta_1,\theta_2)$
$\theta_2 := \theta_2 - \alpha \frac{\partial}{\partial \theta_2}J(\theta_1,\theta_2)$
Derivatives:
$\frac{\partial}{\partial \theta_1}J(\theta_1,\theta_2)=\frac{\partial}{\partial \theta_1}\theta_1^2+\frac{\partial}{\partial \theta_1}\theta_2^2=2\theta_1$
$\frac{\partial}{\partial \theta_2}J(\theta_1,\theta_2)=\frac{\partial}{\partial \theta_2}\theta_1^2+\frac{\partial}{\partial \theta_2}\theta_2^2=2\theta_2$
Saddle points and local minima can get in the way of finding the global minimum.
Factors that affect the optimizer:
1. initialization
2. learning rate (StepLR)
3. momentum
4. etc
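
The update rule above can be run by hand on $J(\theta_1,\theta_2)=\theta_1^2+\theta_2^2$. This is a minimal sketch: the starting point, the learning rate `alpha` and the number of steps are assumptions picked for illustration, not values from the notes.

```python
import torch

theta = torch.tensor([3.0, -4.0], requires_grad=True)  # arbitrary starting point
alpha = 0.1                                            # assumed learning rate

first_grad = None
for step in range(100):
    J = (theta ** 2).sum()      # J = theta_1^2 + theta_2^2
    J.backward()                # fills theta.grad with [2*theta_1, 2*theta_2]
    if step == 0:
        first_grad = theta.grad.clone()
    with torch.no_grad():
        theta -= alpha * theta.grad  # theta := theta - alpha * grad
    theta.grad.zero_()          # gradients accumulate, so reset them each step

final_J = (theta ** 2).sum().item()
print(first_grad)  # values: [6, -8], i.e. 2 * theta at the start
print(final_J)     # very close to 0
```

With this learning rate each step multiplies theta by 0.8, so it shrinks steadily toward the minimum at the origin.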

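Loss

Mean Squared Error (MSE):

1. MSE: $loss = \sum(y-\hat{y})^2$ (F.mse_loss averages over the elements by default instead of summing)
2. L2 norm: $||y-\hat{y}||_2 = \sqrt{\sum{(y-\hat{y})}^2}$; note the square root
3. torch.norm(y-pred,2) already includes the square root, so square it to recover the sum of squared errors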
Automatic differentiation: torch.autograd.grad(loss,[w1,w2,…]) and loss.backward()

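import torch.nn.functional as F

x = torch.ones(1)
w = torch.full([1],2.) # float fill value; only floating-point tensors can require gradients
w.requires_grad_() # mark w as requiring gradients
mse = F.mse_loss(x * w,torch.ones(1)) # prediction first, target second
torch.autograd.grad(mse,[w],retain_graph=True) # mse is the loss, [w] lists the parameters to differentiate; retain_graph=True keeps the graph for the backward() call below
:(tensor([2.]),)

mse.backward() # alternative: the graph records how to compute the derivatives, and backward() stores each one on the tensor that requires it, readable through tensor.grad
w.grad
:tensor([2.])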
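
The same two routes also work with more than one parameter. In this sketch the bias `b` and the input values are hypothetical additions for illustration, and the expected gradients are worked out by hand in the comments.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([3.0])
y = torch.tensor([1.0])
w = torch.tensor([2.0], requires_grad=True)
b = torch.tensor([0.5], requires_grad=True)  # hypothetical bias term

# loss = (x*w + b - y)^2 = (6 + 0.5 - 1)^2 = 5.5^2
loss = F.mse_loss(x * w + b, y)

# Route 1: ask for the gradients explicitly; they are returned as a tuple.
grad_w, grad_b = torch.autograd.grad(loss, [w, b], retain_graph=True)

# Route 2: backward() stores the gradients on w.grad and b.grad.
loss.backward()

print(grad_w, w.grad)  # both 2 * 5.5 * 3 = 33
print(grad_b, b.grad)  # both 2 * 5.5 = 11
```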

Cross Entropy Loss

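Note: softmax makes the output probabilities sum to 1.

sigmoid : $\frac{1}{1+e^{-x}}$
softmax : $S(y_i) = \frac{e^{y_i}}{\sum_j{e^{y_j}}}$
logits scores: [2.0,1.0,0.1] are passed through softmax to become probabilities, approximately [0.7,0.2,0.1]
Derivative of softmax, where $p_i$ is the output probability and $a_j$ is the input being differentiated against: $\frac{\partial p_i}{\partial a_j} = p_i(\delta_{ij} - p_j)$

a = torch.rand(3)
a.requires_grad_()
p = F.softmax(a,dim=0)
# p.backward() # if backward() will be called again on this graph later, pass retain_graph=True
torch.autograd.grad(p[1],[a],retain_graph=True)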
torch.autograd.grad(p[1],[a],retain_graph=True)
:(tensor([-0.1143,  0.2311, -0.1168]),)
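
The gradient printed above has one positive entry (at index 1) and two negative ones, which matches the standard softmax derivative $\frac{\partial p_i}{\partial a_j} = p_i(\delta_{ij} - p_j)$. The sketch below checks autograd against that formula, using the logits [2.0, 1.0, 0.1] from the notes as input.

```python
import torch
import torch.nn.functional as F

a = torch.tensor([2.0, 1.0, 0.1], requires_grad=True)
p = F.softmax(a, dim=0)

# Row i holds d p_i / d a_j for all j; retain_graph=True lets us
# differentiate through the same graph once per output.
rows = [torch.autograd.grad(p[i], [a], retain_graph=True)[0] for i in range(3)]
jac_autograd = torch.stack(rows)

# Closed form: diag(p) - p p^T, i.e. p_i * (delta_ij - p_j).
p_d = p.detach()
jac_formula = torch.diag(p_d) - p_d[:, None] * p_d[None, :]

print(torch.allclose(jac_autograd, jac_formula, atol=1e-6))  # True
```

The diagonal entries $p_i(1-p_i)$ are positive and the off-diagonal entries $-p_i p_j$ are negative.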

1. binary
2. multi-class
3. +softmax
4. covered in the Logistic Regression part
