A Summary of Tensor Dimensions in PyTorch

齐落山大勇

Last modified: 2023-03-24 14:18:29

Category: Deep Learning (PyTorch). Tags: pytorch, deep learning, artificial intelligence

First published: 2023-02-24 16:37:39

Article link: https://blog.csdn.net/qq_42982824/article/details/129203272

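In deep learning work, understanding tensor dimensions and the operations on them is critical. This post summarizes several typical cases I have run into.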

1. squeeze() and unsqueeze()

To multiply one channel of a 4×3×128×128 tensor by a 4×1×128×128 tensor, the dimensions have to be adjusted with unsqueeze() and squeeze().

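import torch

# batch size = 4, channels = 3, height = 128, width = 128
input = torch.randn(4, 3, 128, 128)
# Make a deep copy of the tensor
output_masked = input.clone()
output = torch.randn(4, 1, 128, 128)

# input[:, 0, :, :] has shape (4, 128, 128)
print(input[:, 0, :, :].shape)

# Insert a size-1 dimension at index 1: (4, 128, 128) -> (4, 1, 128, 128)
temp = input[:, 0, :, :].unsqueeze(1)
temp1 = temp * output
print("temp1.shape:{}".format(temp1.shape))
# Remove the size-1 dimension again: (4, 1, 128, 128) -> (4, 128, 128)
temp2 = temp1.squeeze()
print("temp2.shape:{}".format(temp2.shape))

# input[:, 0, :, :] has shape (4, 128, 128); a dimension must be inserted at index 1 before multiplying
output_masked[:, 0, :, :] = (input[:, 0, :, :].unsqueeze(1) * output).squeeze()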

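The output is as follows:

torch.Size([4, 128, 128])
temp1.shape:torch.Size([4, 1, 128, 128])
temp2.shape:torch.Size([4, 128, 128])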

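About the last line: input[:, 0, :, :] has shape (4, 128, 128), so a dimension must be inserted at index 1 before it can be multiplied elementwise with the (4, 1, 128, 128) tensor. The product is then squeezed back to (4, 128, 128) so that it can be assigned to output_masked[:, 0, :, :].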
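To make the need for unsqueeze() concrete, here is a minimal sketch of what happens without it, plus two related details. The variable names (`batch`, `mask`, `single`) are illustrative and not from the code above. It assumes standard PyTorch broadcasting, which aligns shapes from the trailing dimension.

```python
import torch

batch = torch.randn(4, 3, 128, 128)   # (N, C, H, W)
mask = torch.randn(4, 1, 128, 128)    # (N, 1, H, W)

# An integer index drops the channel dimension: (4, 128, 128)
channel = batch[:, 0, :, :]

# Without unsqueeze, (4, 128, 128) is broadcast as (1, 4, 128, 128),
# so the product silently becomes (4, 4, 128, 128) instead of raising an error.
wrong = channel * mask

# With unsqueeze(1) both operands are (4, 1, 128, 128).
right = channel.unsqueeze(1) * mask

# Slicing with 0:1 keeps the channel dimension, so no unsqueeze is needed.
kept = batch[:, 0:1, :, :] * mask

# squeeze() with no argument removes every size-1 dimension,
# which also removes the batch dimension when the batch size is 1.
single = torch.randn(1, 1, 128, 128)
all_squeezed = single.squeeze()    # (128, 128)
dim_squeezed = single.squeeze(1)   # (1, 128, 128)

print(wrong.shape, right.shape, kept.shape)
print(all_squeezed.shape, dim_squeezed.shape)
```

Because of the last point, squeeze(1) is usually the safer choice in a training loop where the final batch may contain a single sample.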

2. Concatenating tensors: torch.cat and torch.stack

cat simply joins tensors along an existing dimension and does not create a new one.

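x1 = torch.rand(2, 2)
x2 = torch.rand(2, 2)
print(x1)
print(x2)
# Concatenate along dim 0 (rows): (2, 2) + (2, 2) -> (4, 2)
y = torch.cat((x1, x2), dim=0)
print(y)
print(y.shape)
# Concatenate along dim 1 (columns): (2, 2) + (2, 2) -> (2, 4)
y = torch.cat((x1, x2), dim=1)
print(y)
print(y.shape)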

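The tensor values are random, but the printed shapes are torch.Size([4, 2]) for dim=0 and torch.Size([2, 4]) for dim=1.

stack, in contrast, creates a new dimension.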

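x1 = torch.rand(2, 3)
x2 = torch.rand(2, 3)
print(x1)
print(x2)
# New dimension inserted at index 0: result shape (2, 2, 3)
y = torch.stack((x1, x2), dim=0)
print(y)
print(y.shape)
# New dimension inserted at index 1: result shape (2, 2, 3)
y = torch.stack((x1, x2), dim=1)
print(y)
print(y.shape)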

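The printed shape is torch.Size([2, 2, 3]) in both cases, but the elements are arranged differently.

When two tensors are stacked, a new dimension is inserted at position dim, and its size is 2 (the number of tensors being stacked).

For tensors of shape (2, 3), stacking at dim=0 takes each tensor as a whole (the content of the outermost "[]") and places the tensors one after another to form the new dimension 0.

Likewise, with dim=1 the content of the second-level "[]" (each row) is taken from every tensor, and matching rows are grouped together to form the new dimension 1.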
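The relationship between the two functions can be checked with fixed values instead of random ones. The sketch below (tensor names `a` and `b` are illustrative) shows that stacking at a given dim gives the same result as unsqueezing each tensor at that dim and then concatenating there.

```python
import torch

a = torch.tensor([[1, 2, 3], [4, 5, 6]])
b = torch.tensor([[7, 8, 9], [10, 11, 12]])

s0 = torch.stack((a, b), dim=0)  # s0[k] is the k-th input tensor
s1 = torch.stack((a, b), dim=1)  # s1[i, k] is row i of the k-th input tensor

# stack is equivalent to unsqueeze followed by cat on the same dim
c0 = torch.cat((a.unsqueeze(0), b.unsqueeze(0)), dim=0)
c1 = torch.cat((a.unsqueeze(1), b.unsqueeze(1)), dim=1)

print(s0.shape, s1.shape)   # both are torch.Size([2, 2, 3])
print(s0)
print(s1)
print(torch.equal(s0, c0), torch.equal(s1, c1))
```

With dim=0 the first block of the result is all of `a`; with dim=1 the first block is row 0 of `a` followed by row 0 of `b`.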

3. Swapping tensor dimensions: .permute()

img = torch.rand(2, 3, 3)
print(img)
print(img.shape)
# Swap dimensions 0 and 1: (2, 3, 3) -> (3, 2, 3)
img2 = img.permute(1, 0, 2)
print(img2)
print(img2.shape)

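permute(0, 2, 1) swaps dimensions 1 and 2, which transposes each 3×3 matrix in the tensor.

permute(2, 1, 0) swaps dimensions 0 and 2: (2, 3, 3) ---> (3, 3, 2). The permute(1, 0, 2) call used in the code swaps dimensions 0 and 1: (2, 3, 3) ---> (3, 2, 3).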
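The three permutations can be compared side by side. This is a small sketch using torch.arange so that the values are predictable; the variable names are illustrative. It also shows one practical consequence: permute returns a view with rearranged strides, so the result is generally not contiguous and needs .contiguous() (or .reshape()) before .view().

```python
import torch

img = torch.arange(18).reshape(2, 3, 3)

swapped_01 = img.permute(1, 0, 2)  # (2, 3, 3) -> (3, 2, 3)
swapped_12 = img.permute(0, 2, 1)  # (2, 3, 3) -> (2, 3, 3), each matrix transposed
swapped_02 = img.permute(2, 1, 0)  # (2, 3, 3) -> (3, 3, 2)

# Swapping exactly two dimensions is the same as transpose on those dimensions
same_as_transpose = torch.equal(swapped_12, img.transpose(1, 2))

# Element [i, j, k] of swapped_01 is element [j, i, k] of img
moved = swapped_01[2, 1, 0].item() == img[1, 2, 0].item()

# The permuted view shares storage with img and is not contiguous here
is_contig = swapped_01.is_contiguous()
flat = swapped_01.contiguous().view(-1)

print(swapped_01.shape, swapped_12.shape, swapped_02.shape)
print(same_as_transpose, moved, is_contig)
```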

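4. Indexing tensors

For example, computing the IoU of images in image segmentation relies on tensor indexing, and sometimes several index operations have to be chained.

First, create two tensors of shape (2, 1, 3, 3):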

x = torch.randn(2, 1, 3, 3)
y = torch.randn(2, 1, 3, 3)
# Binarize with a threshold of 0.5: values above become 1, the rest become 0
x[x > 0.5] = 1
x[x <= 0.5] = 0

y[y > 0.5] = 1
y[y <= 0.5] = 0
print(x)
print(y)

Both tensors now contain only 0s and 1s (the exact pattern depends on the random draw).

Now index into them:

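import numpy as np

# First convert the tensors to NumPy arrays
x_np = x.detach().cpu().numpy()
y_np = y.detach().cpu().numpy()

# Loop over the batch dimension (size 2 here)
for i in range(x_np.shape[0]):
    # Sum the elements of x_np[i] at the positions where y_np[i] equals 1
    s = np.sum(x_np[i][y_np[i] == 1])
# x_np[0] has shape (1, 3, 3)
print(x_np[0].shape)
# Assign to a specific element of x_np and display the result
x_np[0, 0, 0, 2] = 3
print(x_np[0])

# [] can be applied repeatedly: x_np[0] yields a (1, 3, 3) array,
# and the boolean array y_np[0] == 1 can then be used to index it further.

print(x_np[0][y_np[0] == 1])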

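The last print shows a one-dimensional array holding the elements of x_np[0] at the positions where y_np[0] equals 1.

Here is another example: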
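The same chained indexing also works on tensors directly, without converting to NumPy. The sketch below uses small hand-written masks (the names `pred` and `target` are illustrative) so that the selected elements can be verified by eye.

```python
import torch

pred = torch.tensor([[[[1., 0., 1.],
                       [0., 1., 0.],
                       [1., 1., 0.]]]])    # shape (1, 1, 3, 3)
target = torch.tensor([[[[1., 1., 0.],
                         [0., 1., 0.],
                         [0., 1., 1.]]]])  # shape (1, 1, 3, 3)

# First index: pick image 0, giving shape (1, 3, 3)
# Second index: a boolean mask of the same shape, giving a 1-D tensor
mask = target[0] == 1
selected = pred[0][mask]

# Summing the selected elements counts pixels that are 1 in both masks
intersection = selected.sum()

# For binary masks the elementwise product gives the same count
by_product = (pred[0] * target[0]).sum()

print(selected)
print(intersection.item(), by_product.item())
```

Boolean-mask indexing always flattens the selection to one dimension, in row-major order of the positions where the mask is True.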

# IoU computation for images: intersection / union
# input_val: (4, 1, 128, 128)
# gt_val: (4, 1, 128, 128)
# pred_np: (4, 1, 128, 128)
# The loop runs 4 times, once per image

for x in range(input_val.size()[0]):
    # pred_np[x]: (1, 128, 128)
    intersection = np.sum(pred_np[x][gt_val[x] == 1])
    IoU = intersection / float(np.sum(pred_np[x]) + np.sum(gt_val[x]) - intersection)
    dice = intersection * 2 / float(np.sum(pred_np[x]) + np.sum(gt_val[x]))
    IoUs.append(IoU)
    dices.append(dice)
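The fragment above depends on variables defined elsewhere in a validation loop. As a self-contained sketch of the same computation, the version below builds two tiny binary masks by hand. The helper name `iou_and_dice` and the `eps` term are assumptions added here, not part of the original code; `eps` only guards against division by zero when both masks are empty.

```python
import numpy as np

def iou_and_dice(pred, gt, eps=1e-7):
    """Return (IoU, Dice) for two binary arrays of the same shape.

    eps is an assumed safeguard for the case where both masks are empty.
    """
    intersection = np.sum(pred[gt == 1])   # predicted pixels inside the ground truth
    total = np.sum(pred) + np.sum(gt)      # |pred| + |gt|
    union = total - intersection           # |pred ∪ gt|
    iou = intersection / (union + eps)
    dice = 2 * intersection / (total + eps)
    return float(iou), float(dice)

# Two 4x4 images in (N, 1, H, W) layout
pred_np = np.zeros((2, 1, 4, 4))
gt_val = np.zeros((2, 1, 4, 4))
pred_np[0, 0, :2, :] = 1    # image 0: rows 0-1 predicted (8 pixels)
gt_val[0, 0, 1:3, :] = 1    # image 0: rows 1-2 are ground truth (8 pixels, 4 overlap)
# image 1: both masks are left empty

IoUs, dices = [], []
for i in range(pred_np.shape[0]):
    iou, dice = iou_and_dice(pred_np[i], gt_val[i])
    IoUs.append(iou)
    dices.append(dice)

print(IoUs)
print(dices)
```

For image 0 the intersection is 4 pixels and the union is 12, so IoU is about 1/3 and Dice is about 0.5. For the empty image 1 both scores come out as 0 instead of raising a division warning. Whether an empty pair should score 0 or 1 is a design choice that depends on the evaluation protocol.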