Contents
1. view (returns a view)
view returns a new Tensor that contains the same elements as the original and shares its memory; only the shape differs.
>>> x = torch.randn(4, 4)
>>> x.size()
torch.Size([4, 4])
>>> y = x.view(16)
>>> y.size()
torch.Size([16])
>>> z = x.view(-1, 8) # the size -1 is inferred from other dimensions
>>> z.size()
torch.Size([2, 8])
>>> a = torch.randn(1, 2, 3, 4)
>>> a.size()
torch.Size([1, 2, 3, 4])
>>> b = a.transpose(1, 2) # Swaps 2nd and 3rd dimension
>>> b.size()
torch.Size([1, 3, 2, 4])
>>> c = a.view(1, 3, 2, 4) # Does not change the memory layout: c shares storage with a, i.e. c is a view of a
>>> c.size()
torch.Size([1, 3, 2, 4])
>>> torch.equal(b, c)
False
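The memory-sharing claim can be verified directly: mutating the original tensor is visible through the view, and both report the same storage address via `data_ptr()`. A minimal sketch:

```python
import torch

x = torch.randn(4, 4)
y = x.view(16)

# A view shares the same underlying storage as the original tensor.
assert x.data_ptr() == y.data_ptr()

# Writing through x is visible through y (and vice versa).
x[0, 0] = 100.0
print(y[0].item())  # 100.0
```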
view requires the tensor's memory to be contiguous; calling view on a non-contiguous tensor raises an error, as shown below.
# 1. view directly
>>> x
tensor([[-1.4160, 0.7536, -1.0142, -0.9624],
[-0.8986, -0.8054, -0.2295, 1.7600],
[-0.8176, -1.9227, 2.0014, 0.9297],
[-1.0379, -0.0025, 0.4301, 1.7012]])
>>> x.size()
torch.Size([4, 4])
>>> x.view(16)
tensor([-1.4160, 0.7536, -1.0142, -0.9624, -0.8986, -0.8054, -0.2295, 1.7600,
-0.8176, -1.9227, 2.0014, 0.9297, -1.0379, -0.0025, 0.4301, 1.7012])
# 2. After transpose or permute, the memory is no longer contiguous.
>>> x.transpose(0, 1).view(16) # transpose is not in-place and does not modify x itself
Traceback (most recent call last):
File "<input>", line 1, in <module>
RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.
# 3. contiguous: if the tensor is non-contiguous, returns a contiguous copy; otherwise returns the tensor itself
>>> x.size()
torch.Size([4, 4])
>>> x.transpose(0, 1).contiguous().view(16) # returns a contiguous copy
tensor([-1.4160, -0.8986, -0.8176, -1.0379, 0.7536, -0.8054, -1.9227, -0.0025,
-1.0142, -0.2295, 2.0014, 0.4301, -0.9624, 1.7600, 0.9297, 1.7012])
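Before calling view, contiguity can be checked explicitly with `is_contiguous()`. A minimal sketch:

```python
import torch

x = torch.randn(4, 4)
print(x.is_contiguous())                  # True: a freshly allocated tensor is contiguous
print(x.transpose(0, 1).is_contiguous())  # False: transpose only permutes the strides
```

This makes the failure mode above predictable: view succeeds exactly when `is_contiguous()` (more precisely, when the requested shape is compatible with the tensor's strides) holds.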
2. contiguous (may copy)
contiguous returns a Tensor whose memory is contiguous. If the original Tensor is already contiguous, the original Tensor itself is returned; after a dimension-reordering op such as transpose or permute the tensor is usually no longer contiguous, so whenever you follow transpose/permute with a view to adjust the shape, you must call contiguous first.
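The "returns the original Tensor itself" behavior can be verified directly: on an already-contiguous tensor, contiguous() is a no-op that returns the same object, while a transposed tensor gets a fresh contiguous copy. A small sketch:

```python
import torch

x = torch.randn(4, 4)

# Already contiguous: contiguous() returns the very same tensor object.
assert x.contiguous() is x

# Non-contiguous input: contiguous() allocates new, contiguous storage.
t = x.transpose(0, 1)
c = t.contiguous()
assert not t.is_contiguous()
assert c.is_contiguous()
assert c.data_ptr() != x.data_ptr()  # a real copy, not a view of x's storage
```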
3. reshape (may copy)
Like view, reshape returns a Tensor with the same elements but a different shape.
The difference is, roughly:
reshape == contiguous + view
>>> x.size()
torch.Size([4, 4])
# 1. reshape
>>> x.transpose(0, 1).view(16)
Traceback (most recent call last):
File "<input>", line 1, in <module>
RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.
>>> x.transpose(0, 1).reshape(16)
tensor([-1.4160, -0.8986, -0.8176, -1.0379, 0.7536, -0.8054, -1.9227, -0.0025,
-1.0142, -0.2295, 2.0014, 0.4301, -0.9624, 1.7600, 0.9297, 1.7012])
# 2. contiguous + view
>>> x.transpose(0, 1).contiguous().view(16)
tensor([-1.4160, -0.8986, -0.8176, -1.0379, 0.7536, -0.8054, -1.9227, -0.0025,
-1.0142, -0.2295, 2.0014, 0.4301, -0.9624, 1.7600, 0.9297, 1.7012])
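Whether reshape actually copies depends on the input: on a contiguous tensor it behaves like view and shares storage, while on a non-contiguous tensor it must copy. That is why "may copy" matters if you intend to write through the result. A sketch:

```python
import torch

x = torch.randn(4, 4)

# Contiguous input: reshape returns a view (no copy, shared storage).
r1 = x.reshape(16)
assert r1.data_ptr() == x.data_ptr()

# Non-contiguous input: reshape must copy, so the storage differs.
r2 = x.transpose(0, 1).reshape(16)
assert r2.data_ptr() != x.data_ptr()
```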
4. permute and transpose
Both permute and transpose return views that reorder the dimensions and share memory with the original Tensor.
The difference is that permute can reorder any number of dimensions in a single call, so it is more efficient for multiple swaps; transpose exchanges exactly two dimensions, so for a single swap transpose is the simpler and slightly cheaper choice.
# 1. A new example tensor
>>> x = torch.randn(2, 3, 5)
>>> x.size()
torch.Size([2, 3, 5])
# 2. Swap the 2nd and 3rd dimensions
>>> x.permute(0, 2, 1).size()
torch.Size([2, 5, 3])
# 3. Swap the 2nd and 3rd dimensions
>>> x.transpose(1, 2).size()
torch.Size([2, 5, 3])
# 4. Note the order of 2 and 1: transpose(2, 1) gives the same result
>>> x.transpose(2, 1).size()
torch.Size([2, 5, 3])
Note: NumPy's transpose takes the full axis order in one call: label.transpose(0, 3, 1, 2) # [b, h, w, c] -> [b, c, h, w]
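In other words, NumPy's transpose corresponds to torch's permute, not to torch's two-axis transpose. A side-by-side sketch:

```python
import numpy as np
import torch

label = np.zeros((2, 8, 16, 3))           # [b, h, w, c]
print(label.transpose(0, 3, 1, 2).shape)  # (2, 3, 8, 16) -> [b, c, h, w]

t = torch.zeros(2, 8, 16, 3)
print(t.permute(0, 3, 1, 2).size())       # torch.Size([2, 3, 8, 16])
```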
5. Application: YOLO object detection
The network output has shape [b, c, h, w]. Decoding the output includes a reshaping step like:
[b, c, h, w] -> [b, num_anchor, (5+num_class), h, w] -> [b, num_anchor, h, w, (5+num_class)]
>>> output = torch.randn(64, 45, 8, 16)
>>> output.size()
torch.Size([64, 45, 8, 16]) # [b, c, h, w]
>>> output.view(64, 3, 15, 8, 16).permute(0, 1, 3, 4, 2).contiguous().size()
torch.Size([64, 3, 8, 16, 15]) # [b, num_anchor, h, w, 5+num_class]
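Putting the section together, the decode reshaping can be sketched end to end. The anchor/class split below (3 anchors, 5+10 channels each) mirrors the numbers in the transcript and is only an assumed example configuration:

```python
import torch

b, num_anchor, num_class, h, w = 64, 3, 10, 8, 16
output = torch.randn(b, num_anchor * (5 + num_class), h, w)   # [b, c, h, w]

# Split the channel axis into (anchor, box+objectness+classes),
# then move the per-anchor channels to the last dimension.
decoded = (output
           .view(b, num_anchor, 5 + num_class, h, w)
           .permute(0, 1, 3, 4, 2)   # [b, num_anchor, h, w, 5+num_class]
           .contiguous())            # make memory contiguous for later view calls

print(decoded.size())  # torch.Size([64, 3, 8, 16, 15])
```

The final contiguous() matters because downstream decoding code often calls view again on the result, which would otherwise fail on the permuted (non-contiguous) tensor.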