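unfold manually performs the sliding-window step of a convolution: it extracts the contents of every window but does not multiply them by a kernel or sum them. In other words, it does the "sliding" half of a convolution without the "product" half.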
ret = F.unfold(inp, size)
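- inp: the input Tensor, which must be 4-D with shape (B, C, H, W)
- size: an int or a tuple giving the sliding window size (this is the kernel_size argument of F.unfold)
- ret: the 3-D output Tensor with shape (a, b, c), where a is the batch size of the input, b is the number of values in one window (C × window height × window width), and c is the number of positions the window slides over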
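As a quick sanity check on the output shape, the sketch below predicts (a, b, c) by hand and compares it with what F.unfold returns. The helper name expected_unfold_shape is hypothetical (it is not part of PyTorch), and the calculation assumes the default stride=1, padding=0 and dilation=1.

```python
import torch
from torch.nn import functional as F


def expected_unfold_shape(inp_shape, size):
    """Hypothetical helper: predicted shape of F.unfold(inp, size),
    assuming stride=1, padding=0, dilation=1."""
    B, C, H, W = inp_shape
    kh, kw = size
    b = C * kh * kw                    # values per window, across all channels
    c = (H - kh + 1) * (W - kw + 1)    # number of window positions
    return (B, b, c)


inp = torch.randn(2, 3, 5, 5)          # B=2, C=3, H=W=5
ret = F.unfold(inp, (2, 2))

print(tuple(ret.shape))                          # (2, 12, 16)
print(expected_unfold_shape(inp.shape, (2, 2)))  # (2, 12, 16)
```

With more than one channel, b grows with C: here it is 3 × 2 × 2 = 12, not just the 4 values of a single 2×2 window.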
For example, take the following 1×1×5×5 feature map:
[[[[ 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10],
[ 11, 12, 13, 14, 15],
[ 16, 17, 18, 19, 20],
[ 21, 22, 23, 24, 25]]]]
Sliding a 2×2 window over it proceeds like this:
1 2      2 3      3 4
6 7  ->  7 8  ->  8 9  -> ...
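The same sliding process can be written out with two explicit loops. This is only an illustrative sketch of what the window does (stride 1, no padding), not how PyTorch implements it; the variable names are made up for the example.

```python
import torch

# The 5×5 feature map from the example, without the batch and channel dims
x = torch.arange(1, 26, dtype=torch.float32).reshape(5, 5)
kh, kw = 2, 2

windows = []
for i in range(x.shape[0] - kh + 1):        # top row of the window
    for j in range(x.shape[1] - kw + 1):    # left column of the window
        windows.append(x[i:i + kh, j:j + kw])

print(len(windows))   # 16 window positions
print(windows[0])     # the first window: rows [1, 2] and [6, 7]
print(windows[1])     # one step to the right: rows [2, 3] and [7, 8]
```

The window moves left to right first and then drops down one row, so the fifth window starts at 6, 7.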
The code is as follows:
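import torch
from torch.nn import functional as F

# 1×1×5×5 feature map, stored as float32 (the same dtype torch.Tensor(...) produces)
x = torch.tensor([[[[ 1,  2,  3,  4,  5],
                    [ 6,  7,  8,  9, 10],
                    [11, 12, 13, 14, 15],
                    [16, 17, 18, 19, 20],
                    [21, 22, 23, 24, 25]]]], dtype=torch.float32)

# Extract every 2×2 window (stride 1 and no padding by default)
x = F.unfold(x, (2, 2))
print(x)         # one column per window position
print(x.size())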
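The output is as follows (the values are float32, so PyTorch prints them with a trailing decimal point, e.g. 1.; the points are omitted here):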
tensor([[[ 1, 2, 3, 4, 6, 7, 8, 9, 11, 12, 13, 14, 16, 17, 18, 19],
[ 2, 3, 4, 5, 7, 8, 9, 10, 12, 13, 14, 15, 17, 18, 19, 20],
[ 6, 7, 8, 9, 11, 12, 13, 14, 16, 17, 18, 19, 21, 22, 23, 24],
[ 7, 8, 9, 10, 12, 13, 14, 15, 17, 18, 19, 20, 22, 23, 24, 25]]])
torch.Size([1, 4, 16])
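There are 4 rows, matching the 4 values in a 2×2 window (the input has a single channel). Each column holds the contents of one window position, in the order the window slides. The window takes 16 positions in total (4 across × 4 down), so there are 16 columns.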
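To connect this back to convolution, the sketch below first checks that column 0 really is the first window, then supplies the missing "product" step with a matrix multiplication and compares the result with F.conv2d. The random weight tensor and the choice of 3 output channels are arbitrary assumptions made for the example.

```python
import torch
from torch.nn import functional as F

x = torch.arange(1, 26, dtype=torch.float32).reshape(1, 1, 5, 5)
cols = F.unfold(x, (2, 2))                      # shape (1, 4, 16)

# Column 0 is the first window, flattened row by row: 1, 2, 6, 7
first_window = x[0, 0, 0:2, 0:2].reshape(-1)
print(torch.equal(cols[0, :, 0], first_window))

# Add the "product": flattened kernel weights times the unfolded columns
weight = torch.randn(3, 1, 2, 2)                # 3 output channels, arbitrary values
out = weight.reshape(3, -1) @ cols              # (3, 4) @ (1, 4, 16) -> (1, 3, 16)
out = out.reshape(1, 3, 4, 4)                   # the 16 positions form a 4×4 grid

print(torch.allclose(out, F.conv2d(x, weight), atol=1e-5))
```

Both checks should print True: unfold followed by a matrix multiplication gives the same values as a bias-free conv2d with the same weights.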