grid_sample and affine_grid in PyTorch


weixin_30745641


Original link: http://www.cnblogs.com/zi-wang/p/9950917.html


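PyTorch provides a way to crop a Tensor that can run on the GPU, using torch.nn.functional.affine_grid and torch.nn.functional.grid_sample. The former generates a 2D sampling grid; the latter samples the input Tensor at the grid locations, using bilinear interpolation by default.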

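grid_sample normalizes image coordinates to \([-1, 1]\). In the convention used here, pixel index 0 maps to -1 and width-1 maps to 1, which corresponds to align_corners=True. Since PyTorch 1.3 the default is align_corners=False, under which -1 and 1 refer to the outer edges of the corner pixels rather than their centers.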
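The mapping between pixel indices and normalized coordinates can be sketched as follows. This is a minimal illustration of the align_corners=True convention only; the helper names pixel_to_norm and norm_to_pixel are hypothetical and not part of PyTorch.

```python
def pixel_to_norm(p, size):
    """Map a pixel coordinate p in [0, size-1] to [-1, 1] (align_corners=True convention)."""
    return 2.0 * p / (size - 1) - 1.0


def norm_to_pixel(n, size):
    """Inverse mapping: normalized coordinate n in [-1, 1] back to a pixel coordinate."""
    return (n + 1.0) * (size - 1) / 2.0


# For a width of 5: pixel 0 -> -1, pixel 4 -> 1, pixel 2.5 -> 0.25
print(pixel_to_norm(0, 5), pixel_to_norm(4, 5), pixel_to_norm(2.5, 5))
```

Coordinates outside \([-1, 1]\) are allowed; they simply refer to locations outside the image.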

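affine_grid takes a batch of affine matrices (N x 2 x 3) and the size of the output Tensor (torch.Size((N, C, H, W))), and returns a normalized 2D grid of shape N x H x W x 2.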
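To make the shapes concrete, here is a small sketch that passes an identity transform to affine_grid. The choice of an identity theta and of align_corners=True (which requires PyTorch 1.3 or later) are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

# Identity affine transform, shape (N, 2, 3) with N = 1 (illustrative choice)
theta = torch.tensor([[[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]]])

# Requested output size is (N, C, H, W), not (N, H, W, 2)
size = torch.Size((1, 1, 3, 2))

grid = F.affine_grid(theta, size, align_corners=True)
print(grid.shape)  # torch.Size([1, 3, 2, 2]), i.e. (N, H, W, 2)

# The last dimension holds (x, y): the top-left output pixel maps to (-1, -1)
# and the bottom-right one to (1, 1)
print(grid[0, 0, 0], grid[0, -1, -1])
```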

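In Faster R-CNN implementations that use crop pooling, the part of the feature map corresponding to each proposal region has to be cropped out, and these two functions can do that. For details, see http://www.telesens.co/2018/03/11/object-detection-and-classification-using-r-cnns/#ITEM-1455-4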

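A simple experiment follows:

First, create a 1x1x5x5 Tensor.
The crop window is x1 = 2.5, x2 = 4.5, y1 = 0.5, y2 = 3.5, and the output size is 1x1x3x2; set the theta matrix according to these coordinates. Note that x2 = 4.5 lies past the last column (index 4), so samples there are blended with the default zero padding.
Perform the crop and compare the result with a NumPy computation.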
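The theta used in this experiment can be derived from the crop window as sketched below. The helper crop_theta is hypothetical (not a PyTorch function) and assumes the align_corners=True convention, an axis-aligned box, and no rotation.

```python
def crop_theta(x1, y1, x2, y2, width, height):
    """Build a 2x3 affine matrix that maps output coords in [-1, 1] onto a crop box.

    Assumes pixel 0 -> -1 and pixel (size - 1) -> 1 (align_corners=True).
    """
    # Normalize the box corners to [-1, 1]
    nx1 = 2.0 * x1 / (width - 1) - 1.0
    nx2 = 2.0 * x2 / (width - 1) - 1.0
    ny1 = 2.0 * y1 / (height - 1) - 1.0
    ny2 = 2.0 * y2 / (height - 1) - 1.0
    # Scale is half the normalized box extent; translation is the box center
    return [[(nx2 - nx1) / 2.0, 0.0, (nx1 + nx2) / 2.0],
            [0.0, (ny2 - ny1) / 2.0, (ny1 + ny2) / 2.0]]


theta = crop_theta(2.5, 0.5, 4.5, 3.5, width=5, height=5)
print(theta)  # [[0.5, 0.0, 0.75], [0.0, 0.75, 0.0]]
```

The result matches the theta_np values hard-coded in the experiment.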

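import numpy as np
import torch
import torch.nn.functional as F

# Input: a random 1x1x5x5 tensor in (N, C, H, W) layout
a = torch.rand((1, 1, 5, 5))
print(a)

# Crop window: x1 = 2.5, x2 = 4.5, y1 = 0.5, y2 = 3.5
# Output size: out_w = 2, out_h = 3
size = torch.Size((1, 1, 3, 2))
print(size)

# theta maps output coordinates in [-1, 1] to normalized input coordinates;
# float32 keeps the grid dtype consistent with the input tensor
theta_np = np.array([[0.5, 0, 0.75], [0, 0.75, 0]], dtype=np.float32).reshape(1, 2, 3)
theta = torch.from_numpy(theta_np)
print('theta:')
print(theta)
print()

# align_corners=True matches the normalization described above (0 -> -1, width-1 -> 1).
# The argument requires PyTorch >= 1.3, where the default became False.
flowfield = F.affine_grid(theta, size, align_corners=True)
sampled_a = F.grid_sample(a, flowfield, align_corners=True)
sampled_a = sampled_a.numpy().squeeze()
print('sampled_a:')
print(sampled_a)

# Bilinear interpolation at (y, x) = (0.5, 2.5), using pixels (0, 2), (0, 3), (1, 2), (1, 3)
# Quick computation: https://blog.csdn.net/lxlclzy1130/article/details/50922867
print()
coeff = np.array([[0.5, 0.5]])
A = a[0, 0, 0:2, 2:4].numpy()
print('torch sampled at (0.5, 2.5): %.4f' % sampled_a[0, 0])
print('numpy compute: %.4f' % np.dot(np.dot(coeff, A), coeff.T).item())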

The output is: