PyTorch: crop images differentiably


trium_KW


Category: Algorithms. Tags: pytorch, affine transformation

Article link: https://blog.csdn.net/trium_KW/article/details/106294346


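This article shows how to crop images in PyTorch in a fully differentiable way. It first explains the PyTorch image coordinate system, a left-handed Cartesian system with coordinates normalized to [-1, 1]. It then covers the affine transformation theory used to map from the cropped image's coordinate system to the original image's. Finally, it provides code that parameterizes the affine transformation as a matrix Θ, along with a function that finds original-image coordinates from cropped-image coordinates.

(Summary generated automatically by CSDN.)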

Intro

PyTorch offers several ways to crop images. For example, torchvision.transforms provides functions for cropping PIL images, and an answer on the PyTorch Forum shows how to crop an image in a way that is differentiable with respect to the image. Sometimes, however, we need the cropping operation itself to be fully differentiable, that is, differentiable with respect to the crop parameters as well. How can we implement that?

Theory: Affine transformation

Before answering, we first need to understand the image coordinate system in PyTorch. It is a left-handed Cartesian system with its origin at the center of the image. Coordinates are normalized to the range $[-1, 1]$, where $(-1, -1)$ is the top-left corner and $(1, 1)$ is the bottom-right corner, as pointed out by the doc.
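To make the normalized coordinates concrete, here is a minimal sketch in plain Python that maps a pixel index to $[-1, 1]$ along one axis. The helper name `pixel_to_normalized` is hypothetical (it is not a PyTorch function), and the `align_corners` flag is an assumption borrowed from the option of the same name in `torch.nn.functional.grid_sample` and `affine_grid`: with `align_corners=True`, $-1$ and $1$ refer to the centers of the first and last pixels; with `False`, they refer to the outer edges of those pixels.

```python
def pixel_to_normalized(index: int, size: int, align_corners: bool = True) -> float:
    """Map a pixel index along one axis to the normalized range [-1, 1].

    Hypothetical helper for illustration only; assumes size > 1 when
    align_corners is True.
    """
    if align_corners:
        # -1 and 1 refer to the centers of the first and last pixels.
        return 2.0 * index / (size - 1) - 1.0
    # -1 and 1 refer to the outer edges of the first and last pixels,
    # so pixel centers sit half a pixel inside the range.
    return (2.0 * index + 1.0) / size - 1.0


# A 5-pixel-wide, 3-pixel-high image: x runs left to right, y runs top to bottom.
width, height = 5, 3
top_left = (pixel_to_normalized(0, width), pixel_to_normalized(0, height))
bottom_right = (
    pixel_to_normalized(width - 1, width),
    pixel_to_normalized(height - 1, height),
)
center = (pixel_to_normalized(2, width), pixel_to_normalized(1, height))

print(top_left, bottom_right, center)
```

Under the `align_corners=True` assumption, the top-left pixel lands at $(-1, -1)$, the bottom-right pixel at $(1, 1)$, and the middle pixel at the origin, which matches the convention described above.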

Let $(x, y)$ be the top-left corner of the cropped image, expressed in the coordinate system of the original image; likewise, let $(x', y')$ be its bottom-right corner. Clearly, $(x, y)$ corresponds to $(-1, -1)$ in the cropped image's coordinate system, and $(x', y')$ corresponds to $(1, 1)$. We want a function $f$ that maps every point in the cropped image from the cropped image's coordinate system to the original image's. Since only scaling and translation are involved, $f$ can be parameterized by an affine transformation matrix $\Theta$ such that