Data preprocessing: a summary of torchvision.transforms

T-SW

Last modified: 2023-03-28 10:55:17

Tags: deep learning, pytorch

First published: 2023-03-27 20:37:01

Article link: https://blog.csdn.net/weixin_50829873/article/details/129792160

Link to the official torchvision 0.13 documentation

Composition

import torch
import torchvision.transforms as transforms

transforms.Compose([transform1, transform2, ...])      # takes a list and applies the transforms in order
transforms.RandomApply(torch.nn.ModuleList([transform1, transform2, ...]), p=0.3)      # applies the whole list with probability p

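Note: transforms expect input of shape […, H, W]; the example images come from the official documentation.
For example (here T is torchvision.transforms, and orig_img and plot are the sample image and plotting helper from the official gallery):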

center_crops = [T.CenterCrop(size=size)(orig_img) for size in (30, 50, 100, orig_img.size)]
plot(center_crops)

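1.Resize (resizing)

transforms.Resize(size[, interpolation, max_size, ...])

Parameter	Description
size (sequence or int)	`Output size (h, w). If size is an int, the shorter edge of the image is matched to it. For example, if height (H) > width (W), the image is rescaled to (size*H/W, size).`
interpolation(InterpolationMode)	`Defaults to BILINEAR (bilinear interpolation). For tensor input, only NEAREST, NEAREST_EXACT, BILINEAR and BICUBIC are supported.`
max_size(int, optional)	`Maximum allowed length of the longer edge of the resized image. If the longer edge exceeds max_size after resizing by size, the image is resized again so that the longer edge equals max_size. It must be greater than size and is only supported when size is an int.`
antialias(bool, optional)	`Whether to apply antialiasing; it only has an effect in bilinear or bicubic mode.`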

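2.RandomCrop (random cropping)

transforms.RandomCrop(size[, padding, pad_if_needed, ...])

Parameter	Description
size(sequence or int)	`Output size. If size is an int, a square crop (size, size) is made; if size is a sequence of length 1, it is interpreted as (size[0], size[0]).`
padding(int or sequence, optional)	`Padding added to the image borders. A single int pads all four borders; a sequence of length 2 is taken as the left/right and top/bottom padding; a sequence of length 4 is taken as the left, top, right and bottom padding. Note: in torchscript mode a single int is not supported, so pass a sequence of length 1: [padding].`
pad_if_needed(boolean)	`Pad the image if it is smaller than the requested size.`
fill(number or tuple)	`Constant pixel value used for padding, default 0. A tuple of length 3 fills the R, G and B channels respectively.`
padding_mode(str)	`Padding type: constant, edge, reflect or symmetric; default constant.`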

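3.RandomResizedCrop (random crop + resize)

transforms.RandomResizedCrop(size[, scale, ratio, ...])

Parameter	Description
size(int or sequence)	`Output size. If size is an int, a square output (size, size) is made; if size is a sequence of length 1, it is interpreted as (size[0], size[0]).`
scale(tuple of python:float)	`Lower and upper bounds of the random crop area, as a fraction of the original image area.`
ratio(tuple of python:float)	`Lower and upper bounds of the random aspect ratio of the crop.`
interpolation(InterpolationMode)	`Defaults to BILINEAR (bilinear interpolation). For tensor input, only NEAREST, NEAREST_EXACT, BILINEAR and BICUBIC are supported.`
antialias(bool, optional)	`Whether to apply antialiasing; it only has an effect in bilinear or bicubic mode.`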

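4.CenterCrop (center cropping)

If the image is smaller than the requested output size, it is padded with 0.

transforms.CenterCrop(size)

Parameter	Description
size(int or sequence)	`Output size. If size is an int, a square crop (size, size) is made; if size is a sequence of length 1, it is interpreted as (size[0], size[0]).`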

5.FiveCrop (four corners + center crop)

transforms.FiveCrop(size)

Parameter	Description
size(int or sequence)	`Output size. If size is an int, square crops (size, size) are made; if size is a sequence of length 1, it is interpreted as (size[0], size[0]).`

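6.TenCrop (FiveCrop + flip)

transforms.TenCrop(size, vertical_flip=False)

Parameter	Description
size(int or sequence)	`Output size. If size is an int, square crops (size, size) are made; if size is a sequence of length 1, it is interpreted as (size[0], size[0]).`
vertical_flip(bool)	`Use vertical flipping instead of horizontal flipping. Default False (horizontal flip).`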

7.Pad (padding)

transforms.Pad(padding, fill=0, padding_mode='constant')

Parameter	Description
padding(int or sequence)	`Same as above (2.RandomCrop)`
fill(number or tuple)	`Same as above (2.RandomCrop)`
padding_mode(str)	`Same as above (2.RandomCrop)`

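8.RandomRotation (random rotation by angle)

transforms.RandomRotation(degrees, interpolation=InterpolationMode.NEAREST, expand=False, center=None, fill=0)

Parameter	Description
degrees(sequence or number)	`Range of angles (min, max). A single number is interpreted as (-degrees, +degrees).`
interpolation(InterpolationMode)	`Interpolation mode, default NEAREST. For tensor input, only NEAREST and BILINEAR are supported.`
expand(bool, optional)	`If True, the output is enlarged so that it holds the whole rotated image; otherwise the output has the same size as the input.`
center(sequence, optional)	`Center of rotation (x, y), with the origin at the top-left corner. Defaults to the image center.`
fill(sequence or number)	`Constant pixel value used for the area outside the rotated image, default 0. A tuple of length 3 fills the R, G and B channels respectively.`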

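9.RandomAffine (random affine transform [center kept fixed])

Affine transformation: a 2D linear transformation followed by a translation. It can be composed from five basic transformations: rotation, translation, scaling, shear and flip.

transforms.RandomAffine(degrees, translate=None, scale=None, shear=None, interpolation=InterpolationMode.NEAREST, fill=0, center=None)

Parameter	Description
degrees(sequence or number)	`Range of angles (min, max). A single number is interpreted as (-degrees, +degrees).`
translate(tuple, optional)	`Translation range (a, b). The horizontal shift is sampled from -img_width * a < dx < img_width * a and the vertical shift from -img_height * b < dy < img_height * b. No translation by default.`
scale(tuple, optional)	`Scale factor interval (a, b); the scale is sampled from a <= scale <= b. The original scale is kept by default.`
shear(sequence or number, optional)	`Shear range. A single number gives an x-axis shear in the range (-shear, +shear); a sequence of length 2 gives an x-axis shear in (shear[0], shear[1]); a sequence of length 4 gives an x-axis shear in (shear[0], shear[1]) and a y-axis shear in (shear[2], shear[3]). No shear by default.`
interpolation(InterpolationMode)	`Interpolation mode, default NEAREST. For tensor input, only NEAREST and BILINEAR are supported.`
fill(sequence or number)	`Constant pixel value used for the area outside the transformed image, default 0. A tuple of length 3 fills the R, G and B channels respectively.`
center(sequence, optional)	`Center of rotation (x, y), with the origin at the top-left corner. Defaults to the image center.`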

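10.RandomPerspective (random perspective)

transforms.RandomPerspective(distortion_scale=0.5, p=0.5, interpolation=InterpolationMode.BILINEAR, fill=0)

Parameter	Description
distortion_scale(float)	`Degree of distortion, from 0 to 1; default 0.5.`
p(float)	`Probability that the image is transformed; default 0.5.`
interpolation(InterpolationMode)	`Interpolation mode, default BILINEAR. For tensor input, only NEAREST and BILINEAR are supported.`
fill(sequence or number)	`Constant pixel value used for the area outside the transformed image, default 0. A tuple of length 3 fills the R, G and B channels respectively.`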

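11.ElasticTransform (elastic transform)

transforms.ElasticTransform(alpha=50.0, sigma=5.0, interpolation=InterpolationMode.BILINEAR, fill=0)

Parameter	Description
alpha(float or sequence of python:floats)	`Magnitude of the displacements; default 50.0.`
sigma(float or sequence of python:floats)	`Smoothness of the displacements; default 5.0.`
interpolation(InterpolationMode)	`Interpolation mode, default BILINEAR. For tensor input, only NEAREST and BILINEAR are supported.`
fill(sequence or number)	`Constant pixel value used for the area outside the transformed image, default 0. A tuple of length 3 fills the R, G and B channels respectively.`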

12.RandomHorizontalFlip (random horizontal flip)

transforms.RandomHorizontalFlip(p=0.5)

Parameter	Description
p(float)	`Probability of flipping; default 0.5.`

13.RandomVerticalFlip (random vertical flip)

transforms.RandomVerticalFlip(p=0.5)

Parameter	Description
p(float)	`Probability of flipping; default 0.5.`

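14.RandomAdjustSharpness (random sharpness adjustment)

transforms.RandomAdjustSharpness(sharpness_factor, p=0.5)

Parameter	Description
sharpness_factor(float)	`Sharpness, any non-negative number. 0 gives a blurred image, 1 gives the original image, and 2 increases the sharpness by a factor of 2.`
p(float)	`Probability that the sharpness is adjusted; default 0.5.`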

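15.RandomErasing (random erasing)

transforms.RandomErasing(p=0.5, scale=(0.02, 0.33), ratio=(0.3, 3.3), value=0, inplace=False)

Parameter	Description
p(float)	`Probability of erasing; default 0.5.`
scale(tuple, optional)	`Range (a, b) of the erased area as a fraction of the input image area.`
ratio(tuple of python:float)	`Lower and upper bounds of the random aspect ratio of the erased region.`
value	`Default 0. A single int is used to erase all pixels. A tuple of length 3 is used to erase the R, G and B channels respectively. The string 'random' erases each pixel with a random value.`
inplace	`Whether to perform the erasing in place. Default False.`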

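16.Grayscale/RandomGrayscale (convert to grayscale / randomly convert)

transforms.Grayscale(num_output_channels=1)
transforms.RandomGrayscale(p=0.1)

Parameter	Description
num_output_channels(int)	`Number of channels of the output image, 1 or 3.`
Parameter	Description
p(float)	`Probability of converting to grayscale; default 0.1.`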

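17.GaussianBlur (random Gaussian blur)

transforms.GaussianBlur(kernel_size, sigma=(0.1, 2.0))

Parameter	Description
kernel_size(int or sequence)	`Size of the Gaussian kernel.`
sigma(float or tuple of python:float (min, max))	`Standard deviation of the kernel. A single float gives a fixed sigma; a tuple (min, max) means sigma is sampled uniformly from that range on each call.`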