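torchvision.transforms is a module for preprocessing data. It supports the following operations:
- Normalization
- Conversion between PIL.Image / numpy.ndarray and Tensor
- Cropping, resizing, and similar operations on PIL.Image
When using torchvision.transforms, we usually combine individual transforms with transforms.Compose.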
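To make the idea of chaining concrete, here is a minimal sketch of the behaviour that Compose provides: each callable receives the output of the previous one. The names `SimpleCompose`, `scale_to_unit`, and `center` are hypothetical and used only for illustration; this is not torchvision's implementation.

```python
class SimpleCompose:
    """Hypothetical stand-in that applies callables in order, like transforms.Compose."""

    def __init__(self, steps):
        self.steps = list(steps)

    def __call__(self, value):
        # Feed the output of each step into the next one.
        for step in self.steps:
            value = step(value)
        return value


def scale_to_unit(pixel):
    # Map an 8-bit pixel value from [0, 255] to [0.0, 1.0].
    return pixel / 255.0


def center(value):
    # Map [0.0, 1.0] to [-1.0, 1.0] using mean 0.5 and std 0.5.
    return (value - 0.5) / 0.5


pipeline = SimpleCompose([scale_to_unit, center])
print(pipeline(0), pipeline(255))  # -1.0 1.0
```

The order of the steps matters: swapping `scale_to_unit` and `center` would give different results, just as the order of transforms inside transforms.Compose does.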
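Conversion between PIL.Image/numpy.ndarray and Tensor
Converting a PIL.Image or numpy.ndarray to a Tensor is typically done when loading data for training, while converting a Tensor back to a PIL.Image or numpy.ndarray is typically done when outputting results during validation.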
import numpy as np
import cv2
import torch
from torchvision import transforms
img_path = "" # here is your img path
transform1 = transforms.Compose([
transforms.ToTensor(), # convert range [0, 255] to range [0, 1]
])
# ndarray -> tensor
img = cv2.imread(img_path) # note: OpenCV returns channels in B, G, R order
print(type(img)) # <class 'numpy.ndarray'>
print(img.shape) # (300, 300, 3) H, W, C
img1 = transform1(img)
print(type(img1)) # <class 'torch.Tensor'>
print(img1.shape) # torch.Size([3, 300, 300]) C, H, W
print("img max value :",np.max(img), " img1 max value:", torch.max(img1)) # img max value : 255 img1 max value: tensor(1.)
# PIL.Image -> tensor
from PIL import Image
img3 = Image.open(img_path)
print(type(img3)) #<class 'PIL.Image.Image'>
img4 = transform1(img3)
print(type(img4)) # <class 'torch.Tensor'>
print(img4.shape) # torch.Size([3, 300, 300]) C, H, W
img3.show() # present the img
print("img3 max value :", np.max(np.array(img3)), " img4 max value:", torch.max(img4)) # img3 max value : 255 img4 max value: tensor(1.)
# tensor -> PIL.Image
transform2 = transforms.Compose([transforms.ToPILImage()])
img5 = transform2(img4)
img5.show()
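The arithmetic behind the ndarray-to-tensor conversion shown above can be reproduced with NumPy alone. The following is a rough sketch under the assumption of an 8-bit image in H, W, C layout; it uses a small synthetic array instead of a file and is not torchvision's implementation.

```python
import numpy as np

# Synthetic 4 x 6 image in H, W, C layout with 8-bit values
# (stands in for the array returned by cv2.imread).
hwc = np.zeros((4, 6, 3), dtype=np.uint8)
hwc[0, 0] = [255, 128, 0]

# Reorder the axes to C, H, W and rescale [0, 255] -> [0.0, 1.0].
chw = hwc.transpose(2, 0, 1).astype(np.float32) / 255.0

print(chw.shape)  # (3, 4, 6)
print(chw.max())  # 1.0

# Going back: rescale to [0, 255], round, and restore the H, W, C layout.
restored = np.round(chw * 255.0).astype(np.uint8).transpose(1, 2, 0)
print(np.array_equal(restored, hwc))  # True
```

Note that cv2.imread returns channels in B, G, R order while PIL.Image uses R, G, B; reversing the last axis (`hwc[..., ::-1]`) converts between the two.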
Normalization
Normalization is very important for training neural networks. So how do we normalize to [-1.0, 1.0]? Simply change transform1 above to the following:
transform1 = transforms.Compose([
transforms.ToTensor(),
transforms.Normalize(mean = [0.5, 0.5, 0.5], std = [0.5, 0.5, 0.5])
]
)
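transforms.Normalize normalizes each channel with the following formula:
channel = (channel - mean) / std
Because ToTensor has already scaled the values to [0, 1], using mean = 0.5 and std = 0.5 maps every value into [-1, 1]: (0 - 0.5) / 0.5 = -1 and (1 - 0.5) / 0.5 = 1.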
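As a quick check of the formula, the sketch below applies it to a small synthetic C, H, W array with NumPy, then inverts it. It assumes the input is already scaled to [0, 1]; the variable names are illustrative only.

```python
import numpy as np

# A C, H, W array already scaled to [0, 1], as ToTensor would produce.
x = np.array([[[0.0, 0.5, 1.0]]] * 3, dtype=np.float32)  # shape (3, 1, 3)

# Reshape per-channel statistics to (C, 1, 1) so they broadcast over H and W.
mean = np.array([0.5, 0.5, 0.5], dtype=np.float32).reshape(3, 1, 1)
std = np.array([0.5, 0.5, 0.5], dtype=np.float32).reshape(3, 1, 1)

normalized = (x - mean) / std
print(normalized.min(), normalized.max())  # -1.0 1.0

# Inverting the formula recovers the [0, 1] values, which is useful
# before converting a normalized result back to an image.
recovered = normalized * std + mean
print(np.allclose(recovered, x))  # True
```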
# follow the code above
transform = transforms.Compose([
transforms.ToTensor(),
]
)
transform1 = transforms.Compose([
transforms.ToTensor(),
transforms.Normalize(mean = [0.5, 0.5, 0.5], std = [0.5, 0.5, 0.5])
]
)
img1 = transform(img)
img2 = transform1(img)
print("img1 minimum value is: ", torch.min(img1), " img2 minimum value is: ", torch.min(img2))
# output:
# img1 minimum value is: tensor(0.) img2 minimum value is: tensor(-1.)
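Scaling, cropping, and other operations on PIL.Image
Cropping can be done with transforms.RandomCrop.
The documentation describes this function, in part, as follows: Crop the given PIL Image at a random location.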
# follow the code above
transform3 = transforms.Compose([
transforms.ToTensor(), # PIL.Image -> Tensor
transforms.ToPILImage(), # Tensor -> PIL.Image
transforms.RandomCrop((100,100)), # crop a random 100 x 100 region
]
)
img6 = transform3(img3)
print(img6.size) # (100, 100), i.e. (width, height)
img3.show()
img6.show()
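What "crop at a random location" means can be sketched in NumPy as follows. The helper `random_crop` is a hypothetical function written for illustration, operating on an H, W, C array; it is not how torchvision implements RandomCrop.

```python
import numpy as np


def random_crop(image, crop_h, crop_w, rng):
    """Hypothetical helper: crop an H, W, C array at a random location."""
    height, width = image.shape[:2]
    if crop_h > height or crop_w > width:
        raise ValueError("crop size must not exceed the image size")
    # integers() excludes the upper bound, so +1 lets the crop touch the border.
    top = int(rng.integers(0, height - crop_h + 1))
    left = int(rng.integers(0, width - crop_w + 1))
    return image[top:top + crop_h, left:left + crop_w]


rng = np.random.default_rng(0)
image = np.zeros((300, 300, 3), dtype=np.uint8)
patch = random_crop(image, 100, 100, rng)
print(patch.shape)  # (100, 100, 3)
```

Passing a seeded generator makes the crop reproducible, which helps when debugging a data pipeline.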