Transforms
Import transforms:
from torchvision import transforms
ToTensor():
First, create an instance:
tensor_trans = transforms.ToTensor()
Then use that instance to convert an ndarray or PIL Image into a Tensor:
tensor_img = tensor_trans(img)
The docstring of ToTensor:
"""Convert a PIL Image or ndarray to tensor and scale the values accordingly. This transform does not support torchscript. Converts a PIL Image or numpy.ndarray (H x W x C) in the range [0, 255] to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0] if the PIL Image belongs to one of the modes (L, LA, P, I, F, RGB, YCbCr, RGBA, CMYK, 1) or if the numpy.ndarray has dtype = np.uint8 In the other cases, tensors are returned without scaling. .. note:: Because the input image is scaled to [0.0, 1.0], this transformation should not be used when transforming target image masks. See the `references`_ for implementing the transforms for image masks. .. _references: https://github.com/pytorch/vision/tree/main/references/segmentation """
Its purpose is to convert a PIL Image or numpy image into a Tensor; note that the shape changes from (H x W x C) to (C x H x W) and the values are scaled into [0.0, 1.0].
from PIL import Image
from torchvision import transforms

img_path = "Hello!Pillow.jpg"
img = Image.open(img_path)          # PIL Image

tensor_trans = transforms.ToTensor()
tensor_img = tensor_trans(img)      # FloatTensor, shape (C, H, W), values in [0.0, 1.0]
print(type(tensor_img))             # <class 'torch.Tensor'>
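The snippets in the following sections reuse the tensor image under the name img_tensor and write their results to TensorBoard through a SummaryWriter named writer. A minimal setup sketch, assuming a log directory called "logs" (the directory name is only an illustration):

from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter("logs")   # TensorBoard event files go here (assumed path)
img_tensor = tensor_img          # the tensor produced by ToTensor() above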
Normalize():
Normalizes a tensor image with a given mean and standard deviation; the docstring is as follows:
"""Normalize a tensor image with mean and standard deviation. This transform does not support PIL Image. Given mean: ``(mean[1],...,mean[n])`` and std: ``(std[1],..,std[n])`` for ``n`` channels, this transform will normalize each channel of the input ``torch.*Tensor`` i.e., ``output[channel] = (input[channel] - mean[channel]) / std[channel]`` .. note:: This transform acts out of place, i.e., it does not mutate the input tensor. Args: mean (sequence): Sequence of means for each channel. std (sequence): Sequence of standard deviations for each channel. inplace(bool,optional): Bool to make this operation in-place. """
mean and std are sequences (e.g. lists) giving the per-channel mean and standard deviation; the normalization is
output[channel] = (input[channel] - mean[channel]) / std[channel]
For example, with mean=0.5 and std=0.5, a channel whose pixel values lie in [0, 1] is mapped to [-1, 1].
# Normalize: normalization
# output[channel] = (input[channel] - mean[channel]) / std[channel]
# e.g. [0 , 1.0] to [-1 , 1]
trans_norm = transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
img_norm = trans_norm(img_tensor)
writer.add_image("Normalize", img_norm)
Resize():
Resizes the image. In older torchvision versions the input had to be a PIL Image; in newer versions it can be either a PIL Image or a tensor image.
Args:
    size (sequence or int): Desired output size. If size is a sequence like (h, w),
        output size will be matched to this. If size is an int, smaller edge of the
        image will be matched to this number. i.e, if height > width, then image
        will be rescaled to (size * height / width, size).
When instantiating Resize(), pass the target size, either a sequence like (h, w) or a single int.
If a sequence is given, the image is resized to exactly that size.
If an int is given, the shorter of height and width is resized to that value and the longer edge is scaled by the same factor, so the image is not distorted (see the sketch after the example below).
# Resize:
trans_resize = transforms.Resize([256, 256])
img_resize = trans_resize(img_tensor)
writer.add_image("Resize", img_resize)
Compose():
Compose combines several transform instances into one; its argument is a list.
"""Composes several transforms together. This transform does not support torchscript. Please, see the note below. Args: transforms (list of ``Transform`` objects): list of transforms to compose. Example: >>> transforms.Compose([ >>> transforms.CenterCrop(10), >>> transforms.PILToTensor(), >>> transforms.ConvertImageDtype(torch.float), >>> ]) .. note:: In order to script the transformations, please use ``torch.nn.Sequential`` as below. >>> transforms = torch.nn.Sequential( >>> transforms.CenterCrop(10), >>> transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)), >>> ) >>> scripted_transforms = torch.jit.script(transforms) Make sure to use only scriptable transformations, i.e. that work with ``torch.Tensor``, does not require `lambda` functions or ``PIL.Image``. """
The list elements are transform objects, i.e. already-instantiated transforms such as trans_norm and trans_resize created above.
# Compose
trans_compose = transforms.Compose([trans_norm, trans_resize])
img_compose = trans_compose(img_tensor)
writer.add_image("Compose", img_compose)
RandomCrop():
Crops the image at a random location; the docstring is as follows:
"""Crop the given image at a random location. If the image is torch Tensor, it is expected to have [..., H, W] shape, where ... means an arbitrary number of leading dimensions, but if non-constant padding is used, the input is expected to have at most 2 leading dimensions Args: size (sequence or int): Desired output size of the crop. If size is an int instead of sequence like (h, w), a square crop (size, size) is made. If provided a sequence of length 1, it will be interpreted as (size[0], size[0]). padding (int or sequence, optional): Optional padding on each border of the image. Default is None. If a single int is provided this is used to pad all borders. If sequence of length 2 is provided this is the padding on left/right and top/bottom respectively. If a sequence of length 4 is provided this is the padding for the left, top, right and bottom borders respectively. .. note:: In torchscript mode padding as single int is not supported, use a sequence of length 1: ``[padding, ]``. pad_if_needed (boolean): It will pad the image if smaller than the desired size to avoid raising an exception. Since cropping is done after padding, the padding seems to be done at a random offset. fill (number or tuple): Pixel fill value for constant fill. Default is 0. If a tuple of length 3, it is used to fill R, G, B channels respectively. This value is only used when the padding_mode is constant. Only number is supported for torch Tensor. Only int or tuple value is supported for PIL Image. padding_mode (str): Type of padding. Should be: constant, edge, reflect or symmetric. Default is constant. - constant: pads with a constant value, this value is specified with fill - edge: pads with the last value at the edge of the image. If input a 5D torch Tensor, the last 3 dimensions will be padded instead of the last 2 - reflect: pads with reflection of image without repeating the last value on the edge. For example, padding [1, 2, 3, 4] with 2 elements on both sides in reflect mode will result in [3, 2, 1, 2, 3, 4, 3, 2] - symmetric: pads with reflection of image repeating the last value on the edge. For example, padding [1, 2, 3, 4] with 2 elements on both sides in symmetric mode will result in [2, 1, 1, 2, 3, 4, 4, 3] """
# RandomCrop
trans_randomcrop = transforms.RandomCrop(40)
for i in range(10):
    img_randomcrop = trans_randomcrop(img_tensor)     # a different 40x40 crop each iteration
    writer.add_image("RandomCrop", img_randomcrop, i)
writer.close()