Transforms
Import transforms:
from torchvision import transforms
ToTensor():
First, create an instance:
tensor_trans = transforms.ToTensor()
Then use that instance to convert an ndarray or PIL Image into a Tensor:
tensor_img = tensor_trans(img)
The docstring of ToTensor:
"""Convert a PIL Image or ndarray to tensor and scale the values accordingly. This transform does not support torchscript. Converts a PIL Image or numpy.ndarray (H x W x C) in the range [0, 255] to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0] if the PIL Image belongs to one of the modes (L, LA, P, I, F, RGB, YCbCr, RGBA, CMYK, 1) or if the numpy.ndarray has dtype = np.uint8 In the other cases, tensors are returned without scaling. .. note:: Because the input image is scaled to [0.0, 1.0], this transformation should not be used when transforming target image masks. See the `references`_ for implementing the transforms for image masks. .. _references: https://github.com/pytorch/vision/tree/main/references/segmentation """
Its purpose is to convert a PIL Image or numpy image into a Tensor; note that the shape changes from (H x W x C) to (C x H x W) and the values are scaled into [0.0, 1.0].
from PIL import Image
from torchvision import transforms

img_path = "Hello!Pillow.jpg"
img = Image.open(img_path)          # PIL Image

tensor_trans = transforms.ToTensor()
tensor_img = tensor_trans(img)      # FloatTensor, shape (C, H, W), values in [0.0, 1.0]
print(type(tensor_img))             # <class 'torch.Tensor'>
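The snippets in the following sections reuse the tensor image under the name img_tensor and write their results to TensorBoard through a SummaryWriter named writer. A minimal setup sketch, assuming a log directory called "logs" (the directory name is only an illustration):

from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter("logs")   # TensorBoard event files go here (assumed path)
img_tensor = tensor_img          # the tensor produced by ToTensor() above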
Normalize():
Normalizes a tensor image with a given mean and standard deviation; the docstring is as follows:
"""Normalize a tensor image with mean and standard deviation. This transform does not support PIL Image. Given mean: ``(mean[1],...,mean[n])`` and std: ``(std[1],..,std[n])`` for ``n`` channels, this transform will normalize each channel of the input ``torch.*Tensor`` i.e., ``output[channel] = (input[channel] - mean[channel]) / std[channel]`` .. note:: This transform acts out of place, i.e., it does not mutate the input tensor. Args: mean (sequence): Sequence of means for each channel. std (sequence): Sequence of standard deviations for each channel. inplace(bool,optional): Bool to make this operation in-place. """
mean and std are sequences (e.g. lists) giving the per-channel mean and standard deviation; the normalization is
output[channel] = (input[channel] - mean[channel]) / std[channel]
For example, with mean=0.5 and std=0.5, a channel whose pixel values lie in [0, 1] is mapped to [-1, 1].
# Normalize: normalization
# output[channel] = (input[channel] - mean[channel]) / std[channel]
# e.g. [0 , 1.0] to [-1 , 1]
trans_norm = transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
img_norm = trans_norm(img_tensor)
writer.add_image("Normalize", img_norm)
Resize():
Resizes the image. In older torchvision versions the input had to be a PIL Image; in newer versions it can be either a PIL Image or a tensor image.
Args:
    size (sequence or int): Desired output size. If size is a sequence like (h, w),
        output size will be matched to this. If size is an int, smaller edge of the
        image will be matched to this number. i.e, if height > width, then image
        will be rescaled to (size * height / width, size).
When instantiating Resize(), pass the target size, either a sequence like (h, w) or a single int.
If a sequence is given, the image is resized to exactly that size.
If an int is given, the shorter of height and width is resized to that value and the longer edge is scaled by the same factor, so the image is not distorted (see the sketch after the example below).
# Resize:
trans_resize = transforms.Resize([256, 256])
img_resize = trans_resize(img_tensor)
writer.add_image("Resize", img_resize)
Compose():
Compose combines several transform instances into one; its argument is a list.
"""Composes several transforms together. This transform does not support torchscript. Please, see the note below. Args: transforms (list of ``Transform`` objects): list of transforms to compose. Example: >>> transforms.Compose([ >>> transforms.CenterCrop(10), >>> transforms.PILToTensor(), >>> transforms.ConvertImageDtype(torch.float), >>> ]) .. note:: In order to script the transformations, please use ``torch.nn.Sequential`` as below. >>> transforms = torch.nn.Sequential( >>> transforms.CenterCrop(10), >>> transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)), >>> ) >>> scripted_transforms = torch.jit.script(transforms) Make sure to use only scriptable transformations, i.e. that work with ``torch.Tensor``, does not require `lambda` functions or ``PIL.Image``. """
The list elements are transform objects, i.e. already-instantiated transforms such as trans_norm and trans_resize created above.
# Compose
trans_compose = transforms.Compose([trans_norm, trans_resize])
img_compose = trans_compose(img_tensor)
writer.add_image("Compose", img_compose)
RandomCrop():
Crops the image at a random location; the docstring is as follows:
"""Crop the given image at a random location. If the image is torch Tensor, it is expected to have [..., H, W] shape, where ... means an arbitrary number of leading dimensions, but if non-constant padding is used, the input is expected to have at most 2 leading dimensions Args: size (sequence or int): Desired output size of the crop. If size is an int instead of sequence like (h, w), a square crop (size, size) is made. If provided a sequence of length 1, it will be interpreted as (size[0], size[0]). padding (int or sequence, optional): Optional padding on each border of the image. Default is None. If a single int is provided this is used to pad all borders. If sequence of length 2 is provided this is the padding on left/right and top/bottom respectively. If a sequence of length 4 is provided this is the padding for the left, top, right and bottom borders respectively. .. note:: In torchscript mode padding as single int is not supported, use a sequence of length 1: ``[padding, ]``. pad_if_needed (boolean): It will pad the image if smaller than the desired size to avoid raising an exception. Since cropping is done after padding, the padding seems to be done at a random offset. fill (number or tuple): Pixel fill value for constant fill. Default is 0. If a tuple of length 3, it is used to fill R, G, B channels respectively. This value is only used when the padding_mode is constant. Only number is supported for torch Tensor. Only int or tuple value is supported for PIL Image. padding_mode (str): Type of padding. Should be: constant, edge, reflect or symmetric. Default is constant. - constant: pads with a constant value, this value is specified with fill - edge: pads with the last value at the edge of the image. If input a 5D torch Tensor, the last 3 dimensions will be padded instead of the last 2 - reflect: pads with reflection of image without repeating the last value on the edge. For example, padding [1, 2, 3, 4] with 2 elements on both sides in reflect mode will result in [3, 2, 1, 2, 3, 4, 3, 2] - symmetric: pads with reflection of image repeating the last value on the edge. For example, padding [1, 2, 3, 4] with 2 elements on both sides in symmetric mode will result in [2, 1, 1, 2, 3, 4, 4, 3] """
# RandomCrop
trans_randomcrop = transforms.RandomCrop(40)
for i in range(10):
    img_randomcrop = trans_randomcrop(img_tensor)     # a different 40x40 crop each iteration
    writer.add_image("RandomCrop", img_randomcrop, i)
writer.close()