[PyTorch Study Notes] Transforms

transforms.py works like a toolbox. It holds many tools, such as ToTensor (converts data to the tensor type) and Resize. The toolbox takes an image as input.

I. Using Transforms

from PIL import Image
from torchvision import transforms

# Absolute path: /home/xjy/PycharmProjects/pythonProject/dataset/train/ants/0013035.jpg
# Relative path: dataset/train/ants/0013035.jpg
img_path = "dataset/train/ants/0013035.jpg"
img = Image.open(img_path)
print(img) # <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=768x512 at 0x7FE4A77C61D0>

tensor_trans = transforms.ToTensor()
tensor_img = tensor_trans(img)
print(tensor_img) # tensor([[...]])

First create a concrete tool, such as transforms.ToTensor(). Then use that tool to convert the input into the output: result = tool(input). In the code above, that step is tensor_img = tensor_trans(img).

PS. Reading the image with OpenCV:

import cv2
cv_img = cv2.imread(img_path) # returns a numpy.ndarray (H x W x C, BGR channel order)

II. Displaying in TensorBoard

from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms

# Absolute path: /home/xjy/PycharmProjects/pythonProject/dataset/train/ants/0013035.jpg
# Relative path: dataset/train/ants/0013035.jpg
img_path = "dataset/train/ants/0013035.jpg"
img = Image.open(img_path)
print(img) # <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=768x512 at 0x7FE4A77C61D0>

tensor_trans = transforms.ToTensor()
tensor_img = tensor_trans(img)
print(tensor_img) # tensor([[...]])

writer = SummaryWriter("logs") # log_dir: logs
writer.add_image("Tensor_img", tensor_img)
writer.close()

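After running the script, enter tensorboard --logdir=logs --port=6007 in a terminal to view the image.

III. Common Transforms

For each transform, pay attention to its input, its output, and its purpose. Different functions produce different data types, for example:

Image.open() -> PIL Image
ToTensor() -> torch.Tensor
cv2.imread() -> numpy.ndarray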

1. ToTensor()

First, a quick review of how a class that defines __call__ is used:

class Person:
    def __call__(self, name):
        print("__call__" + " Hello " + name)

    def hello(self, name):
        print("hello " + name)

person = Person()
person("Zhangsan") # __call__ Hello Zhangsan
person.hello("lisi") # hello lisi

Next, look at how ToTensor() is used:

from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms

img = Image.open("images/pink.jpg")
print(img)
# <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x375 at 0x7FC0B136F240>

writer = SummaryWriter("logs")

trans_totensor = transforms.ToTensor()
img_tensor = trans_totensor(img)
writer.add_image("ToTensor", img_tensor)
writer.close()

add_image() requires the image to be a torch.Tensor, numpy.array, or string/blobname, so img must first be converted to a tensor.

2. ToPILImage()

Purpose: converts a tensor or ndarray to a PIL Image.

3. Normalize()

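Purpose: normalizes a tensor image with a per-channel mean and standard deviation. The formula is result[channel] = (input[channel] - mean[channel]) / std[channel]. So if the input lies in [0, 1] and both mean and std are set to 0.5, the result lies in [-1, 1].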
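
The range claim can be checked by hand on single pixel values. The helper normalize_value below is a hypothetical name used only for this illustration; it applies the same formula to one value of one channel.

```python
def normalize_value(x, mean, std):
    """Apply (x - mean) / std to one pixel value of one channel."""
    return (x - mean) / std


# The endpoints and midpoint of [0, 1] with mean = std = 0.5.
for x in (0.0, 0.5, 1.0):
    print(x, "->", normalize_value(x, mean=0.5, std=0.5))
# 0.0 -> -1.0
# 0.5 -> 0.0
# 1.0 -> 1.0
```

The script that follows applies the same transform to a real image and logs the result to TensorBoard.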

from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms

writer = SummaryWriter("logs")
img = Image.open("images/pink.jpg")
print(img)
# <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x375 at 0x7FC0B136F240>

# ToTensor
trans_totensor = transforms.ToTensor()
img_tensor = trans_totensor(img)
writer.add_image("ToTensor", img_tensor)

# Normalize
# print(img_tensor[0][0][0])
trans_norm = transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]) # mean, std
img_norm = trans_norm(img_tensor)
# print(img_norm[0][0][0])
writer.add_image("Normalize", img_norm)

writer.close()

The result as displayed in TensorBoard:

[Figure: the ToTensor and Normalize images in TensorBoard]

4. Resize()

Purpose: resizes the input PIL image to the given size. The output is still a PIL Image.

from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms

writer = SummaryWriter("logs")
img = Image.open("images/pink.jpg")
print(img)
# <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x375 at 0x7FC0B136F240>

# Resize
print(img.size) # (500, 375)
trans_resize = transforms.Resize((512, 512))
img_resize = trans_resize(img)
print(img_resize) # <PIL.Image.Image image mode=RGB size=512x512 at 0x7FAB69DFBE10>

# Convert to a tensor so it can be displayed in TensorBoard
trans_totensor = transforms.ToTensor()
img_resize = trans_totensor(img_resize)
print(img_resize)
writer.add_image("Resize", img_resize)

writer.close()

5. Compose()

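Purpose: chains several transforms into one. Each transform's output must be a valid input for the next one. For example, transforms.Compose([trans_resize_2, trans_totensor]) takes a PIL image as input and outputs a tensor.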

from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms

writer = SummaryWriter("logs")
img = Image.open("images/pink.jpg")
print(img)
# <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x375 at 0x7FC0B136F240>

# ToTensor
trans_totensor = transforms.ToTensor()

# Compose - resize - 2
trans_resize_2 = transforms.Resize(512) # shorter edge scaled to 512, aspect ratio preserved
trans_compose = transforms.Compose([trans_resize_2, trans_totensor])
img_resize_2 = trans_compose(img)
writer.add_image("Resize", img_resize_2, 1)

writer.close()
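
The chaining idea behind Compose() can be sketched in plain Python. MiniCompose, add_one, and double are hypothetical names for this illustration only; this is not torchvision's implementation, just the same "output of one step feeds the next" pattern built on __call__.

```python
class MiniCompose:
    """Hypothetical stand-in that mimics the chaining idea of transforms.Compose."""

    def __init__(self, steps):
        self.steps = steps

    def __call__(self, value):
        # Feed the output of each step into the next one, in list order.
        for step in self.steps:
            value = step(value)
        return value


def add_one(x):
    return x + 1


def double(x):
    return x * 2


pipeline = MiniCompose([add_one, double])
print(pipeline(3))  # (3 + 1) * 2 = 8
```

Order matters: swapping the two steps gives 3 * 2 + 1 = 7. This is the reason the Resize step has to come before ToTensor when the resize is meant to operate on the PIL image.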

6. RandomCrop()

Purpose: crops the image at a random location.

from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms

writer = SummaryWriter("logs")
img = Image.open("images/pink.jpg")
print(img)
# <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x375 at 0x7FC0B136F240>

# ToTensor
trans_totensor = transforms.ToTensor()

# RandomCrop()
trans_random = transforms.RandomCrop(256) 
# trans_random = transforms.RandomCrop((256, 300)) 
trans_compose_2 = transforms.Compose([trans_random, trans_totensor])
for i in range(10):
    img_crop = trans_compose_2(img)
    writer.add_image("RandomCrop", img_crop, i)

writer.close()

[Figure: RandomCrop results in TensorBoard]

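IV. Summary

  1. Pay attention to input and output types, and read the official documentation often.
  2. Also check which parameters a method needs. In PyCharm, place the cursor inside the function's parentheses and press Ctrl+P to show parameter hints.
  3. When the return type is unknown, find it with print(), print(type()), or the debugger.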