Class notes: commonly used transforms
Course link: https://www.bilibili.com/video/BV1hE411t7RN?p=12&vd_source=a16915472897bc5c811d5ff185570c98
How do you use __call__ in Python?
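class Person:  # define a class named Person
    def __call__(self, name):  # special method: runs when the instance itself is called
        print("__call__ " + "Hello " + name)
    def hello(self, name):  # an ordinary method that does much the same thing
        print("hello " + name)
person = Person()
person("zhangsan")  # calls __call__ directly on the instance, no dot needed
person.hello("lisi")  # an ordinary method has to be called with a dot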
Output:
__call__ Hello zhangsan
hello lisi
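The same mechanism is what lets a transform object be applied like a function: you build the object once, then call it on each input. A minimal sketch in plain Python, using a hypothetical `Scale` class that is not part of any library:

```python
class Scale:
    """Hypothetical callable object: multiplies its input by a fixed factor."""

    def __init__(self, factor):
        self.factor = factor  # configuration is stored once, at construction

    def __call__(self, value):
        # Runs whenever the instance is called like a function.
        return value * self.factor


double = Scale(2)      # create the "tool"
result = double(21)    # use the tool; equivalent to double.__call__(21)
print(result)
print(callable(double))
```

This is the same "create the tool, then use the tool" pattern that the transforms in the rest of these notes follow.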
How do you use ToTensor?
from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms
writer = SummaryWriter("logs")
img = Image.open("dataset/train/bees_image/16838648_415acd9e3f.jpg")
print(img)  # at this point img is a PIL image
# Using ToTensor
trans_totensor = transforms.ToTensor()  # create a tool named trans_totensor; it converts a PIL Image or numpy.ndarray to a Tensor
img_tensor = trans_totensor(img)  # use the tool on img, i.e. convert this image to a Tensor
writer.add_image("ToTensor", img_tensor)
writer.close()
How do you use Normalize?
Normalization formula: output[channel] = (input[channel] - mean[channel]) / std[channel]
With mean = 0.5 and std = 0.5, this gives output = (input - 0.5) / 0.5 = 2 * input - 1
So if input lies in [0, 1], output lies in [-1, 1]
Example:
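# Using Normalize
print(img_tensor[0][0][0])  # first pixel of channel 0, before normalization
trans_nomal = transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])  # per-channel mean, then per-channel std
img_nomal = trans_nomal(img_tensor)
print(img_nomal[0][0][0])  # the same pixel, after normalization
writer.add_image("Normalize", img_nomal)
writer.close()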
Output:
tensor(0.0980)
tensor(-0.8039)
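The two printed values can be checked against the formula by hand. A small sketch in plain Python; the helper `normalize_value` is hypothetical and written only for this check:

```python
def normalize_value(x, mean, std):
    """Apply output = (input - mean) / std to a single number."""
    return (x - mean) / std


before = 0.0980  # the value printed before Normalize (already rounded)
after = normalize_value(before, mean=0.5, std=0.5)
print(round(after, 4))  # close to the -0.8039 printed above; the gap comes from the rounded input

low = normalize_value(0.0, 0.5, 0.5)   # smallest possible input
high = normalize_value(1.0, 0.5, 0.5)  # largest possible input
print(low, high)
```

The endpoints confirm that an input range of [0, 1] maps to [-1, 1] when mean and std are both 0.5.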
Open TensorBoard to view the result:
Try changing the parameters:
trans_nomal = transforms.Normalize([3, 2, 1], [5, 2, 3])
View the result in TensorBoard:
Change the parameters again:
trans_nomal = transforms.Normalize([9, 5, 6], [7, 8, 5])
View the result in TensorBoard:
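To see why the three parameter sets give such different pictures, it helps to work out the range each channel can take after normalization. A sketch in plain Python, assuming the input tensor lies in [0, 1] (as ToTensor produces for an ordinary 8-bit image); `output_range` is a hypothetical helper:

```python
def output_range(mean, std):
    """Return (min, max) of (x - mean) / std for x in [0, 1], assuming std > 0."""
    return ((0.0 - mean) / std, (1.0 - mean) / std)


# The three (mean, std) settings tried in these notes.
param_sets = [
    ([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]),
    ([3, 2, 1], [5, 2, 3]),
    ([9, 5, 6], [7, 8, 5]),
]

ranges = []
for means, stds in param_sets:
    per_channel = [output_range(m, s) for m, s in zip(means, stds)]
    ranges.append(per_channel)
    print(means, stds, per_channel)
```

With the second and third settings every channel ends up at or below zero, which is one reason the images look so different from the original.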
How do you use Resize?
# Using Resize
print(img.size)  # PIL reports size as (width, height)
trans_resize = transforms.Resize((20, 240))  # resize to 20 x 240; the argument is (h, w), so h=20, w=240
# img PIL -> resize -> img_resize PIL
img_resize = trans_resize(img)
# img_resize PIL -> totensor -> img_resize tensor
img_resize = trans_totensor(img_resize)
writer.add_image("Resize", img_resize, 0)
print(img_resize)
writer.close()
View the result in TensorBoard:
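Two details are easy to mix up with Resize: PIL reports `img.size` as (width, height) while Resize takes (height, width), and a single int behaves differently from a tuple (the shorter side is matched and the aspect ratio is kept). The sketch below mirrors that sizing rule in plain Python rather than calling torchvision. `resized_hw` is a hypothetical helper, the 500 x 375 input size is made up for illustration, and truncating the scaled side to an int is an assumption.

```python
def resized_hw(width, height, size):
    """Return the (height, width) after a Resize-style operation.

    size may be an (h, w) tuple, used as-is, or a single int, in which case
    the shorter side becomes `size` and the aspect ratio is preserved.
    """
    if isinstance(size, tuple):
        return size
    if width <= height:
        return (int(size * height / width), size)
    return (size, int(size * width / height))


# Made-up input size, in PIL order (width, height).
w, h = 500, 375
print(resized_hw(w, h, (20, 240)))  # tuple: exact target, aspect ratio not kept
print(resized_hw(w, h, 512))        # int: shorter side becomes 512
```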
How do you use Compose?
# Using Compose with a second Resize
trans_resize_2 = transforms.Resize(512)  # a single int: the shorter side is scaled to 512, aspect ratio kept
# PIL -> PIL -> tensor
trans_comp = transforms.Compose([trans_resize_2, trans_totensor])  # Compose takes a list of transform objects; here it resizes first, then converts to tensor
img_resize_2 = trans_comp(img)
writer.add_image("Resize", img_resize_2, 1)
writer.close()
Result: the original image is enlarged (its shorter side becomes 512) and its type becomes tensor
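Conceptually, Compose just calls each transform in order, feeding the output of one step into the next. A minimal sketch of that idea in plain Python; this is not torchvision's actual implementation, and `SimpleCompose` and the two toy steps are hypothetical:

```python
class SimpleCompose:
    """Hypothetical stand-in for Compose: applies callables left to right."""

    def __init__(self, steps):
        self.steps = steps

    def __call__(self, value):
        for step in self.steps:
            value = step(value)  # output of one step is the input of the next
        return value


def add_one(x):
    return x + 1


def to_text(x):
    return f"value={x}"


pipeline = SimpleCompose([add_one, to_text])
output = pipeline(41)
print(output)
```

Order matters: swapping the two steps would fail, because `add_one` cannot take the string that `to_text` returns. The same applies to real transforms, which is why the output type of each step has to match the input type of the next.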
How do you use RandomCrop?
# RandomCrop
trans_random = transforms.RandomCrop(100)  # random crop of 100 x 100
trans_comp_2 = transforms.Compose([trans_random, trans_totensor])
for i in range(10):
    img_crop = trans_comp_2(img)
    writer.add_image("RandomCrop", img_crop, i)
trans_random = transforms.RandomCrop((100, 200))  # random crop of height 100, width 200
trans_comp_2 = transforms.Compose([trans_random, trans_totensor])
for i in range(10):
    img_crop = trans_comp_2(img)
    writer.add_image("RandomCropHW", img_crop, i)
writer.close()
Result: each tag shows 10 randomly cropped images
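The crops differ on every call because the position of the crop window is drawn at random each time. A sketch of that idea in plain Python; it is not torchvision's code, it ignores extra options such as padding, and `random_crop_box` and the 375 x 500 image size are hypothetical:

```python
import random


def random_crop_box(img_h, img_w, crop_h, crop_w, rng=random):
    """Pick a random (top, left, height, width) box that fits inside the image."""
    if crop_h > img_h or crop_w > img_w:
        raise ValueError("crop size is larger than the image")
    top = rng.randint(0, img_h - crop_h)    # randint is inclusive on both ends
    left = rng.randint(0, img_w - crop_w)
    return top, left, crop_h, crop_w


# Made-up image size; (100, 200) means height 100, width 200, as in the notes.
boxes = [random_crop_box(375, 500, 100, 200) for _ in range(10)]
for box in boxes:
    print(box)
```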
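Summary
1. Read the official documentation often.
2. Pay attention to the input and output types.
3. Check which parameters a method requires.
4. To find out what a method returns: (1) print() (2) print(type()) (3) use the debugger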
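For point 4, printing the type together with a shape or length is often quicker than printing the whole value. A small sketch; `describe` is a hypothetical helper, and it assumes that array-like objects expose a `shape` attribute:

```python
def describe(value):
    """Return a short summary: the type name plus shape or length when available."""
    summary = type(value).__name__
    shape = getattr(value, "shape", None)
    if shape is not None:
        summary += f", shape={tuple(shape)}"
    elif hasattr(value, "__len__"):
        summary += f", len={len(value)}"
    return summary


print(describe([1, 2, 3]))
print(describe(3.14))
```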