In deep learning engineering, we often need to read an image with OpenCV and then feed it into a deep learning model.
import cv2
img = cv2.imread(filepath, 1)  # flag 1 == cv2.IMREAD_COLOR: load as a 3-channel color image
Here `img` is a NumPy `ndarray`, and OpenCV reads it in BGR channel order. To pass it into a model, we need to convert it to RGB and turn it into a tensor:
from torchvision import transforms
tfms = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
])
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # BGR --> RGB
img = tfms(img)  # convert to tensor, then normalize
The official description of `torchvision.transforms.ToTensor()` is:
Converts a PIL Image or numpy.ndarray (H x W x C) in the range [0, 255] to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0]
After `ToTensor()`, `img` goes from (H x W x C) to (C x H x W), and each pixel value is rescaled from [0, 255] to [0.0, 1.0].
The official description of `torchvision.transforms.Normalize(mean, std[, inplace])` is:
Given mean: (mean[1],...,mean[n]) and std: (std[1],..,std[n]) for n channels, this transform will normalize each channel of the input torch.*Tensor, i.e., output[channel] = (input[channel] - mean[channel]) / std[channel]. This transform does not support PIL Image.
`transforms.Normalize` only accepts tensors, so it is usually chained after `ToTensor()`. Because RGB data has three channels, the code above sets mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].
This mean and std were computed on the ImageNet dataset. For typical natural-image data (people, buildings, animals, varied lighting/angles/backgrounds, etc.) they work well; alternatively, you can compute a new mean and std from your own dataset.
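Computing your own statistics can be sketched as follows; here a random tensor stands in for a stack of N training images already converted by `ToTensor()` (in practice you would accumulate these over a DataLoader):

```python
import torch

# Stand-in for N un-normalized images of shape (N, 3, H, W), values in [0, 1]
imgs = torch.rand(8, 3, 16, 16)

# Per-channel mean and std over all pixels of all images
mean = imgs.mean(dim=(0, 2, 3))
std = imgs.std(dim=(0, 2, 3))
print(mean.shape, std.shape)  # torch.Size([3]) torch.Size([3])
```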
After `img` has been processed by the model, if we want to save the generated image we need to undo the normalization.
From the formula output[channel] = (input[channel] - mean[channel]) / std[channel],
the inverse is input[channel] = output[channel] * std[channel] + mean[channel]:
x = model(img).detach()  # detach so the tensor can be modified in place and converted to NumPy; assumes (3, H, W) output
x[0] = x[0] * std[0] + mean[0]
x[1] = x[1] * std[1] + mean[1]
x[2] = x[2] * std[2] + mean[2]
img = x.clamp(0, 1).mul(255).byte()  # clamp to [0.0, 1.0], then scale to [0, 255]
img = img.numpy().transpose((1, 2, 0))  # C x H x W --> H x W x C
img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)  # RGB --> BGR for OpenCV
cv2.imwrite('./test.png', img)  # save
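Instead of handling each channel separately, the same inverse can be written as a single broadcasted expression; a minimal sketch, using a random tensor in place of the model output, and checking that denormalizing after normalizing recovers the original:

```python
import torch

mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

# Stand-in for a (3, H, W) image tensor
x = torch.rand(3, 4, 4)
x_norm = (x - mean) / std        # forward normalization, all channels at once
x_back = x_norm * std + mean     # inverse normalization via broadcasting

print(torch.allclose(x_back, x, atol=1e-6))  # True
```

The `.view(3, 1, 1)` reshape lets the three per-channel values broadcast across the H and W dimensions.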