1.PIL
PyTorch's DataLoader reads image data with PIL; the useful part boils down to the following lines:
from PIL import Image
import numpy as np

def read_image(file_name, format=None):
    image = Image.open(file_name)
    image = np.asarray(image)       # convert to a numpy array; this is how the pixel data is actually read out
    image = Image.fromarray(image)  # convert back to PIL; PIL's image-processing routines require a PIL Image
    return image
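The numpy/PIL round trip above is lossless for ordinary 8-bit RGB data. A minimal sketch, using a small synthetic array in place of a file on disk (hypothetical data, so the example runs without any image file):

```python
import numpy as np
from PIL import Image

# Synthetic 4x6 RGB image standing in for Image.open(file_name)
arr = np.random.randint(0, 256, size=(4, 6, 3), dtype=np.uint8)

pil_img = Image.fromarray(arr)  # numpy -> PIL
back = np.asarray(pil_img)      # PIL -> numpy

print(back.shape, back.dtype)    # (4, 6, 3) uint8
print(np.array_equal(arr, back)) # True: the round trip is lossless
```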
Taking resize as an example, here is how it is done with PIL:
from PIL import Image
import numpy as np
import torchvision.transforms.transforms as transforms

def read_image(file_name, image_height, image_width):
    image = Image.open(file_name)
    image = np.asarray(image)
    image = Image.fromarray(image)
    # torchvision's Resize expects the size as (h, w)
    image = transforms.Resize((image_height, image_width), Image.BICUBIC)(image)
    image = np.asarray(image)
    image = image.astype("float32").transpose(2, 0, 1)[np.newaxis]  # (1, 3, h, w)
    return image
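The tail of this function (HWC uint8 array to a (1, 3, h, w) float32 batch) can be checked on synthetic data. A sketch using PIL's own `Image.resize` to avoid the torchvision dependency; note that `Image.resize` takes (width, height), the opposite order of torchvision's `Resize((h, w))`:

```python
import numpy as np
from PIL import Image

image_height, image_width = 8, 5
arr = np.random.randint(0, 256, size=(20, 30, 3), dtype=np.uint8)  # synthetic input

img = Image.fromarray(arr)
# PIL's Image.resize takes (width, height)
img = img.resize((image_width, image_height), Image.BICUBIC)

out = np.asarray(img).astype("float32").transpose(2, 0, 1)[np.newaxis]
print(out.shape)  # (1, 3, 8, 5) == (1, 3, h, w)
```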
2.OpenCV
For our own inference code we can use either PIL or OpenCV, but OpenCV is the more common choice: it is better supported in Python, C++, and elsewhere.
OpenCV reads images in BGR order, so we convert to RGB (PIL also uses RGB):
import cv2

def preprocess(image_path):
    original_image = cv2.imread(image_path)  # BGR order
    # the model expects RGB inputs
    original_image = original_image[:, :, ::-1]
    return original_image
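The `[:, :, ::-1]` slice simply reverses the channel axis, which is exactly the BGR-to-RGB swap (the same thing `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)` does). A minimal sketch with a synthetic 1x2 "BGR" array standing in for `cv2.imread` output, so it runs without OpenCV or an image file:

```python
import numpy as np

bgr = np.array([[[255, 0, 0],    # pure blue in BGR
                 [0, 0, 255]]],  # pure red in BGR
               dtype=np.uint8)

rgb = bgr[:, :, ::-1]  # reverse the channel axis: BGR -> RGB

print(rgb[0, 0].tolist())  # [0, 0, 255] -> blue pixel in RGB order
print(rgb[0, 1].tolist())  # [255, 0, 0] -> red pixel in RGB order
```

One caveat: the slice returns a negatively-strided view, not a copy; some downstream APIs want contiguous memory, in which case wrap it in `np.ascontiguousarray(...)`.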
For comparison with PIL, here is how the resize is done with OpenCV:
import cv2
import numpy as np

def preprocess(image_path, image_height, image_width):
    original_image = cv2.imread(image_path)  # BGR order
    # the model expects RGB inputs
    original_image = original_image[:, :, ::-1]
    # Apply pre-processing to the image; note cv2.resize takes the size as (width, height)
    image = cv2.resize(original_image, (image_width, image_height), interpolation=cv2.INTER_CUBIC)
    image = image.astype("float32").transpose(2, 0, 1)[np.newaxis]  # (1, 3, h, w)
    return image
3.Summary
Taking fast-reid as an example, the two reading paths can be used for training and inference respectively. Although PIL and OpenCV do differ (for example in their interpolation implementations), the impact of this difference on the final result is small. (https://github.com/michuanhaohao/reid-strong-baseline/issues/107)
References
1. mAP drops by 10 points after converting a PyTorch model to a Caffe model