Whenever you reuse a classic pretrained model, you end up writing a snippet like this:
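from keras.applications.resnet50 import ResNet50
from keras.preprocessing import image
from keras.applications.resnet50 import preprocess_input
import numpy as np
# Load ResNet50 with weights pretrained on ImageNet
model = ResNet50(weights='imagenet')
img_path = 'elephant.jpg'
# Load the image and resize it to the 224x224 input size
img = image.load_img(img_path, target_size=(224, 224))
# Convert the PIL image to a NumPy array
x = image.img_to_array(img)
# Add a leading batch dimension
x = np.expand_dims(x, axis=0)
# Apply the preprocessing the model expects
x = preprocess_input(x)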
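expand_dims
As you can see, expand_dims is applied after the image is loaded. Put simply, it adds one dimension to the image array. That new leading dimension is the batch size, that is, the number of images.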
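The effect on the array shape can be checked with NumPy alone. This is a minimal sketch in which a zero-filled array stands in for the output of image.img_to_array; the variable name batch is only for illustration.

```python
import numpy as np

# Stand-in for the array returned by image.img_to_array:
# height x width x channels (an assumption for this sketch).
x = np.zeros((224, 224, 3), dtype="float32")

# Add a leading axis so the array looks like a batch holding one image.
batch = np.expand_dims(x, axis=0)

print(x.shape)      # (224, 224, 3)
print(batch.shape)  # (1, 224, 224, 3)
```

The model expects input with a batch axis, so even a single image has to be wrapped this way.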
preprocess_input
This function converts your image into the format the model expects. It takes a mode parameter with three possible values: 'caffe', 'tf', and 'torch'.
if mode == 'tf':
    x /= 127.5
    x -= 1.
    return x
if mode == 'torch':
    x /= 255.
    mean = [0.485, 0.456, 0.406]
    std = [0.229, 0.224, 0.225]
else:
    if data_format == 'channels_first':
        # 'RGB'->'BGR'
        if x.ndim == 3:
            x = x[::-1, ...]
        else:
            x = x[:, ::-1, ...]
    else:
        # 'RGB'->'BGR'
        x = x[..., ::-1]
    mean = [103.939, 116.779, 123.68]
    std = None
# Zero-center by mean pixel
if data_format == 'channels_first':
    if x.ndim == 3:
        x[0, :, :] -= mean[0]
        x[1, :, :] -= mean[1]
        x[2, :, :] -= mean[2]
        if std is not None:
            x[0, :, :] /= std[0]
            x[1, :, :] /= std[1]
            x[2, :, :] /= std[2]
    else:
        x[:, 0, :, :] -= mean[0]
        x[:, 1, :, :] -= mean[1]
        x[:, 2, :, :] -= mean[2]
        if std is not None:
            x[:, 0, :, :] /= std[0]
            x[:, 1, :, :] /= std[1]
            x[:, 2, :, :] /= std[2]
else:
    x[..., 0] -= mean[0]
    x[..., 1] -= mean[1]
    x[..., 2] -= mean[2]
    if std is not None:
        x[..., 0] /= std[0]
        x[..., 1] /= std[1]
        x[..., 2] /= std[2]
return x
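This is how the source handles each mode. 'caffe' mode converts RGB to BGR and zero-centers the data by subtracting a fixed mean from each channel, with no scaling. 'tf' mode applies x /= 127.5 and then x -= 1, which yields values in [-1, 1]. 'torch' mode first divides by 255 to bring the pixels into [0, 1], then subtracts the per-channel mean and divides by the per-channel std, so the final values are standardized per channel rather than confined to [0, 1]. When building your own model you do not have to call this function. Applying the same steps to your images by hand, following the source, works just as well.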
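To see the value ranges each mode produces, the three code paths can be re-created with plain NumPy. This is only a sketch, not the Keras implementation: preprocess_sketch is a hypothetical helper, it assumes channels_last RGB input, and it works on a copy instead of modifying the array in place.

```python
import numpy as np

def preprocess_sketch(x, mode="caffe"):
    """Hypothetical helper mirroring the three modes for channels_last RGB input."""
    x = np.asarray(x, dtype="float32").copy()
    if mode == "tf":
        # Scale pixels from [0, 255] to [-1, 1].
        return x / 127.5 - 1.0
    if mode == "torch":
        # Scale to [0, 1], then standardize each channel.
        x /= 255.0
        mean = np.array([0.485, 0.456, 0.406], dtype="float32")
        std = np.array([0.229, 0.224, 0.225], dtype="float32")
        return (x - mean) / std
    # 'caffe': flip RGB to BGR, then subtract the per-channel mean (no scaling).
    x = x[..., ::-1]
    mean = np.array([103.939, 116.779, 123.68], dtype="float32")
    return x - mean

# A batch of one 1x2 image: one black pixel and one white pixel.
pixels = np.array([[[[0.0, 0.0, 0.0], [255.0, 255.0, 255.0]]]], dtype="float32")

results = {mode: preprocess_sketch(pixels, mode) for mode in ("tf", "torch", "caffe")}
for mode, out in results.items():
    print(mode, float(out.min()), float(out.max()))
```

Running it shows that only 'tf' stays within [-1, 1]. The 'torch' output goes below 0 and above 1, and the 'caffe' output keeps roughly the original 0 to 255 spread, shifted by the mean.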