What preprocess_input does (Keras)

Article link: https://blog.csdn.net/weixin_43848469/article/details/104209038

When you reuse a classic pretrained model, you almost always see a snippet like this:

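from keras.applications.resnet50 import ResNet50
from keras.preprocessing import image
from keras.applications.resnet50 import preprocess_input
import numpy as np

# Load ResNet50 with weights pretrained on ImageNet
model = ResNet50(weights='imagenet')

img_path = 'elephant.jpg'
# Load the image and resize it to the 224x224 input size ResNet50 expects
img = image.load_img(img_path, target_size=(224, 224))
# Convert the PIL image to a float array of shape (224, 224, 3)
x = image.img_to_array(img)
# Add a batch axis: (224, 224, 3) -> (1, 224, 224, 3)
x = np.expand_dims(x, axis=0)
# Apply the preprocessing ResNet50 was trained with ('caffe' mode)
x = preprocess_input(x)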
expand_dims

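Notice that expand_dims is called after the image is loaded. It adds one dimension to the image array. That new dimension is the batch size, that is, the number of images passed to the model at once.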
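A minimal sketch of the shape change, using a zero-filled array as a stand-in for the loaded image (the array is hypothetical, not a real picture):

```python
import numpy as np

# Stand-in for image.img_to_array(img): height x width x channels
x = np.zeros((224, 224, 3), dtype="float32")

# Add a leading batch axis so the model receives a batch of one image
batch = np.expand_dims(x, axis=0)

print(x.shape)      # (224, 224, 3)
print(batch.shape)  # (1, 224, 224, 3)
```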

preprocess_input

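This function converts your image into the format the model expects. The underlying implementation has a mode parameter with three possible values: 'caffe', 'tf', and 'torch'.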

    if mode == 'tf':
        x /= 127.5
        x -= 1.
        return x

    if mode == 'torch':
        x /= 255.
        mean = [0.485, 0.456, 0.406]
        std = [0.229, 0.224, 0.225]
    else:
        if data_format == 'channels_first':
            # 'RGB'->'BGR'
            if x.ndim == 3:
                x = x[::-1, ...]
            else:
                x = x[:, ::-1, ...]
        else:
            # 'RGB'->'BGR'
            x = x[..., ::-1]
        mean = [103.939, 116.779, 123.68]
        std = None

    # Zero-center by mean pixel
    if data_format == 'channels_first':
        if x.ndim == 3:
            x[0, :, :] -= mean[0]
            x[1, :, :] -= mean[1]
            x[2, :, :] -= mean[2]
            if std is not None:
                x[0, :, :] /= std[0]
                x[1, :, :] /= std[1]
                x[2, :, :] /= std[2]
        else:
            x[:, 0, :, :] -= mean[0]
            x[:, 1, :, :] -= mean[1]
            x[:, 2, :, :] -= mean[2]
            if std is not None:
                x[:, 0, :, :] /= std[0]
                x[:, 1, :, :] /= std[1]
                x[:, 2, :, :] /= std[2]
    else:
        x[..., 0] -= mean[0]
        x[..., 1] -= mean[1]
        x[..., 2] -= mean[2]
        if std is not None:
            x[..., 0] /= std[0]
            x[..., 1] /= std[1]
            x[..., 2] /= std[2]
    return x

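This is how the source handles each mode. 'caffe' converts RGB to BGR and zero-centers each channel by subtracting the ImageNet mean, with no scaling. 'torch' first divides by 255 to bring pixel values into [0, 1], then normalizes each channel with the ImageNet mean and standard deviation, so the final values are no longer confined to [0, 1]. 'tf' computes x /= 127.5 followed by x -= 1, which maps pixel values to [-1, 1]. When building your own model, you can skip this function and process the images by hand following the source, as long as the preprocessing matches what the weights were trained with.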
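To compare the output ranges, the three modes can be sketched in plain NumPy. This is an illustrative re-implementation for channels_last input only; the name preprocess_sketch is hypothetical and it is not the Keras function itself.

```python
import numpy as np

def preprocess_sketch(x, mode="caffe"):
    """Hypothetical NumPy sketch of the three modes (channels_last only)."""
    x = np.array(x, dtype="float32")  # float copy; the input is left untouched
    if mode == "tf":
        # Scale pixels from [0, 255] to [-1, 1]
        return x / 127.5 - 1.0
    if mode == "torch":
        # Scale to [0, 1], then normalize each channel with ImageNet mean and std
        mean = np.array([0.485, 0.456, 0.406], dtype="float32")
        std = np.array([0.229, 0.224, 0.225], dtype="float32")
        return (x / 255.0 - mean) / std
    if mode == "caffe":
        # RGB -> BGR, then subtract the ImageNet mean per channel (no scaling)
        mean = np.array([103.939, 116.779, 123.68], dtype="float32")
        return x[..., ::-1] - mean
    raise ValueError(f"unknown mode: {mode}")

# Two extreme "images": all-black and all-white pixels
x = np.stack([np.zeros((2, 2, 3)), np.full((2, 2, 3), 255.0)])
for mode in ("tf", "torch", "caffe"):
    out = preprocess_sketch(x, mode)
    print(mode, float(out.min()), float(out.max()))
```

For these inputs, 'tf' stays within [-1, 1], 'torch' spans roughly -2.12 to 2.64, and 'caffe' spans roughly -123.68 to 151.06, which shows that only 'tf' yields a fixed symmetric range.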