preprocess_input() source code walkthrough

Enjoy_endless

Categories: Machine learning, Deep learning

Article link: https://blog.csdn.net/Enjoy_endless/article/details/101304388

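preprocess_input() is a normalization-like function that ships with Keras under TensorFlow.

It does some unexpected things to the image passed in. It has advantages, such as faster image processing, but it is still not very convenient to use: converting back and forth is not general-purpose, and although the library has its own conversion functions, they still seem somewhat problematic.

The source code of this function is analyzed below.

The path to the source code can be found as follows:

1.#In cmd on Windows, or in a terminal on Linux, enter: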

python
from tensorflow import keras
print(keras.__path__)
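
The same lookup can also go straight to the file that defines a given function. The snippet below is a minimal sketch using only the standard library; json is just a stand-in here (an assumption, so that the snippet runs without TensorFlow installed). With TensorFlow available, the same two calls could be pointed at keras and preprocess_input instead.

```python
import inspect
import json

# json stands in for keras so this runs without TensorFlow.
# __path__ lists the package directory, as in the keras.__path__ call above.
package_dir = json.__path__[0]

# getsourcefile returns the .py file in which the function is defined.
source_file = inspect.getsourcefile(json.dumps)

print(package_dir)
print(source_file)
```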

2.#Find the keras path, then follow the imports to dig further
#InceptionV3 is used as the example here:

from keras.applications.inception_v3 import preprocess_input
#Go to inception_v3 under applications under keras:
#Open its __init__ file:

from tensorflow.python.keras._impl.keras.applications import InceptionV3
from tensorflow.python.keras._impl.keras.applications.densenet import decode_predictions
from tensorflow.python.keras._impl.keras.applications.inception_v3 import preprocess_input

#Following the imports above, go to tensorflow/python/keras/_impl/keras/applications/inception_v3:
#Find the preprocess_input() function:

@tf_export('keras.applications.nasnet.preprocess_input',
           'keras.applications.inception_v3.preprocess_input')
def preprocess_input(x):
  """Preprocesses a numpy array encoding a batch of images.

  Arguments:
      x: a 4D numpy array consists of RGB values within [0, 255].

  Returns:
      Preprocessed array.
  """
  return imagenet_utils.preprocess_input(x, mode='tf')
  
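#As you can see, two arguments are passed on to imagenet_utils.preprocess_input here: the array x to be processed, and mode, which is fixed to 'tf'
#Then follow the call into imagenet_utils and find the corresponding function: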

def _preprocess_numpy_input(x, data_format, mode):
  """Preprocesses a Numpy array encoding a batch of images.

  Arguments:
      x: Input array, 3D or 4D.
      data_format: Data format of the image array.
      mode: One of "caffe", "tf" or "torch".
          - caffe: will convert the images from RGB to BGR,
              then will zero-center each color channel with
              respect to the ImageNet dataset,
              without scaling.
          - tf: will scale pixels between -1 and 1,
              sample-wise.
          - torch: will scale pixels between 0 and 1 and then
              will normalize each channel with respect to the
              ImageNet dataset.

  Returns:
      Preprocessed Numpy array.
  """
  if mode == 'tf':
    x /= 127.5
    x -= 1.
    return x

  if mode == 'torch':
    x /= 255.
    mean = [0.485, 0.456, 0.406]
    std = [0.229, 0.224, 0.225]
  else:
    if data_format == 'channels_first':
      # 'RGB'->'BGR'
      if x.ndim == 3:
        x = x[::-1, ...]
      else:
        x = x[:, ::-1, ...]
    else:
      # 'RGB'->'BGR'
      x = x[..., ::-1]
    mean = [103.939, 116.779, 123.68]
    std = None

  # Zero-center by mean pixel
  if data_format == 'channels_first':
    if x.ndim == 3:
      x[0, :, :] -= mean[0]
      x[1, :, :] -= mean[1]
      x[2, :, :] -= mean[2]
      if std is not None:
        x[0, :, :] /= std[0]
        x[1, :, :] /= std[1]
        x[2, :, :] /= std[2]
    else:
      x[:, 0, :, :] -= mean[0]
      x[:, 1, :, :] -= mean[1]
      x[:, 2, :, :] -= mean[2]
      if std is not None:
        x[:, 0, :, :] /= std[0]
        x[:, 1, :, :] /= std[1]
        x[:, 2, :, :] /= std[2]
  else:
    x[..., 0] -= mean[0]
    x[..., 1] -= mean[1]
    x[..., 2] -= mean[2]
    if std is not None:
      x[..., 0] /= std[0]
      x[..., 1] /= std[1]
      x[..., 2] /= std[2]
  return x
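
To make the three modes easier to compare, the channels_last path of the listing above can be condensed with NumPy broadcasting. This is only a sketch, not the library's code: preprocess_channels_last is a hypothetical helper name, it assumes the last axis holds RGB, and unlike the listing it returns a new float32 array instead of modifying its input.

```python
import numpy as np

# ImageNet statistics copied from the listing above.
CAFFE_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)
TORCH_MEAN_RGB = np.array([0.485, 0.456, 0.406], dtype=np.float32)
TORCH_STD_RGB = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess_channels_last(x, mode):
    """Hypothetical helper: channels_last only, returns a new float32 array."""
    x = np.array(x, dtype=np.float32)  # copy, so the caller's array is untouched
    if mode == "tf":
        return x / 127.5 - 1.0  # scale to [-1, 1]
    if mode == "torch":
        return (x / 255.0 - TORCH_MEAN_RGB) / TORCH_STD_RGB  # scale, then normalize
    if mode == "caffe":
        return x[..., ::-1] - CAFFE_MEAN_BGR  # RGB -> BGR, then zero-center
    raise ValueError("unknown mode: %r" % (mode,))


batch = np.full((1, 2, 2, 3), 255, dtype=np.uint8)  # one all-white 2x2 RGB image
tf_out = preprocess_channels_last(batch, "tf")
caffe_out = preprocess_channels_last(batch, "caffe")
torch_out = preprocess_channels_last(batch, "torch")
```

Broadcasting over the last axis replaces the three per-channel assignments in the listing, and copying up front avoids changing the caller's array.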
  
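This function checks the data format and the mode that were passed in and applies the matching processing. The 'tf' mode passed in by the InceptionV3 wrapper is handled by the "if mode == 'tf':" branch at the top of the function:
the input image array, with values in 0-255, is first divided by 127.5 and then has 1 subtracted, so the resulting values lie in the range [-1, 1] (0 maps to -1 and 255 maps to 1).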
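
A small NumPy sketch of the 'tf' branch may help show two of the surprises: the array is modified in place, and getting the original pixel values back requires the reverse arithmetic. The function tf_deprocess is a hypothetical helper written for this illustration, not part of Keras. The integer-input check reflects the listing shown here only; other library versions may convert the input before dividing.

```python
import numpy as np


def tf_preprocess_inplace(x):
    """Same arithmetic as the mode == 'tf' branch: modifies x in place."""
    x /= 127.5
    x -= 1.0
    return x


def tf_deprocess(y):
    """Hypothetical inverse: maps [-1, 1] back to [0, 255] as a new array."""
    return (np.asarray(y, dtype=np.float64) + 1.0) * 127.5


pixels = np.array([0.0, 127.5, 255.0])
original = pixels.copy()

scaled = tf_preprocess_inplace(pixels)
# scaled is the same object as pixels, so the caller's array has been overwritten.
restored = tf_deprocess(scaled)

# In-place true division of an integer array by a float is rejected by
# NumPy's default casting rule, so this arithmetic needs float input.
try:
    tf_preprocess_inplace(np.array([0, 255], dtype=np.uint8))
    int_input_ok = True
except TypeError:
    int_input_ok = False
```

Passing a copy (for example x.astype('float32')) into the preprocessing call is one way to keep the original image array intact.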