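preprocess_input() is a normalization-like function that ships with Keras in TensorFlow.
It does some unexpected things to the images passed in. Although it has advantages such as speeding up image processing, it is still inconvenient to use: converting back and forth is not general, and although a matching conversion function is provided, it never feels entirely reliable.
The source code of this function is analyzed below.
The path to the source can be found as follows: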
1. # In cmd on Windows, or in a shell on Linux, enter:
python
from tensorflow import keras
print(keras.__path__)
2. # Locate the keras path, then follow the imports further
# Taking InceptionV3 as an example:
from keras.applications.inception_v3 import preprocess_input
# Find inception_v3 under applications in the keras directory:
# Open its __init__ file:
from tensorflow.python.keras._impl.keras.applications import InceptionV3
from tensorflow.python.keras._impl.keras.applications.densenet import decode_predictions
from tensorflow.python.keras._impl.keras.applications.inception_v3 import preprocess_input
# Following the imports above, find inception_v3 under tensorflow/python/keras/_impl/keras/applications:
# Locate the preprocess_input() function:
@tf_export('keras.applications.nasnet.preprocess_input',
           'keras.applications.inception_v3.preprocess_input')
def preprocess_input(x):
    """Preprocesses a numpy array encoding a batch of images.

    Arguments:
        x: a 4D numpy array consists of RGB values within [0, 255].

    Returns:
        Preprocessed array.
    """
    return imagenet_utils.preprocess_input(x, mode='tf')
# Two arguments are passed in this call: the array x to be processed, and mode, which is set to 'tf' here
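The manual walk through the import chain above can also be shortened: the standard-library `inspect` module reports the file and line where a Python function is defined. The sketch below is a minimal helper (the name `locate_source` is hypothetical, not part of Keras). It is demonstrated on `json.dumps` so that it runs without TensorFlow; passing `preprocess_input` instead should work the same way, assuming the function is implemented in Python rather than in a compiled extension.

```python
import inspect
import json


def locate_source(obj):
    """Return (file path, first line number) of a Python-defined object."""
    # unwrap() follows decorator wrappers that set __wrapped__.
    target = inspect.unwrap(obj)
    path = inspect.getsourcefile(target)
    _, first_line = inspect.getsourcelines(target)
    return path, first_line


# Demonstrated on a standard-library function so the snippet runs anywhere.
path, first_line = locate_source(json.dumps)
print(path, first_line)

# With TensorFlow installed, the equivalent call would be (assumption:
# the import path matches the one used earlier in this post):
# from keras.applications.inception_v3 import preprocess_input
# print(locate_source(preprocess_input))
```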
# Next, follow the call into imagenet_utils and find the corresponding function:
def _preprocess_numpy_input(x, data_format, mode):
    """Preprocesses a Numpy array encoding a batch of images.

    Arguments:
        x: Input array, 3D or 4D.
        data_format: Data format of the image array.
        mode: One of "caffe", "tf" or "torch".
            - caffe: will convert the images from RGB to BGR,
                then will zero-center each color channel with
                respect to the ImageNet dataset,
                without scaling.
            - tf: will scale pixels between -1 and 1,
                sample-wise.
            - torch: will scale pixels between 0 and 1 and then
                will normalize each channel with respect to the
                ImageNet dataset.

    Returns:
        Preprocessed Numpy array.
    """
    if mode == 'tf':
        x /= 127.5
        x -= 1.
        return x

    if mode == 'torch':
        x /= 255.
        mean = [0.485, 0.456, 0.406]
        std = [0.229, 0.224, 0.225]
    else:
        if data_format == 'channels_first':
            # 'RGB'->'BGR'
            if x.ndim == 3:
                x = x[::-1, ...]
            else:
                x = x[:, ::-1, ...]
        else:
            # 'RGB'->'BGR'
            x = x[..., ::-1]
        mean = [103.939, 116.779, 123.68]
        std = None

    # Zero-center by mean pixel
    if data_format == 'channels_first':
        if x.ndim == 3:
            x[0, :, :] -= mean[0]
            x[1, :, :] -= mean[1]
            x[2, :, :] -= mean[2]
            if std is not None:
                x[0, :, :] /= std[0]
                x[1, :, :] /= std[1]
                x[2, :, :] /= std[2]
        else:
            x[:, 0, :, :] -= mean[0]
            x[:, 1, :, :] -= mean[1]
            x[:, 2, :, :] -= mean[2]
            if std is not None:
                x[:, 0, :, :] /= std[0]
                x[:, 1, :, :] /= std[1]
                x[:, 2, :, :] /= std[2]
    else:
        x[..., 0] -= mean[0]
        x[..., 1] -= mean[1]
        x[..., 2] -= mean[2]
        if std is not None:
            x[..., 0] /= std[0]
            x[..., 1] /= std[1]
            x[..., 2] /= std[2]
    return x
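For the common channels_last layout, the 'caffe' and 'torch' branches above reduce to a channel reversal plus a broadcast subtraction or division. The following is a minimal NumPy sketch under that assumption. The names `caffe_preprocess` and `torch_preprocess` are hypothetical, and unlike the source above the sketch returns a new array instead of modifying its input in place.

```python
import numpy as np

# ImageNet statistics copied from the source listing above.
CAFFE_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)
TORCH_MEAN_RGB = np.array([0.485, 0.456, 0.406], dtype=np.float32)
TORCH_STD_RGB = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def caffe_preprocess(x):
    """RGB -> BGR, then subtract the per-channel mean (channels_last only)."""
    x = np.asarray(x, dtype=np.float32)
    bgr = x[..., ::-1]           # reverse the last (channel) axis
    return bgr - CAFFE_MEAN_BGR  # broadcasting applies one mean per channel


def torch_preprocess(x):
    """Scale to [0, 1], then normalize each RGB channel (channels_last only)."""
    x = np.asarray(x, dtype=np.float32)
    return (x / 255.0 - TORCH_MEAN_RGB) / TORCH_STD_RGB


# One 1x1 pure-red "image" as a batch of shape (1, 1, 1, 3).
img = np.array([[[[255, 0, 0]]]], dtype=np.uint8)
print(caffe_preprocess(img))
print(torch_preprocess(img))
```

Because the means and standard deviations are stored as length-3 arrays, broadcasting over the last axis replaces the three per-channel statements of the original.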
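_preprocess_numpy_input checks the data format and mode of its input and processes the array accordingly. The mode passed in here, 'tf', takes the first branch of the function:
each value of the input image array (0 to 255) is first divided by 127.5 and then reduced by 1, so the resulting values lie in the range [-1, 1].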
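Because the 'tf' mode is a plain affine map, it is easy to reproduce and to invert by hand, which helps when a preprocessed array has to be turned back into a displayable image. The sketch below assumes NumPy; `tf_preprocess` and `tf_deprocess` are hypothetical names, not Keras functions. It casts to float on a copy first, since an in-place `x /= 127.5` raises a TypeError on a uint8 NumPy array and overwrites a float array passed by the caller.

```python
import numpy as np


def tf_preprocess(x):
    """Map pixel values from [0, 255] to [-1, 1] without modifying x."""
    x = np.array(x, dtype=np.float32)  # np.array copies by default
    x /= 127.5
    x -= 1.0
    return x


def tf_deprocess(y):
    """Inverse of tf_preprocess: map [-1, 1] back to uint8 in [0, 255]."""
    x = (np.asarray(y, dtype=np.float32) + 1.0) * 127.5
    # Round before casting so small float errors do not shift a pixel value.
    return np.clip(np.rint(x), 0, 255).astype(np.uint8)


pixels = np.array([0, 127, 128, 255], dtype=np.uint8)
scaled = tf_preprocess(pixels)
print(scaled.min(), scaled.max())
restored = tf_deprocess(scaled)
print(np.array_equal(restored, pixels))
```

Rounding and clipping in `tf_deprocess` is a design choice for display purposes; if the exact float values are needed, the first line of that function alone gives the inverse.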