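Batch normalization (batchnorm) is a technique that makes neural network training faster and more stable. During training it computes the mean and variance of each batch and normalizes the activations to zero mean and unit variance, then applies a learned scale (gamma) and shift (beta). At inference it uses the moving averages of the mean and variance accumulated during training instead of the batch statistics.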
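To make the training-time computation concrete, here is a minimal NumPy sketch. The function name batchnorm_train and the toy shapes are illustrative assumptions, not part of either framework, and the sketch leaves out the moving-average bookkeeping that real layers also perform.

```python
import numpy as np

def batchnorm_train(x, gamma, beta, eps=1e-5):
    """Normalize each feature (last axis) with the statistics of the current batch."""
    # Reduce over every axis except the last one, which holds the features.
    reduce_axes = tuple(range(x.ndim - 1))
    mean = x.mean(axis=reduce_axes, keepdims=True)
    var = x.var(axis=reduce_axes, keepdims=True)  # biased (population) variance
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(8, 4)).astype("float32")
y = batchnorm_train(x, gamma=np.ones(4, "float32"), beta=np.zeros(4, "float32"))
print(y.mean(axis=0))  # each entry is close to 0
print(y.var(axis=0))   # each entry is close to 1
```

With gamma set to 1 and beta set to 0, the output is just the normalized input; other values rescale and shift each feature.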
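Batchnorm in TensorFlow (usually the Keras definition):
keras.layers.BatchNormalization(epsilon=EPS), where epsilon is a small constant added to the variance to avoid division by zero. By default BatchNormalization normalizes over the last axis (axis=-1).
In PyTorch, use BatchNorm2d(num_features, eps), which normalizes over the axis of size num_features, i.e. axis 1 of an input shaped [N, C, H, W].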
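At inference time both layers apply the same formula and differ only in which axis holds the features. A rough NumPy sketch of that difference follows; batchnorm_inference is a hypothetical helper written for this illustration, not an API of either framework.

```python
import numpy as np

def batchnorm_inference(x, gamma, beta, mean, var, eps, axis):
    """Apply y = gamma * (x - mean) / sqrt(var + eps) + beta along `axis`."""
    # Reshape the per-feature vectors so they broadcast along the chosen axis.
    shape = [1] * x.ndim
    shape[axis] = x.shape[axis]
    gamma, beta, mean, var = (v.reshape(shape) for v in (gamma, beta, mean, var))
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

rng = np.random.default_rng(0)
x = rng.random((1, 2, 3, 4)).astype("float32")

# Keras-style: features on the last axis, so each parameter vector has 4 entries.
p_last = [rng.random(4).astype("float32") + 0.5 for _ in range(4)]
y_last = batchnorm_inference(x, *p_last, eps=1e-5, axis=-1)

# BatchNorm2d-style: features on axis 1, so each parameter vector has 2 entries.
p_ch = [rng.random(2).astype("float32") + 0.5 for _ in range(4)]
y_ch = batchnorm_inference(x, *p_ch, eps=1e-5, axis=1)

print(y_last.shape, y_ch.shape)
```

The parameter vectors have a different length in each case, which is why the layout of the input has to be taken into account before weights can be copied across.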
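1) TensorFlow to PyTorch
Suppose we have a trained TensorFlow keras.layers.BatchNormalization layer and want to convert it to the corresponding PyTorch layer.
The input has shape [1, 2, 3, 4]. The epsilon argument of BatchNormalization corresponds to the eps argument of BatchNorm2d. The defaults differ (1e-3 in Keras, 1e-5 in PyTorch), so set it explicitly on both sides; here we use 1e-5. The parameters are copied with: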
layer_th.weight.data = torch.tensor(layer_tf.gamma.numpy())
layer_th.bias.data = torch.tensor(layer_tf.beta.numpy())
layer_th.running_mean.data = torch.tensor(layer_tf.moving_mean.numpy())
layer_th.running_var.data = torch.tensor(layer_tf.moving_variance.numpy())
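This copies the parameters from the TensorFlow layer to the PyTorch layer.
Note that the PyTorch input must be transposed. If the TensorFlow input has shape [B, C, H, W], BatchNormalization normalizes over W by default, because W is the last axis. BatchNorm2d normalizes over axis 1 of the indices [0, 1, 2, 3], which is where C sits, so we transpose the input to [B, W, C, H] to move the normalized axis to position 1. After the forward pass, the output is transposed back to the original layout. The complete code: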
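The two permutations used in the conversion can be checked on their own. The sketch below, with illustrative variable names, shows that (0, 3, 1, 2) moves the last axis to position 1 and that (0, 2, 3, 1) is its inverse.

```python
import numpy as np

B, C, H, W = 1, 2, 3, 4
x = np.arange(B * C * H * W, dtype="float32").reshape(B, C, H, W)

to_torch = (0, 3, 1, 2)  # moves the last axis to position 1
# The inverse permutation is the argsort of the forward one.
back = tuple(int(i) for i in np.argsort(to_torch))

x_th = np.transpose(x, to_torch)
print(x_th.shape)  # (1, 4, 2, 3)
print(back)        # (0, 2, 3, 1)

# The round trip restores the original array.
print(np.array_equal(np.transpose(x_th, back), x))  # True
# The feature vector at one position holds the same data in both layouts.
print(np.array_equal(x[0, 1, 2, :], x_th[0, :, 1, 2]))  # True
```

Computing the inverse with argsort avoids working it out by hand when the layout has more axes.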
from torch.nn import BatchNorm2d
import torch
import tensorflow as tf
import numpy as np
from tensorflow import keras
# 1) TensorFlow to PyTorch
B,C,H,W=1,2,3,4
EPS=1e-5
inputs = np.random.rand(B,C,H,W).astype("float32")
# TensorFlow BatchNormalization layer
inputs_tf = tf.convert_to_tensor(inputs)
layer_tf = tf.keras.layers.BatchNormalization(epsilon=EPS)
outputs_tf = layer_tf(inputs_tf, training=False)
outputs_tf = outputs_tf.numpy()
inputs_th = torch.from_numpy(np.transpose(inputs, (0, 3, 1, 2)))  # transpose so the normalized axis sits at position 1, as BatchNorm2d requires
# PyTorch BatchNorm2d layer
layer_th = BatchNorm2d(num_features=W, eps=EPS)
# Copy the TensorFlow parameters to PyTorch
layer_th.weight.data = torch.tensor(layer_tf.gamma.numpy())
layer_th.bias.data = torch.tensor(layer_tf.beta.numpy())
layer_th.running_mean.data = torch.tensor(layer_tf.moving_mean.numpy())
layer_th.running_var.data = torch.tensor(layer_tf.moving_variance.numpy())
layer_th.eval()
with torch.no_grad():
    out = layer_th(inputs_th)
outputs_th = np.transpose(out.numpy(), (0, 2, 3, 1))  # transpose back to match the TensorFlow layout
print('tensorflow=>pytorch:', np.allclose(outputs_th, outputs_tf, atol=1e-5))
Output:
tensorflow=>pytorch: True
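2) PyTorch to TensorFlow
Suppose we have a trained PyTorch BatchNorm2d layer and want to convert it to the corresponding TensorFlow keras.layers.BatchNormalization layer.
The input has shape [1, 2, 3, 4]. The epsilon argument of BatchNormalization corresponds to the eps argument of BatchNorm2d; here both are set to 1e-5.
The code: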
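keras_format_weights = [
    layer_th.weight.detach().numpy(),  # gamma (detach: parameters require grad, so .numpy() alone raises an error)
    layer_th.bias.detach().numpy(),    # beta
    layer_th.running_mean.numpy(),     # moving_mean (a buffer, no grad)
    layer_th.running_var.numpy(),      # moving_variance (a buffer, no grad)
]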
collects the BatchNorm2d parameters in the order Keras expects (gamma, beta, moving_mean, moving_variance), so that the TensorFlow model can load them with set_weights:
model.layers[0].set_weights(keras_format_weights)
The complete code:
from torch.nn import BatchNorm2d
import torch
import tensorflow as tf
import numpy as np
from tensorflow import keras
B,C,H,W=1,2,3,4
EPS=1e-5
inputs = np.random.rand(B,C,H,W).astype("float32")
# 2) PyTorch to TensorFlow
# PyTorch BatchNorm2d layer
inputs_th = torch.from_numpy(np.transpose(inputs, (0, 3, 1, 2)))  # transpose so the normalized axis sits at position 1, as BatchNorm2d requires
layer_th = BatchNorm2d(num_features=W,eps=EPS)
layer_th.eval()
with torch.no_grad():
    out = layer_th(inputs_th)
outputs_th = np.transpose(out.numpy(), (0, 2, 3, 1))  # transpose back to match the TensorFlow layout
# Copy the PyTorch parameters to TensorFlow (Keras order: gamma, beta, moving_mean, moving_variance)
keras_format_weights = [
    layer_th.weight.detach().numpy(),  # detach: parameters require grad, so .numpy() alone raises an error
    layer_th.bias.detach().numpy(),
    layer_th.running_mean.numpy(),
    layer_th.running_var.numpy(),
]
# TensorFlow BatchNormalization model
model = keras.Sequential()
model.add(tf.keras.Input(shape=(C,H,W)))
model.add(tf.keras.layers.BatchNormalization(epsilon=EPS, moving_mean_initializer="random_normal"))
model.layers[0].set_weights(keras_format_weights)
inputs_tf = tf.convert_to_tensor(inputs)
outputs_tf = model.predict(inputs_tf)
print('pytorch=>tensorflow:',np.allclose(outputs_th, outputs_tf, atol=1e-5))
Output:
pytorch=>tensorflow: True
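One caveat about this check: a freshly constructed BatchNorm2d starts with weight 1, bias 0, running_mean 0 and running_var 1, so the comparison only exercises the default values. A stricter test might randomize the parameters first. The sketch below does that on the PyTorch side and, as a stand-in for the TensorFlow layer, compares against a plain NumPy reference formula; the value ranges are arbitrary assumptions.

```python
import numpy as np
import torch
from torch.nn import BatchNorm2d

torch.manual_seed(0)
EPS = 1e-5
layer_th = BatchNorm2d(num_features=4, eps=EPS)

# Overwrite the defaults so that every parameter actually affects the output.
with torch.no_grad():
    layer_th.weight.uniform_(0.5, 1.5)
    layer_th.bias.uniform_(-1.0, 1.0)
    layer_th.running_mean.uniform_(-1.0, 1.0)
    layer_th.running_var.uniform_(0.5, 1.5)
layer_th.eval()

x = torch.rand(1, 4, 2, 3)  # features on axis 1, as BatchNorm2d expects
with torch.no_grad():
    y_th = layer_th(x).numpy()

# Export the parameters and reshape them to broadcast along axis 1.
gamma, beta, mean, var = (
    t.detach().numpy().reshape(1, -1, 1, 1)
    for t in (layer_th.weight, layer_th.bias,
              layer_th.running_mean, layer_th.running_var)
)
y_ref = gamma * (x.numpy() - mean) / np.sqrt(var + EPS) + beta
print(np.allclose(y_th, y_ref, atol=1e-5))
```

The same randomized layer could then be pushed through the set_weights conversion above to compare against Keras directly.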
With this, you can convert batchnorm layers freely between TensorFlow and PyTorch.