Min-max normalization
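from sklearn import preprocessing
import numpy as np

# Training data: 3 samples (rows) x 3 features (columns)
X_train = np.array([[1., -1., 2.],
                    [2., 0., 0.],
                    [0., 1., -1.]])
# Rescale each feature to the target range [0, 1], passed as a tuple
min_max_scaler = preprocessing.MinMaxScaler(feature_range=(0, 1))
X_train_minmax = min_max_scaler.fit_transform(X_train)
# scale_ is the per-feature factor learned from the training data,
# here 1 / (max - min) for each column
print(min_max_scaler.scale_)
print()
print(X_train_minmax)
# Apply the same transformation to new data; values outside the
# training range end up outside [0, 1]
X_test = np.array([[-2., -1., 4.]])
X_test_minmax = min_max_scaler.transform(X_test)
print()
print(X_test_minmax)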
Output:
[ 0.5 0.5 0.33333333]
[[ 0.5 0. 1. ]
[ 1. 0.5 0.33333333]
[ 0. 1. 0. ]]
[[-1. 0. 1.66666667]]
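The numbers above follow from the min-max formula x' = (x - min) / (max - min), applied column by column with the minimum and maximum taken from the training data. The following is a minimal NumPy sketch of that calculation, not scikit-learn's actual implementation; the helper name `min_max_manual` and its `lo`/`hi` parameters are made up for illustration.

```python
import numpy as np

X_train = np.array([[1., -1., 2.],
                    [2., 0., 0.],
                    [0., 1., -1.]])
X_test = np.array([[-2., -1., 4.]])

# Per-feature (column) minimum and maximum, taken from the training data only
col_min = X_train.min(axis=0)
col_max = X_train.max(axis=0)


def min_max_manual(X, lo=0.0, hi=1.0):
    """Hypothetical helper: rescale X column-wise using the training min/max."""
    return (X - col_min) / (col_max - col_min) * (hi - lo) + lo


train_manual = min_max_manual(X_train)
test_manual = min_max_manual(X_test)
print(train_manual)
print(test_manual)
```

Because the test row is scaled with the training minimum and maximum, its values can fall outside [0, 1], which is why -1 and 1.67 appear in the output above.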
Z-score normalization (zero-mean normalization)
from sklearn import preprocessing
import numpy as np
X = np.array([[1., -1., 2.],
              [2., 0., 0.],
              [0., 1., -1.]])
X_scaled = preprocessing.scale(X)
# X_scaled is now:
# array([[ 0.  ..., -1.22...,  1.33...],
#        [ 1.22...,  0.  ..., -0.26...],
#        [-1.22...,  1.22..., -1.06...]])
# The scaled data has zero mean and unit variance for each feature.
mean = X_scaled.mean(axis=0)  # axis=0: per feature (column); axis=1: per sample (row)
print(mean)
print()
# array([ 0., 0., 0.])
std=X_scaled.std(axis=0)
print(std)
# array([ 1., 1., 1.])
scaler = preprocessing.StandardScaler().fit(X)
print("scaler.mean_:",scaler.mean_)
# array([ 1. ..., 0. ..., 0.33...])
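# scale_ holds the per-feature standard deviation; it replaces the old
# std_ attribute, which recent scikit-learn versions no longer provide
print("scaler.scale_:", scaler.scale_)
# array([ 0.81..., 0.81..., 1.24...])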
train=scaler.transform(X)
print("train:",train)
# array([[ 0. ..., -1.22..., 1.33...],
# [ 1.22..., 0. ..., -0.26...],
# [-1.22..., 1.22..., -1.06...]])
# The scaler instance can then be used on new data to transform it the same way it did on the training set:
test=scaler.transform([[-1., 1., 0.]])
print("test:",test)
# array([[-2.44..., 1.22..., -0.26...]])
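The scaler output above can be reproduced by hand with z = (x - mean) / std, where the mean and standard deviation are computed per feature on the training data. This is a minimal NumPy sketch under that assumption; the helper name `z_score_manual` is made up for illustration.

```python
import numpy as np

X = np.array([[1., -1., 2.],
              [2., 0., 0.],
              [0., 1., -1.]])

# Per-feature statistics from the training data.
# np.std defaults to the population standard deviation (ddof=0),
# which matches the values StandardScaler reports.
mu = X.mean(axis=0)
sigma = X.std(axis=0)


def z_score_manual(A):
    """Hypothetical helper: standardize A with the training mean and std."""
    return (np.asarray(A, dtype=float) - mu) / sigma


train_manual = z_score_manual(X)
test_manual = z_score_manual([[-1., 1., 0.]])
print(train_manual)
print(test_manual)
```

Using the sample standard deviation instead (ddof=1) would give slightly different values, so that choice matters when comparing against scikit-learn.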
Normalization by decimal scaling