1. Min-Max Normalization
Also called deviation standardization, this method linearly rescales the data to the range [0, 1]. Formula: x* = (x - min(x)) / (max(x) - min(x))
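Code:
import numpy as np

A = np.array([2, 7, 36, 89, 169, 235, 1021])
print('A before scaling:\n', A)

# Vectorized form: the scalar min and max are broadcast over the whole array
gui = (A - np.min(A)) / (np.max(A) - np.min(A))
print('A after scaling:\n', gui)

def normalization(arr):
    # Rescale to [0, 1]; min and max are computed once instead of once per element
    arr = np.asarray(arr)
    lo, hi = np.min(arr), np.max(arr)
    # tolist() returns plain Python floats, so the list prints the same way
    # regardless of NumPy version
    return ((arr - lo) / (hi - lo)).tolist()

gui_2 = normalization(arr=A)
print('Computed by function:\n', gui_2)
Output:
A before scaling:
 [   2    7   36   89  169  235 1021]
A after scaling:
 [0.         0.00490677 0.03336605 0.08537782 0.16388616 0.22865554 1.        ]
Computed by function:
 [0.0, 0.004906771344455349, 0.033366045142296366, 0.08537782139352307, 0.16388616290480865, 0.22865554465161925, 1.0]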
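In practice the data is often a 2-D table in which each column is a separate feature, and a column may be constant, which makes max(x) - min(x) zero. The following is a minimal sketch of column-wise min-max scaling under those assumptions. The helper name min_max_columns and the choice to map a constant column to zeros are illustrative assumptions, not part of the original example.

```python
import numpy as np

def min_max_columns(X):
    """Hypothetical helper: min-max scale each column of a 2-D array to [0, 1]."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)                # per-column minimum
    col_range = X.max(axis=0) - col_min    # per-column max - min
    # Assumption: a constant column (range 0) is mapped to all zeros
    # rather than dividing by zero.
    safe_range = np.where(col_range == 0, 1.0, col_range)
    return (X - col_min) / safe_range

X = np.array([[2.0, 10.0, 5.0],
              [7.0, 20.0, 5.0],
              [1021.0, 40.0, 5.0]])
out = min_max_columns(X)
print(out)
```

Each column is scaled independently, so features with very different units end up on the same [0, 1] scale.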
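To bring the data into [-1, 1], one option is mean normalization: x* = (x - mean(x)) / (max(x) - min(x)). The result has mean 0 and always lies within [-1, 1], though it generally does not reach both endpoints.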
Code:
def normalization_2(arr):
    # Mean normalization: center on the mean, then divide by the range
    arr = np.asarray(arr)
    return ((arr - np.mean(arr)) / (np.max(arr) - np.min(arr))).tolist()

gui_21 = normalization_2(arr=A)
print('A after mean normalization:\n', gui_21)
Output:
A after mean normalization:
 [-0.21659890649095753, -0.21169213514650218, -0.18323286134866115, -0.13122108509743446, -0.05271274358614889, 0.012056638160661706, 0.7834010935090424]
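In the output above the values run from about -0.217 to 0.783, so the mean-centered formula does not span the whole interval. If the minimum must land exactly on -1 and the maximum exactly on 1, one common alternative is x* = 2 * (x - min(x)) / (max(x) - min(x)) - 1. A minimal sketch follows; the helper name scale_to_pm1 is a hypothetical name chosen for illustration.

```python
import numpy as np

def scale_to_pm1(arr):
    """Hypothetical helper: map min(arr) to -1 and max(arr) to 1 linearly."""
    arr = np.asarray(arr, dtype=float)
    lo, hi = arr.min(), arr.max()
    # Assumes hi > lo; a constant array would divide by zero here.
    return 2.0 * (arr - lo) / (hi - lo) - 1.0

A = np.array([2, 7, 36, 89, 169, 235, 1021])
scaled = scale_to_pm1(A)
print(scaled.min(), scaled.max())  # -1.0 1.0
```

Unlike mean normalization, this version does not give the result a mean of 0; it only fixes the two endpoints.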
2. Z-score Standardization
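Formula: x* = (x - mean(x)) / std(x). The scaled data has mean 0 and standard deviation 1 (and therefore variance 1). Standardization only shifts and rescales the data; it does not change the shape of the distribution, so the result follows a standard normal distribution only if the original data was normally distributed. Note: this method does not map the values into [0, 1].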
Code:
import numpy as np

A = [1, 2, 3, 4]

def z_score_std(arr):
    # np.std defaults to the population standard deviation (ddof=0)
    arr = np.asarray(arr)
    return ((arr - np.mean(arr)) / np.std(arr)).tolist()

New = z_score_std(arr=A)
print('A before scaling:', A)
print('A after scaling:', New)
print('Mean of scaled A:', np.mean(New), ', std of scaled A:', np.std(New))
Output:
A before scaling: [1, 2, 3, 4]
A after scaling: [-1.3416407864998738, -0.4472135954999579, 0.4472135954999579, 1.3416407864998738]
Mean of scaled A: 0.0 , std of scaled A: 1.0
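One detail worth making explicit is which standard deviation is used. np.std computes the population standard deviation (ddof=0) by default, while some other tools default to the sample standard deviation (ddof=1), and the two give slightly different z-scores on small samples. The sketch below exposes that choice as a parameter. The helper name z_score and the decision to raise an error for zero spread are assumptions made for illustration.

```python
import numpy as np

def z_score(arr, ddof=0):
    """Hypothetical helper: standardize arr using the chosen ddof for the std."""
    arr = np.asarray(arr, dtype=float)
    std = arr.std(ddof=ddof)
    if std == 0:
        # All values identical: the z-score is undefined.
        raise ValueError("standard deviation is zero; z-score is undefined")
    return (arr - arr.mean()) / std

A = [1, 2, 3, 4]
z_pop = z_score(A)            # population std (ddof=0), NumPy's default
z_samp = z_score(A, ddof=1)   # sample std (ddof=1)
print(z_pop)
print(z_samp)
```

With ddof=1 the divisor is larger, so the sample-based z-scores are slightly smaller in magnitude than the population-based ones.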
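Note that these methods suit different situations. Min-max scaling guarantees a fixed output range, but the result is determined entirely by the minimum and maximum, so it is sensitive to outliers. Z-score standardization gives mean 0 and standard deviation 1 but no fixed output range.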
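To make the contrast concrete, the sketch below applies both methods to the same array used in the min-max section, which contains one value (1021) far larger than the rest. It is only an illustration on that one array, not a general benchmark.

```python
import numpy as np

A = np.array([2, 7, 36, 89, 169, 235, 1021], dtype=float)

min_max = (A - A.min()) / (A.max() - A.min())   # bounded in [0, 1]
z = (A - A.mean()) / A.std()                    # mean 0, std 1, no fixed bounds

# Print the original value next to both scaled versions
for x, m, s in zip(A, min_max, z):
    print(f"{x:7.0f}  min-max={m:.3f}  z={s:+.3f}")
```

In the min-max column the single large value takes 1.0 and pushes every other value toward 0, while in the z-score column the same value simply appears as a large positive score.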