一:
二:
大多数机器学习算法中,会选择Standardization来进行特征缩放,但是,Min-Max Scaling也并非会被弃置一地。在数字图像处理中,像素强度通常就会被量化到[0,1]区间,在一般的神经网络算法中,也会要求特征被量化[0,1]区间。
三:代码实现
# ...
def standardize(X):
"""特征标准化处理
Args:
X: 样本集
Returns:
标准后的样本集
"""
m, n = X.shape
# 归一化每一个特征
for j in range(n):
features = X[:,j]
meanVal = features.mean(axis=0)
std = features.std(axis=0)
if std != 0:
X[:, j] = (features-meanVal)/std
else
X[:, j] = 0
return X
def normalize(X):
"""Min-Max normalization sklearn.preprocess 的MaxMinScalar
Args:
X: 样本集
Returns:
归一化后的样本集
"""
m, n = X.shape
# 归一化每一个特征
for j in range(n):
features = X[:,j]
minVal = features.min(axis=0)
maxVal = features.max(axis=0)
diff = maxVal - minVal
if diff != 0:
X[:,j] = (features-minVal)/diff
else:
X[:,j] = 0
return X