07 - Data Normalization

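Solution: map all of the data onto the same scale, so that no feature dominates simply because its numeric range is larger.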

Min-max normalization: map all of the data into the range 0 to 1.
X_scale = (X - X_min) / (X_max - X_min)

This suits data with clear bounds (for example, student scores, where the minimum is 0 and the maximum is 100). It is strongly affected by outliers.
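
To make the outlier sensitivity concrete, here is a minimal sketch. The salary figures and the helper name `min_max_scale` are made up for illustration and are not part of the original notebook.

```python
import numpy as np

def min_max_scale(values):
    """Map a 1-D array onto [0, 1] using its own min and max (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    return (values - values.min()) / (values.max() - values.min())

# Bounded data: exam scores spread evenly over the output range.
scores = np.array([0, 25, 50, 75, 100])
scaled_scores = min_max_scale(scores)

# Illustrative data with one outlier: the single large value squeezes
# every other point into a narrow band near 0.
salaries = np.array([3000, 4000, 5000, 6000, 1_000_000])
scaled_salaries = min_max_scale(salaries)

print(scaled_scores)
print(scaled_salaries)
```

In the second array the four ordinary salaries all land below 0.01, so the differences between them are nearly lost after scaling.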

### Data normalization

import numpy as np
import matplotlib.pyplot as plt

### Min-max normalization (Normalization)

x = np.random.randint(0,100,size=100)

x
Output:
array([ 8, 57, 61,  4, 46, 86, 41, 68, 51,  0, 49, 91, 15, 74, 96, 50, 70,
       25, 24, 93,  1, 73, 23, 92, 92, 71, 88,  3, 81, 83, 10, 83, 31, 38,
        1, 80, 43, 10, 45, 28, 49, 15, 98,  0,  8,  1, 26, 57, 42, 43, 81,
       81, 97, 21, 54, 61, 30, 87, 69, 30, 55, 82, 52, 67, 33, 14, 61, 89,
       87, 40, 51,  7, 26, 87, 26, 36, 20, 29, 84, 98, 17, 50, 75, 11, 61,
       70, 24, 91, 30,  3, 47,  9, 29, 80, 88, 18, 22, 97, 33, 13])
(x - np.min(x))/(np.max(x) - np.min(x))
Output:
array([0.08163265, 0.58163265, 0.62244898, 0.04081633, 0.46938776,
       0.87755102, 0.41836735, 0.69387755, 0.52040816, 0.        ,
       0.5       , 0.92857143, 0.15306122, 0.75510204, 0.97959184,
       0.51020408, 0.71428571, 0.25510204, 0.24489796, 0.94897959,
       0.01020408, 0.74489796, 0.23469388, 0.93877551, 0.93877551,
       0.7244898 , 0.89795918, 0.03061224, 0.82653061, 0.84693878,
       0.10204082, 0.84693878, 0.31632653, 0.3877551 , 0.01020408,
       0.81632653, 0.43877551, 0.10204082, 0.45918367, 0.28571429,
       0.5       , 0.15306122, 1.        , 0.        , 0.08163265,
       0.01020408, 0.26530612, 0.58163265, 0.42857143, 0.43877551,
       0.82653061, 0.82653061, 0.98979592, 0.21428571, 0.55102041,
       0.62244898, 0.30612245, 0.8877551 , 0.70408163, 0.30612245,
       0.56122449, 0.83673469, 0.53061224, 0.68367347, 0.33673469,
       0.14285714, 0.62244898, 0.90816327, 0.8877551 , 0.40816327,
       0.52040816, 0.07142857, 0.26530612, 0.8877551 , 0.26530612,
       0.36734694, 0.20408163, 0.29591837, 0.85714286, 1.        ,
       0.17346939, 0.51020408, 0.76530612, 0.1122449 , 0.62244898,
       0.71428571, 0.24489796, 0.92857143, 0.30612245, 0.03061224,
       0.47959184, 0.09183673, 0.29591837, 0.81632653, 0.89795918,
       0.18367347, 0.2244898 , 0.98979592, 0.33673469, 0.13265306])
X = np.random.randint(0,100,(50,2))

X[:10,:]
Output:
array([[48, 55],
       [44, 98],
       [51, 35],
       [57, 21],
       [61, 43],
       [46, 54],
       [99, 65],
       [96, 24],
       [82, 57],
       [86, 94]])

X = np.array(X,dtype=float)

X[:10,:]
Output:
array([[48., 55.],
       [44., 98.],
       [51., 35.],
       [57., 21.],
       [61., 43.],
       [46., 54.],
       [99., 65.],
       [96., 24.],
       [82., 57.],
       [86., 94.]])

X[:,0] = (X[:,0] - np.min(X[:,0])) / (np.max(X[:,0]) - np.min(X[:,0]))

X[:,1] = (X[:,1] - np.min(X[:,1])) / (np.max(X[:,1]) - np.min(X[:,1]))

X[:10,:]
Output:
array([[0.46875   , 0.55555556],
       [0.42708333, 0.98989899],
       [0.5       , 0.35353535],
       [0.5625    , 0.21212121],
       [0.60416667, 0.43434343],
       [0.44791667, 0.54545455],
       [1.        , 0.65656566],
       [0.96875   , 0.24242424],
       [0.82291667, 0.57575758],
       [0.86458333, 0.94949495]])
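
The two per-column assignments above can also be written as one expression using NumPy broadcasting, which works for any number of columns. This is a sketch, not the original author's code; the fixed seed and the name `X_demo` are assumptions added so the example is reproducible.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, an assumption for reproducibility
X_demo = rng.integers(0, 100, size=(50, 2)).astype(float)

# axis=0 computes one min and one max per column; broadcasting then
# applies each column's own range to every row.
col_min = X_demo.min(axis=0)
col_max = X_demo.max(axis=0)
X_demo_scaled = (X_demo - col_min) / (col_max - col_min)

print(X_demo_scaled.min(axis=0))  # each column's minimum after scaling
print(X_demo_scaled.max(axis=0))  # each column's maximum after scaling
```

One caveat: a column whose values are all identical has `col_max - col_min == 0`, so this sketch would divide by zero for such a column.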


plt.scatter(X[:,0],X[:,1])
plt.show()

Resulting plot: a scatter of the two normalized columns, with both axes spanning 0 to 1.

np.mean(X[:,0])  # mean of column 0 of X
Output: 0.5339583333333333

np.std(X[:,0])  # standard deviation of column 0 of X
Output: 0.29990457104904406

np.mean(X[:,1])  # mean of column 1 of X
Output: 0.5018181818181818

np.std(X[:,1])  # standard deviation of column 1 of X
Output: 0.2829035974757983

 

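Standardization (mean-variance normalization): rescale all of the data to a distribution with mean 0 and variance 1.
This suits data with no clear bounds, or data that may contain extreme values.
X_scale = (X - X_mean) / S, where S is the standard deviation of X.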
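
As a small worked example of this formula, the sketch below standardizes a hand-picked array whose mean is 5 and whose standard deviation is 2. The data is illustrative and not from the original notebook. `np.std` defaults to the population standard deviation (`ddof=0`), which matches the calls used in the rest of this post.

```python
import numpy as np

# Illustrative values: mean = 40 / 8 = 5, variance = 32 / 8 = 4, so S = 2.
values = np.array([2, 4, 4, 4, 5, 5, 7, 9], dtype=float)

mean = np.mean(values)
std = np.std(values)  # population standard deviation (ddof=0)

# Apply X_scale = (X - X_mean) / S element-wise.
standardized = (values - mean) / std

print(mean, std)      # 5.0 2.0
print(standardized)   # values: -1.5, -0.5, -0.5, -0.5, 0, 0, 1, 2
```

Unlike min-max normalization, the result is not confined to the range 0 to 1; each value is expressed as a number of standard deviations from the mean.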

### Mean-variance normalization (Standardization)

X2 = np.random.randint(0,100,(50,2))
X2 = np.array(X2,dtype=float)

X2[:,0] = (X2[:,0] - np.mean(X2[:,0])) / np.std(X2[:,0])

X2[:,1] = (X2[:,1] - np.mean(X2[:,1])) / np.std(X2[:,1])

plt.scatter(X2[:,0],X2[:,1])
plt.show()

Resulting plot: a scatter of the two standardized columns, centered on 0 on both axes.

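np.mean(X2[:,0])  # effectively 0, up to floating-point rounding
Output: -8.43769498715119e-17
np.std(X2[:,0])
Output: 1.0
np.mean(X2[:,1])  # effectively 0, up to floating-point rounding
Output: -3.9412917374193055e-17
np.std(X2[:,1])
Output: 1.0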
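
The column-by-column steps above can be wrapped in a small helper that standardizes every column at once. This is a sketch under stated assumptions: the helper name `standardize`, the fixed seed, and the name `X2_demo` are all hypothetical additions, not part of the original notebook.

```python
import numpy as np

def standardize(X):
    """Column-wise standardization: (X - mean) / std (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)

rng = np.random.default_rng(0)  # fixed seed, an assumption for reproducibility
X2_demo = standardize(rng.integers(0, 100, size=(50, 2)))

# Means such as -8.4e-17 are floating-point rounding noise, so compare
# against 0 and 1 with a tolerance instead of testing exact equality.
means_ok = np.allclose(X2_demo.mean(axis=0), 0.0)
stds_ok = np.allclose(X2_demo.std(axis=0), 1.0)

print(means_ok, stds_ok)
```

Checking with `np.allclose` avoids having to eyeball very small numbers in the printed means.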