Standardization
In R, the scale() function divides by the sample standard deviation by default
1. In Python
from sklearn import preprocessing
import numpy as np
x = np.array([1, 2, 3, 6, 3])
# Variance (np.var defaults to ddof=0, i.e. the population variance)
x_var = np.var(x)
#2.8
# Population standard deviation (ddof=0)
arr_std_1 = np.std(x)
#1.6733200530681511
# Sample standard deviation (ddof=1)
arr_std_2 = np.std(x, ddof=1)
#1.8708286933869707
# preprocessing.scale divides by the population standard deviation
x_scaled = preprocessing.scale(x)
#array([-1.19522861, -0.5976143 , 0. , 1.79284291, 0. ])
# Check the first element by hand: (x[0] - mean) / population std
(1-3)/arr_std_1
#-1.1952286093343936
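To get the same numbers as R's scale() from Python, one option is to standardize by hand with ddof=1. A minimal sketch using only NumPy; the names z_sample and z_pop are illustrative and not part of the original notes:

```python
import numpy as np

x = np.array([1, 2, 3, 6, 3], dtype=float)
n = x.size

# Standardize with the sample standard deviation (ddof=1), as R's scale() does
z_sample = (x - x.mean()) / x.std(ddof=1)

# Standardize with the population standard deviation (ddof=0),
# which is what the preprocessing.scale output above corresponds to
z_pop = (x - x.mean()) / x.std(ddof=0)

# The two results differ only by the constant factor sqrt((n - 1) / n)
print(z_sample)
print(np.allclose(z_sample, z_pop * np.sqrt((n - 1) / n)))  # True
```

Because the two versions differ only by a constant factor, the choice rarely matters for large n, but it is visible on a five-element vector like this one.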
2. In R
d <- c(1, 2, 3, 6, 3)
scale(d,center = TRUE, scale = TRUE)
# [,1]
# [1,] -1.0690450
# [2,] -0.5345225
# [3,] 0.0000000
# [4,] 1.6035675
# [5,] 0.0000000
# attr(,"scaled:center")
# [1] 3
# attr(,"scaled:scale")
# [1] 1.870829
scale(d,center = FALSE, scale = TRUE)
# [,1]
# [1,] 0.2603778
# [2,] 0.5207556
# [3,] 0.7811335
# [4,] 1.5622669
# [5,] 0.7811335
# attr(,"scaled:scale")
# [1] 3.840573
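# With the default center = TRUE, scale() divides by the sample standard deviation;
# with center = FALSE it divides by the root mean square sqrt(sum(x^2) / (n - 1)) instead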
(1-3)/1.8708286933869707
#-1.069045
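The center = FALSE result above can be reproduced in Python as well. A small sketch, assuming (as R's documentation for scale describes) that the divisor in this case is the root mean square sqrt(sum(x^2) / (n - 1)) rather than a standard deviation; the names rms and scaled_no_center are illustrative:

```python
import numpy as np

d = np.array([1, 2, 3, 6, 3], dtype=float)
n = d.size

# Root mean square with an n - 1 denominator; the data are not centered first
rms = np.sqrt(np.sum(d ** 2) / (n - 1))

# Divide the raw values by that root mean square
scaled_no_center = d / rms

print(rms)  # approximately 3.840573
print(scaled_no_center)
```

Here the divisor is 3.840573 rather than 1.870829, since the squares are taken around zero and not around the mean.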