Standardization in R: Code Examples
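This post works through standardization in R, comparing results computed by hand with those returned by the scale function. For a vector, the standardization formula is: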
$$x_{scaled} = \frac{x - mean(x)}{sd(x)},$$
where $mean(x)$ is the mean of $x$ and $sd(x)$ is the standard deviation of $x$. In R, sd computes the sample standard deviation, which uses an $n - 1$ denominator.
Standardizing a vector:
X <- c(1, 2, 3, 4, 5, 6)
scale(X); mean(X); sd(X)
# Results:
# [,1]
# [1,] -1.3363062
# [2,] -0.8017837
# [3,] -0.2672612
# [4,] 0.2672612
# [5,] 0.8017837
# [6,] 1.3363062
# attr(,"scaled:center")
# [1] 3.5
# attr(,"scaled:scale")
# [1] 1.870829
# [1] 3.5
# [1] 1.870829
Manual calculation:
(X - mean(X))/sd(X)
# Results:
# [1] -1.3363062 -0.8017837 -0.2672612 0.2672612 0.8017837 1.3363062
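The two results above look identical when printed, but they differ in shape: scale returns a one-column matrix with attributes, while the manual calculation returns a plain vector. A minimal sketch of a programmatic check follows; the variable names scaled_builtin and scaled_manual are introduced here for illustration only.

```r
X <- c(1, 2, 3, 4, 5, 6)

# scale() returns a one-column matrix with attributes; as.vector() drops them
scaled_builtin <- as.vector(scale(X))
scaled_manual <- (X - mean(X)) / sd(X)

# Compare with a numerical tolerance rather than exact equality
all.equal(scaled_builtin, scaled_manual)

# The standardized values should have mean 0 and standard deviation 1,
# up to floating-point rounding
all.equal(mean(scaled_manual), 0)
all.equal(sd(scaled_manual), 1)
```

all.equal is used instead of == because floating-point rounding can make mathematically equal values differ in the last few bits.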
Standardizing a matrix by column:
X <- matrix(seq(1, 9), 3, 3)
X_scaled <- apply(X, 2, scale)
X; X_scaled
# Results:
# [,1] [,2] [,3]
# [1,] 1 4 7
# [2,] 2 5 8
# [3,] 3 6 9
# [,1] [,2] [,3]
# [1,] -1 -1 -1
# [2,] 0 0 0
# [3,] 1 1 1
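When given a matrix, scale already centers and scales each column, so the apply wrapper is optional. The sketch below compares the two calls on the same matrix; via_apply and via_scale are illustrative names, not part of the original code.

```r
X <- matrix(seq(1, 9), 3, 3)

via_apply <- apply(X, 2, scale)  # plain 3 x 3 matrix
via_scale <- scale(X)            # same values, plus centering/scaling attributes

# Ignore the extra attributes when comparing the values
all.equal(via_apply, via_scale, check.attributes = FALSE)

# The column means and standard deviations that scale() used
attr(via_scale, "scaled:center")
attr(via_scale, "scaled:scale")
```

Calling scale(X) directly also keeps the column means and standard deviations as attributes, which is convenient if the same transformation has to be applied to new data later.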
Manual calculation:
X_mean <- apply(X, 2, mean)
X_sd <- apply(X, 2, sd)
X_scaled <- t((t(X) - X_mean) / X_sd)
X_scaled
# Results:
# [,1] [,2] [,3]
# [1,] -1 -1 -1
# [2,] 0 0 0
# [3,] 1 1 1
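The double transpose in the manual calculation is needed because R recycles a vector down the columns of a matrix: after t(X), each original column becomes a row, so the i-th mean and standard deviation line up with the i-th original column. An alternative that may read more directly is sweep, which applies a statistic along a chosen margin. This is a sketch of that alternative, not the method used above; X_centered, X_scaled_sweep and X_scaled_t are illustrative names.

```r
X <- matrix(seq(1, 9), 3, 3)

X_mean <- colMeans(X)
X_sd <- apply(X, 2, sd)

# Subtract each column's mean, then divide each column by its standard deviation
X_centered <- sweep(X, 2, X_mean, "-")
X_scaled_sweep <- sweep(X_centered, 2, X_sd, "/")

# Same values as the transpose-based calculation
X_scaled_t <- t((t(X) - X_mean) / X_sd)
all.equal(X_scaled_sweep, X_scaled_t)
```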