The standard deviation is the arithmetic (non-negative) square root of the variance.
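As a reference for the numbers that appear later in this note, the two usual conventions can be written out as follows. The symbols ($\mu$, $\bar{x}$, $\sigma$, $s$, $n$) are standard textbook notation and are an assumption here; the original note does not define them.

```latex
% Population variance and standard deviation (denominator n)
\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)^2, \qquad \sigma = \sqrt{\sigma^2}

% Sample variance and standard deviation (denominator n - 1)
s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2, \qquad s = \sqrt{s^2}

% Worked example for the data 1, 2, 3, 4, 5 (mean 3):
% \sum (x_i - 3)^2 = 4 + 1 + 0 + 1 + 4 = 10
\sigma^2 = \frac{10}{5} = 2, \quad \sigma = \sqrt{2} \approx 1.414, \qquad
s^2 = \frac{10}{4} = 2.5, \quad s = \sqrt{2.5} \approx 1.581
```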
Excel
python
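import pandas as pd

data = {'A': [1, 2, 3, 4, 5]}
df = pd.DataFrame(data)
# Population variance: ddof=0 divides by n
print(df[['A']].var(ddof=0))
print("=========================")
# Sample variance: pandas defaults to ddof=1, which divides by n-1
print(df[['A']].var())
print("=========================")
# Population standard deviation: square root of the ddof=0 variance
print(df[['A']].std(ddof=0))
print("=========================")
# Sample standard deviation: square root of the ddof=1 variance
print(df[['A']].std())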
A 2.0
dtype: float64
=========================
A 2.5
dtype: float64
=========================
A 1.414214
dtype: float64
=========================
A 1.581139
dtype: float64
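The printed values can be reproduced by hand, which makes the role of `ddof` easier to see. The following is a minimal sketch, assuming NumPy is available alongside pandas; the variable names (`values`, `pop_var`, `sample_var`) are illustrative and not part of the original note.

```python
import numpy as np
import pandas as pd

values = np.array([1, 2, 3, 4, 5], dtype=float)
n = values.size

# Sum of squared deviations from the mean (mean is 3, sum is 10)
sum_sq_dev = ((values - values.mean()) ** 2).sum()

pop_var = sum_sq_dev / n           # denominator n, same as ddof=0
sample_var = sum_sq_dev / (n - 1)  # denominator n-1, same as ddof=1

s = pd.Series(values)
print(pop_var, s.var(ddof=0))              # both 2.0
print(sample_var, s.var())                 # both 2.5
print(np.sqrt(pop_var), s.std(ddof=0))     # both about 1.414214
print(np.sqrt(sample_var), s.std())        # both about 1.581139
```

Note that the defaults differ between libraries: pandas uses `ddof=1`, while `numpy.var` and `numpy.std` use `ddof=0`.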
Data standardization with sklearn
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaler.fit(df)
scaler.mean_ # 3
scaler.var_ # 2, so the denominator is n, not n-1
df_scaled = scaler.transform(df)
# array([[-1.41421356], (x - u) / s = (1 - 3) / sqrt(2)
# [-0.70710678],
# [ 0. ],
# [ 0.70710678],
# [ 1.41421356]])
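To check the observation that `StandardScaler` divides by the population standard deviation, its output can be compared against a manual pandas computation. This is a small sketch under the assumption that the same single-column `df` is used; `manual_pop` and `manual_sample` are illustrative names introduced here.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

df = pd.DataFrame({'A': [1, 2, 3, 4, 5]})

scaled = StandardScaler().fit_transform(df)

# Manual z-scores with the population std (ddof=0) and the sample std (ddof=1)
manual_pop = ((df - df.mean()) / df.std(ddof=0)).to_numpy()
manual_sample = ((df - df.mean()) / df.std()).to_numpy()

matches_population = np.allclose(scaled, manual_pop)
matches_sample = np.allclose(scaled, manual_sample)
print(matches_population)  # True
print(matches_sample)      # False
```

If z-scores based on the sample standard deviation are needed, computing them directly in pandas as in `manual_sample` is one option.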