Feature engineering with sklearn
1: Numerical features
1.1 Log transform
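import numpy as np
# np.log is vectorized, so it can be applied to the whole column at once.
# It yields -inf for 0 and NaN for negative or missing values.
log_age = np.log(df_train['Age'])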
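If the column can contain zeros, np.log1p (which computes log(1 + x)) is a common alternative to a plain log. The following is only a minimal sketch: df_demo is a small made-up DataFrame standing in for df_train, and its values are not taken from any real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_train; the values are made up for illustration.
df_demo = pd.DataFrame({'Age': [0.0, 1.0, 22.0, 38.0, np.nan]})

# log1p computes log(1 + x), so x = 0 maps to 0 instead of -inf.
log1p_age = np.log1p(df_demo['Age'])

# Missing values stay missing; they are not filled in by the transform.
print(log1p_age)
```

Whether log or log1p is the better choice depends on the range of the feature; for a strictly positive column the two differ only by the shift of 1.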
1.2 MinMaxScaler (min-max scaling)
from sklearn.preprocessing import MinMaxScaler
minmax = MinMaxScaler()
# fit_transform expects 2D input, hence the double brackets around 'Age'
age_trans = minmax.fit_transform(df_train[['Age']])
age_trans
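With its default feature_range of (0, 1), MinMaxScaler maps each value x to (x - min) / (max - min). The sketch below checks this against a manual computation; df_demo is a made-up DataFrame used in place of df_train, so the numbers are purely illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical stand-in for df_train; the values are made up for illustration.
df_demo = pd.DataFrame({'Age': [22.0, 38.0, 26.0, 35.0]})

# Scale with sklearn (the result is a 2D array of shape (n_samples, 1)).
scaled = MinMaxScaler().fit_transform(df_demo[['Age']])

# The same transformation written out by hand.
age = df_demo['Age'].to_numpy()
manual = (age - age.min()) / (age.max() - age.min())

# Compare the two results element by element.
same = np.allclose(scaled.ravel(), manual)
print(same)
```

Because the result depends only on the minimum and maximum, a single extreme outlier can compress all other values into a narrow range.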
1.3 StandardScaler (Z-score scaling)
from sklearn.preprocessing import StandardScaler
ss = StandardScaler()
# Subtract the column mean and divide by the column standard deviation
age_std = ss.fit_transform(df_train[['Age']])
age_std
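StandardScaler computes (x - mean) / std using the population standard deviation (ddof=0), whereas pandas' .std() defaults to the sample standard deviation (ddof=1). A minimal sketch of that difference, again with a made-up df_demo standing in for df_train:

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for df_train; the values are made up for illustration.
df_demo = pd.DataFrame({'Age': [22.0, 38.0, 26.0, 35.0]})

scaler = StandardScaler()
standardized = scaler.fit_transform(df_demo[['Age']])

# Manual z-score; numpy's std defaults to ddof=0 (population std).
age = df_demo['Age'].to_numpy()
manual = (age - age.mean()) / age.std()
same = np.allclose(standardized.ravel(), manual)

# pandas defaults to ddof=1, so pass ddof=0 to match the scaler.
population_std = df_demo['Age'].std(ddof=0)
sample_std = df_demo['Age'].std()
print(same, scaler.scale_[0], population_std, sample_std)
```

On large datasets the two standard deviations are nearly identical, but on small samples a manual z-score computed with pandas' default will not match the scaler's output.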
1.4 Statistical features
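# Summary statistics of the column (NaN values are skipped by default)
df_train[['Age']].min()
df_train[['Age']].max()
df_train[['Age']].median()
df_train[['Age']].mean()
# First quartile (25th percentile)
df_train[['Age']].quantile(0.25)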
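The same statistics can also be collected in a single call with Series.agg, which can be convenient when they are later attached as features. This is only a sketch on a made-up df_demo used in place of df_train:

```python
import pandas as pd

# Hypothetical stand-in for df_train; the values are made up for illustration.
df_demo = pd.DataFrame({'Age': [22.0, 38.0, 26.0, 35.0]})

# Compute several statistics at once; the result is a Series indexed by name.
stats = df_demo['Age'].agg(['min', 'max', 'median', 'mean'])

# Quantiles take an argument, so compute the first quartile separately.
q25 = df_demo['Age'].quantile(0.25)

print(stats)
print(q25)
```

For a quick overview, df_demo['Age'].describe() reports most of these values (count, mean, std, min, quartiles, max) in one table.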