Feature Engineering: Feature Preprocessing_0

Data for the feature engineering series: https://pan.baidu.com/s/1ZUwOM206B-YUzaNzebi_cg (extraction code: 7w86)
from sklearn.preprocessing import StandardScaler, MinMaxScaler, RobustScaler
import numpy as np
import pandas as pd
np.set_printoptions(suppress=True)

A small sample dataset

views = pd.DataFrame([1295., 25., 19000., 5., 1., 300.], columns=['views'])
views
     views
0   1295.0
1     25.0
2  19000.0
3      5.0
4      1.0
5    300.0

Standard Scaler: $\frac{x_i - \mu}{\sigma}$

ss = StandardScaler()
views['zscore'] = ss.fit_transform(views[['views']])
views
     views    zscore
0   1295.0 -0.307214
1     25.0 -0.489306
2  19000.0  2.231317
3      5.0 -0.492173
4      1.0 -0.492747
5    300.0 -0.449877
vw = np.array(views['views'])
(vw[0] - np.mean(vw)) / np.std(vw)
-0.30721413311687235
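
The single-value check above extends to the whole column. The following is a minimal sketch, assuming only NumPy; the helper name `zscore` is made up for illustration and is not part of scikit-learn. It divides by the population standard deviation (`ddof=0`), which is what `np.std` computes by default and what the check above relies on. pandas' `Series.std()` defaults to `ddof=1` and would give slightly different numbers.

```python
import numpy as np

def zscore(x):
    """Standardize a 1-D array: subtract the mean, divide by the population std.

    Hypothetical helper for illustration only (not a scikit-learn function).
    """
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std(ddof=0)

vw = np.array([1295., 25., 19000., 5., 1., 300.])
z = zscore(vw)
print(np.round(z, 6))
```

After standardization the column has mean 0 and standard deviation 1, but the outlier 19000 still dominates: it is the only value with a positive z-score.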

Min-Max Scaler: $\frac{x_i - \min(x)}{\max(x) - \min(x)}$

mms = MinMaxScaler()
views['minmax'] = mms.fit_transform(views[['views']])
views
     views    zscore    minmax
0   1295.0 -0.307214  0.068109
1     25.0 -0.489306  0.001263
2  19000.0  2.231317  1.000000
3      5.0 -0.492173  0.000211
4      1.0 -0.492747  0.000000
5    300.0 -0.449877  0.015738
(vw[0] - np.min(vw)) / (np.max(vw) - np.min(vw))
0.068108847834096528
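
The same min-max formula can be applied to every row at once. This is a rough sketch under the assumption that the column is not constant (otherwise the denominator is zero); the helper name `minmax` is hypothetical and chosen only for this example.

```python
import numpy as np

def minmax(x):
    """Rescale a 1-D array to the [0, 1] range using its own min and max.

    Hypothetical helper for illustration; assumes max(x) > min(x).
    """
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

vw = np.array([1295., 25., 19000., 5., 1., 300.])
scaled = minmax(vw)
print(np.round(scaled, 6))
```

Because the range is set by the extremes, the single large value 19000 maps to 1 and squeezes every other value into roughly the bottom 7% of the interval.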

Robust Scaler: $\frac{x_i - \mathrm{median}(x)}{\mathrm{IQR}_{(1,3)}(x)}$

rs = RobustScaler()
views['robust'] = rs.fit_transform(views[['views']])
views
     views    zscore    minmax     robust
0   1295.0 -0.307214  0.068109   1.092883
1     25.0 -0.489306  0.001263  -0.132690
2  19000.0  2.231317  1.000000  18.178528
3      5.0 -0.492173  0.000211  -0.151990
4      1.0 -0.492747  0.000000  -0.155850
5    300.0 -0.449877  0.015738   0.132690
quartiles = np.percentile(vw, (25., 75.))
iqr = quartiles[1] - quartiles[0]
(vw[0] - np.median(vw)) / iqr
1.0928829915560916
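
The manual calculation above can be vectorized in the same way. A minimal sketch, assuming NumPy's default (linear interpolation) percentiles as used in the check above; the helper name `robust_scale_1d` is invented for this example and should not be confused with anything in scikit-learn.

```python
import numpy as np

def robust_scale_1d(x):
    """Center a 1-D array on its median and divide by the interquartile range.

    Hypothetical helper for illustration; assumes the IQR is non-zero.
    """
    x = np.asarray(x, dtype=float)
    q1, q3 = np.percentile(x, (25., 75.))  # first and third quartiles
    return (x - np.median(x)) / (q3 - q1)

vw = np.array([1295., 25., 19000., 5., 1., 300.])
r = robust_scale_1d(vw)
print(np.round(r, 6))
```

For this data the median is 162.5 and the IQR is 1036.25. Neither statistic is moved much by the outlier, so the small values stay spread out around zero while 19000 remains clearly separated at about 18.18.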