Handling Missing Values (Imputation)

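'''
sklearn's Imputer class provides basic strategies for handling missing values, such as replacing each missing value with the mean, median, or most frequent value of the row or column in which it occurs. The class also supports different encodings of missing values.
'''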

import numpy as np
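# Note: Imputer exists only in scikit-learn versions before 0.22;
# later versions provide sklearn.impute.SimpleImputer instead.
from sklearn.preprocessing import Imputer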

'''
    missing_values : integer or "NaN", optional (default="NaN")
        The placeholder for the missing values. All occurrences of
        `missing_values` will be imputed. For missing values encoded as np.nan,
        use the string value "NaN".

    strategy : string, optional (default="mean")
        The imputation strategy.

        - If "mean", then replace missing values using the mean along
          the axis.
        - If "median", then replace missing values using the median along
          the axis.
        - If "most_frequent", then replace missing using the most frequent
          value along the axis.

    axis : integer, optional (default=0)
        The axis along which to impute.

        - If `axis=0`, then impute along columns.
        - If `axis=1`, then impute along rows.
'''
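To make the three strategies concrete, here is a minimal NumPy-only sketch of what "mean", "median", and "most_frequent" compute along axis=0 (column-wise). The sample array and the helper names `most_frequent` and `fill` are hypothetical and only serve as an illustration; this is not how the library implements them internally.

```python
import numpy as np
from collections import Counter

# Hypothetical sample data; np.nan marks the missing entries.
data = np.array([[1.0, 2.0],
                 [np.nan, 2.0],
                 [7.0, 6.0],
                 [1.0, np.nan]])

# "mean" and "median": per-column statistics that ignore NaN.
col_mean = np.nanmean(data, axis=0)
col_median = np.nanmedian(data, axis=0)

def most_frequent(column):
    """Return the most common non-missing value in a 1-D array (illustrative helper)."""
    observed = column[~np.isnan(column)]
    return Counter(observed.tolist()).most_common(1)[0][0]

# "most_frequent": per-column mode of the observed values.
col_mode = np.array([most_frequent(data[:, j]) for j in range(data.shape[1])])

def fill(values, fill_values):
    """Replace NaN entries with the matching per-column fill value (illustrative helper)."""
    return np.where(np.isnan(values), fill_values, values)

filled_mean = fill(data, col_mean)
filled_median = fill(data, col_median)
filled_mode = fill(data, col_mode)

print(filled_mean)
```

With axis=1 the same statistics would be taken per row instead of per column.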

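# Replace NaN entries with the column mean (axis=0)
imp = Imputer(missing_values="NaN", strategy='mean', axis=0)

# Learn the column means from the training data: 4.0 and 3.66666667
imp.fit([[1, 2], [np.nan, 3], [7, 6]])

X = [[np.nan, 2], [6, np.nan], [7, 6]]

# Fill the missing entries in X with the learned column means
print(imp.transform(X))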


[[4.         2.        ]
 [6.         3.66666667]
 [7.         6.        ]]
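The Imputer class was deprecated in scikit-learn 0.20 and removed in 0.22. On newer versions, the same example can be written with SimpleImputer from sklearn.impute, roughly as sketched below. SimpleImputer has no axis parameter and always imputes column-wise, and NaN is passed as np.nan rather than the string "NaN". The variable name `result` is introduced here only for illustration.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Column-wise mean imputation; assumes scikit-learn 0.20 or later.
imp = SimpleImputer(missing_values=np.nan, strategy="mean")

# Learn the column means from the training data.
imp.fit([[1, 2], [np.nan, 3], [7, 6]])

X = [[np.nan, 2], [6, np.nan], [7, 6]]

# Fill the missing entries in X with the learned column means.
result = imp.transform(X)
print(result)
```

The learned per-column fill values are available afterwards as `imp.statistics_`.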