Missing Value Handling (Imputation)

Latest recommended article published on 2022-11-01 22:04:38

老三是只猫

Categories: python, machine learning algorithms

Article link: https://blog.csdn.net/zhonglongshen/article/details/92693549

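'''
The scikit-learn Imputer class provides basic strategies for handling missing values: each missing value is replaced with the mean, median, or most frequent value of the row or column that contains it. The class also supports different encodings of missing values.
'''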

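import numpy as np
# Note: Imputer was deprecated in scikit-learn 0.20 and removed in 0.22,
# so this import only works on older versions.
from sklearn.preprocessing import Imputer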

'''
    missing_values : integer or "NaN", optional (default="NaN")
        The placeholder for the missing values. All occurrences of
        `missing_values` will be imputed. For missing values encoded as np.nan,
        use the string value "NaN".

    strategy : string, optional (default="mean")
        The imputation strategy.

        - If "mean", then replace missing values using the mean along
          the axis.
        - If "median", then replace missing values using the median along
          the axis.
        - If "most_frequent", then replace missing using the most frequent
          value along the axis.

    axis : integer, optional (default=0)
        The axis along which to impute.

        - If `axis=0`, then impute along columns.
        - If `axis=1`, then impute along rows.
'''

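# Impute NaN entries with the column mean (axis=0).
imp = Imputer(missing_values="NaN", strategy="mean", axis=0)

# Learn the column means from the training data:
# (1 + 7) / 2 = 4 for the first column, (2 + 3 + 6) / 3 = 3.67 for the second.
imp.fit([[1, 2], [np.nan, 3], [7, 6]])

# Replace each NaN in X with the mean learned for its column.
X = [[np.nan, 2], [6, np.nan], [7, 6]]
print(imp.transform(X))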


[[4.         2.        ]
 [6.         3.66666667]
 [7.         6.        ]]
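
On scikit-learn 0.22 and later, where `Imputer` is no longer available, the same column-wise result can be obtained with `SimpleImputer` from `sklearn.impute`. The following is a minimal sketch rather than part of the original post. The variable names (`X_train`, `X_new`, `row_imp`) are chosen here for illustration. `SimpleImputer` has no `axis` parameter, so the transpose trick is only one possible way to approximate the old `axis=1` behaviour.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Same data as in the example above.
X_train = np.array([[1, 2], [np.nan, 3], [7, 6]], dtype=float)
X_new = np.array([[np.nan, 2], [6, np.nan], [7, 6]], dtype=float)

# Column-wise mean imputation: the counterpart of
# Imputer(missing_values="NaN", strategy="mean", axis=0).
imp = SimpleImputer(missing_values=np.nan, strategy="mean")
imp.fit(X_train)
X_filled = imp.transform(X_new)
print(X_filled)

# Assumption: to impute along rows, fit and transform the transposed array
# and transpose the result back, so each NaN is filled with its row mean.
row_imp = SimpleImputer(missing_values=np.nan, strategy="mean")
X_rows_filled = row_imp.fit_transform(X_new.T).T
print(X_rows_filled)
```

The first call fills each missing value with the column mean learned from `X_train`, as in the output above. The second fills each missing value with the mean of the remaining values in the same row of `X_new`.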
