sklearn.feature_selection explained

SilenceHell

Category: Machine Learning in Action study notes

Article link: https://blog.csdn.net/Du_Shuang/article/details/84338642


class sklearn.feature_selection.SelectKBest(score_func=f_classif, k=10)
Purpose: Select features according to the k highest scores.

Parameters:
score_func : callable
Function taking two arrays X and y, and returning a pair of arrays (scores, pvalues) or a single array with scores. Default is f_classif. The default function only works with classification tasks.
k : int or “all”, optional, default=10
Number of top features to select. The “all” option bypasses selection, for use in a parameter search.
In other words, k is the number of highest-scoring features to keep.
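
The two parameters can be tried out with a small sketch. It assumes the same digits dataset used in the example at the end of this post; variance_score is a hypothetical scoring function written only for illustration (it is not part of sklearn), showing that score_func may return a single array of scores instead of a (scores, pvalues) pair.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.feature_selection import SelectKBest, chi2

X, y = load_digits(return_X_y=True)


# Hypothetical score function (an assumption for this sketch): rank each
# feature by its variance and ignore y. It returns one score per feature.
def variance_score(X, y):
    return np.var(X, axis=0)


# Keep the 5 features with the highest variance.
X_var = SelectKBest(score_func=variance_score, k=5).fit_transform(X, y)

# k="all" keeps every feature, so the selection step is effectively skipped.
X_all = SelectKBest(score_func=chi2, k="all").fit_transform(X, y)

print(X_var.shape, X_all.shape)
```

When the score function returns only scores, no p-values are available on the fitted selector, which is fine because selection uses the scores alone.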

Methods:
fit(X, y) Run score function on (X, y) and get the appropriate features.
Scores each feature of X against y.
fit_transform(X[, y]) Fit to data, then transform it.
Returns X with only the k highest-scoring features kept.
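
The difference between the two methods can be seen in a short sketch, again assuming the digits dataset and chi2 as the score function. After fit, the selector exposes the computed scores through scores_ and the chosen columns through get_support; calling transform afterwards should give the same result as a single fit_transform call.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.feature_selection import SelectKBest, chi2

X, y = load_digits(return_X_y=True)

selector = SelectKBest(chi2, k=20)
selector.fit(X, y)  # only computes the scores; X itself is not modified

scores = selector.scores_                  # one chi2 score per input feature
kept = selector.get_support(indices=True)  # column indices of the kept features

# Two routes to the reduced matrix: fit + transform, or fit_transform.
X_a = selector.transform(X)
X_b = SelectKBest(chi2, k=20).fit_transform(X, y)

print(kept)
print(np.array_equal(X_a, X_b))
```

Keeping the fitted selector around is useful when the same columns must later be selected from new data with transform.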

Examples:

>>> from sklearn.datasets import load_digits
>>> from sklearn.feature_selection import SelectKBest, chi2
>>> X, y = load_digits(return_X_y=True)
>>> X.shape
(1797, 64)
>>> X_new = SelectKBest(chi2, k=20).fit_transform(X, y)
>>> X_new.shape
(1797, 20)
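
The example above fixes k=20. Since the parameter description says the "all" option is meant for use in a parameter search, here is a minimal sketch of such a search. The classifier (GaussianNB), the candidate values for k, the step names "select" and "clf", and cv=3 are all assumptions chosen for illustration, not recommendations from the sklearn documentation.

```python
from sklearn.datasets import load_digits
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline

X, y = load_digits(return_X_y=True)

# Feature selection followed by a simple classifier (an arbitrary choice).
pipe = Pipeline([
    ("select", SelectKBest(chi2)),
    ("clf", GaussianNB()),
])

# Including "all" lets the search compare against using no selection at all.
param_grid = {"select__k": [10, 20, "all"]}
search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(X, y)

best_k = search.best_params_["select__k"]
print(best_k, search.best_score_)
```

Putting the selector inside the pipeline means the scores are recomputed on each training fold, so the held-out fold does not influence which features are kept.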