X_new = SelectKBest(chi2, k=2).fit_transform(X, y)
When using a selector such as SelectKBest, calling fit_transform returns the transformed dataset directly.
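The one-liner above can be expanded into a self-contained sketch. The choice of the iris dataset and the use of get_support() to inspect which columns were kept are assumptions made for illustration; they are not part of the original note.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2

# Iris has 4 non-negative features, which chi2 requires.
X, y = load_iris(return_X_y=True)

# fit_transform scores every feature against y and keeps the k best ones.
selector = SelectKBest(chi2, k=2)
X_new = selector.fit_transform(X, y)

# get_support() returns a boolean mask over the original columns.
mask = selector.get_support()
print(X.shape, X_new.shape)  # (150, 4) (150, 2)
print(np.flatnonzero(mask))  # indices of the retained columns
```

Keeping the fitted selector object around (rather than chaining the call) makes it possible to apply the same column selection to new data with selector.transform.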
The score function passed to the selector depends on the task:
- For regression: f_regression, mutual_info_regression
- For classification: chi2, f_classif, mutual_info_classif
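As a rough illustration of swapping the score function for a regression task, the sketch below runs SelectKBest with f_regression and with mutual_info_regression. The synthetic data from make_regression and the value k=3 are assumptions chosen only for the example.

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import (
    SelectKBest,
    f_regression,
    mutual_info_regression,
)

# Synthetic regression problem: 10 features, 3 of them informative.
X_reg, y_reg = make_regression(
    n_samples=200, n_features=10, n_informative=3, random_state=0
)

# Univariate F-test between each feature and the target.
X_f = SelectKBest(f_regression, k=3).fit_transform(X_reg, y_reg)

# Mutual information can also pick up non-linear dependence.
X_mi = SelectKBest(mutual_info_regression, k=3).fit_transform(X_reg, y_reg)

print(X_f.shape, X_mi.shape)
```

For a classification target, the same pattern applies with chi2, f_classif, or mutual_info_classif in place of the regression score functions.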
If your data is sparse (i.e. it can be represented as a sparse matrix), chi2, mutual_info_regression, and mutual_info_classif can process it while preserving its sparsity.
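A minimal sketch of the sparse case, assuming a small random count matrix stored in SciPy CSR format (the data, its shape, and k=5 are made up for illustration). The selected output should still be a sparse matrix.

```python
import numpy as np
from scipy import sparse
from sklearn.feature_selection import SelectKBest, chi2

rng = np.random.RandomState(0)

# Hypothetical count data: mostly zeros, never negative (chi2 needs that).
X_dense = rng.poisson(0.3, size=(100, 20))
y_bin = rng.randint(0, 2, size=100)
X_sparse = sparse.csr_matrix(X_dense)

# The selector accepts the sparse matrix directly, no densifying needed.
X_sel = SelectKBest(chi2, k=5).fit_transform(X_sparse, y_bin)

print(sparse.issparse(X_sel), X_sel.shape)
```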
Using SelectFromModel
>>> from sklearn.svm import LinearSVC
>>> from sklearn.datasets import load_iris
>>> from sklearn.feature_selection import SelectFromModel
>>> iris = load_iris()
>>> X, y = iris.data, iris.target
>>> X.shape
(150, 4)
>>> lsvc = LinearSVC(C=0.01, penalty="l1", dual=False).fit(X, y)
>>> model = SelectFromModel(lsvc, prefit=True)  # prefit=True: the model passed in is already fitted
>>> X_new = model.transform(X)
>>> X_new.shape
(150, 3)
The prefit parameter: bool, default False
Whether a prefit model is expected to be passed into the constructor directly or not. If True, transform must be called directly, and SelectFromModel cannot be used with cross_val_score, GridSearchCV, and similar utilities that clone the estimator. Otherwise, train the model using fit and then call transform to do feature selection.
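To contrast with the prefit=True example, here is a sketch of the default prefit=False path, where the selector is left unfitted so that utilities which clone estimators can refit it on each fold. The pipeline layout, the LogisticRegression classifier, and cv=5 are assumptions picked for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

X, y = load_iris(return_X_y=True)

# Unfitted estimator, prefit left at its default (False):
# SelectFromModel will fit the LinearSVC itself when fit() is called.
selector = SelectFromModel(LinearSVC(C=0.01, penalty="l1", dual=False))

# Inside a pipeline the selector is cloned and refit on every training fold.
pipe = make_pipeline(selector, LogisticRegression(max_iter=1000))
scores = cross_val_score(pipe, X, y, cv=5)
print(scores)

# The selector can also be used on its own: fit first, then transform.
X_new = selector.fit(X, y).transform(X)
print(X_new.shape)
```

Refitting the selector inside each fold also keeps the feature selection from seeing the held-out data, which a prefit model trained on the full dataset would not guarantee.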