Polynomial Feature Generation: the PolynomialFeatures Class

夺笋123

Last modified 2022-11-25 09:07:54

Category: sklearn machine learning library. Tags: algorithms, python

First published 2022-11-24 22:57:19

Article link: https://blog.csdn.net/m0_54510474/article/details/128027505

PolynomialFeatures

sklearn.preprocessing.PolynomialFeatures(degree=2, *, interaction_only=False, include_bias=True, order='C')

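Generates polynomial and interaction features.
It builds a new feature matrix consisting of all polynomial combinations of the input features whose degree is less than or equal to the degree parameter.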

For example, if an input sample is two-dimensional, $[a, b]$, and degree=2, the generated degree-2 polynomial features are $[1, a, b, a^2, ab, b^2]$.
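Why generate polynomial features?
Collecting training data is often very expensive, and extracting additional features from existing data is not easy either. We can therefore construct extra features purely mathematically. For example, if the original input features are just $x_1, x_2$, the corresponding polynomial features of degree at most 2 (leaving out the constant term) are $x_1, x_2, x_1x_2, x_1^2, x_2^2$.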
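The motivation above can be illustrated with a minimal sketch. The synthetic quadratic data, the noise level, and the use of LinearRegression inside a pipeline are assumptions made for illustration; they do not come from the original post.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data (an assumption for illustration): y depends on x quadratically.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = 0.5 * X[:, 0] ** 2 + X[:, 0] + 2 + rng.normal(scale=0.1, size=200)

# A plain linear model cannot capture the curvature.
linear = LinearRegression().fit(X, y)

# The same linear model fitted on degree-2 polynomial features can.
poly_model = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    LinearRegression(),
).fit(X, y)

r2_linear = linear.score(X, y)
r2_poly = poly_model.score(X, y)
print(f"R^2 linear: {r2_linear:.3f}, R^2 with polynomial features: {r2_poly:.3f}")
```

The model stays linear in its coefficients; only the feature matrix has been expanded.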

Parameters

degree

int or tuple (min_degree, max_degree), default=2

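Data type	Description
int	The maximum degree of the polynomial features
tuple (min_degree, max_degree)	The minimum and maximum degree of the generated polynomial features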

interaction_only

bool, default=False
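If True, only interaction features are generated: products of at most degree distinct input features, so terms in which the same input feature is raised to a power of 2 or higher are excluded.
Suppose the input has two feature columns (x, y). With this parameter set to True, features are generated as follows:

Status	Features
Generated	$x, y, x y$
Not generated	$x^2, y^2$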

include_bias

bool, default=True
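If True, include a bias column: the feature in which all polynomial powers are zero, i.e. a column of ones that acts as an intercept term in a linear model.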
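The effect of include_bias can be checked directly. This is a small sketch with made-up input values.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[2.0, 3.0], [4.0, 5.0]])

with_bias = PolynomialFeatures(degree=2, include_bias=True).fit_transform(X)
without_bias = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X)

print(with_bias.shape, without_bias.shape)  # (2, 6) (2, 5)
print(with_bias[:, 0])                      # the bias column: all ones
```

Setting include_bias=False is a common choice when the downstream estimator already fits its own intercept, since the column of ones would then be redundant.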

order

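{'C', 'F'}, default='C'
Order of the output array in the dense case. 'F' order is faster to compute but may slow down subsequent estimators.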
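The order parameter only affects memory layout, not values. The sketch below inspects NumPy's contiguity flags on the dense output; the input values are arbitrary.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.arange(6, dtype=float).reshape(3, 2)

out_c = PolynomialFeatures(degree=2, order="C").fit_transform(X)
out_f = PolynomialFeatures(degree=2, order="F").fit_transform(X)

# Row-major (C) versus column-major (Fortran) layout of the same values.
print(out_c.flags["C_CONTIGUOUS"], out_f.flags["F_CONTIGUOUS"])
print(np.array_equal(out_c, out_f))
```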

Attributes

powers_

ndarray of shape (n_output_features_, n_features_in_)
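Exponent of each input feature in each output feature: powers_[i, j] is the exponent of the j-th input in the i-th output.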
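As a quick sketch of how to read powers_, each row below corresponds to one output feature of a two-feature, degree-2 transform. The input values are arbitrary.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[2.0, 3.0]])
poly = PolynomialFeatures(degree=2).fit(X)

powers = poly.powers_.tolist()
print(powers)
# [[0, 0], [1, 0], [0, 1], [2, 0], [1, 1], [0, 2]]
# i.e. 1, a, b, a^2, ab, b^2
```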

n_features_in_

int
Number of features seen during fit.

feature_names_in_

ndarray of shape (n_features_in_,)
Names of the features seen during fit. Defined only when X has feature names that are all strings.

n_output_features_

int
Total number of polynomial output features.
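With the default settings (include_bias=True, interaction_only=False), the number of monomials of degree at most $d$ in $n$ variables is $\binom{n+d}{d}$. The sketch below compares this with n_output_features_ for an assumed case of 3 input features and degree 2.

```python
import math

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

n_features, degree = 3, 2
X = np.zeros((1, n_features))  # only the shape matters for fit

poly = PolynomialFeatures(degree=degree).fit(X)
expected_count = math.comb(n_features + degree, degree)

print(poly.n_features_in_, poly.n_output_features_, expected_count)  # 3 10 10
```

The count grows quickly with both the number of features and the degree, which is worth keeping in mind before expanding wide datasets.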

Methods

fit(X[, y])

Compute the number of output features.

fit_transform(X[, y])

Fit to the data, then transform it.

get_feature_names([input_features])

Return the names of the output features. DEPRECATED: get_feature_names is deprecated in 1.0 and removed in 1.2; use get_feature_names_out instead.

get_feature_names_out([input_features])

Return the names of the output features.

get_params([deep])

Get the parameters of this estimator.

set_params(**params)

Set the parameters of this estimator.
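get_params and set_params follow the usual scikit-learn estimator convention. A minimal sketch:

```python
from sklearn.preprocessing import PolynomialFeatures

poly = PolynomialFeatures()

# Constructor parameters as a dict.
initial_params = poly.get_params()
print(initial_params)

# set_params updates the parameters in place and returns the estimator itself.
poly.set_params(degree=3, include_bias=False)
updated_params = poly.get_params()
print(updated_params)
```

This interface is what lets tools such as grid search vary degree without rebuilding the transformer by hand.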

transform(X)

Transform data to polynomial features.
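The split between fit and transform matters when the same expansion has to be applied to new data. In the sketch below, the arrays X_train and X_new are made-up values for illustration.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X_train = np.array([[1.0, 2.0], [3.0, 4.0]])
X_new = np.array([[5.0, 6.0]])

# fit only records the input width and output layout; transform does the expansion.
poly = PolynomialFeatures(degree=2).fit(X_train)
new_features = poly.transform(X_new)
print(new_features)  # columns: 1, a, b, a^2, ab, b^2

# fit followed by transform gives the same result as fit_transform.
same_result = np.array_equal(
    poly.transform(X_train),
    PolynomialFeatures(degree=2).fit_transform(X_train),
)
print(same_result)
```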

Usage example

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# fit_transform expects a 2-D array of shape (n_samples, n_features)
x = np.array([[0, 1]])  # one sample with features a=0, b=1

poly = PolynomialFeatures(2)
print(poly.fit_transform(x))
# [[1. 0. 1. 0. 0. 1.]]   columns: 1, a, b, a^2, ab, b^2

poly1 = PolynomialFeatures(2, interaction_only=True)
print(poly1.fit_transform(x))
# [[1. 0. 1. 0.]]   columns: 1, a, b, ab