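scikit-learn has a built-in API that expands first-degree terms of x into higher-degree terms, so that a simple linear fit y = k1*x1 + … becomes a polynomial fit y = k1*x + k2*x**2 + …. The class to use is PolynomialFeatures in sklearn.preprocessing. The three parameters covered here are:
1. degree: controls the degree of the polynomial; defaults to 2.
2. interaction_only: defaults to False. If set to True, no feature is combined with itself, so only products of distinct features (such as x1*x2) are generated and powers such as x1**2 are left out.
3. include_bias: defaults to True. If True, the result contains the degree-0 term, i.e. a column of all ones. We usually set it to False, since a model such as LinearRegression fits its own intercept.
The code below shows how it is used: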
from sklearn.preprocessing import PolynomialFeatures
import numpy as np
x = np.arange(8).reshape(4, 2)
print(x)
poly = PolynomialFeatures(degree=3, include_bias=False, interaction_only=False)
x = poly.fit_transform(x)
print(x)
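print(poly.powers_)  # exponent of each input feature in each output feature (one row per output feature)
print(poly.n_features_in_)  # number of input features (n_input_features_ was removed in scikit-learn 1.2)
print(poly.n_output_features_)  # number of output features after the transform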
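To make the effect of interaction_only and include_bias concrete, here is a small sketch that fits three transformers on the same 4x2 array and compares their output. The variable names full, inter and no_bias are only illustrative, and degree=2 is chosen to keep the column count small.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# 4 samples, 2 features: [[0, 1], [2, 3], [4, 5], [6, 7]]
x = np.arange(8).reshape(4, 2)

# All defaults at degree 2: columns are 1, x0, x1, x0^2, x0*x1, x1^2
full = PolynomialFeatures(degree=2).fit(x)
# interaction_only=True drops the pure powers x0^2 and x1^2
inter = PolynomialFeatures(degree=2, interaction_only=True).fit(x)
# include_bias=False drops the all-ones column
no_bias = PolynomialFeatures(degree=2, include_bias=False).fit(x)

print(full.n_output_features_)     # 6
print(inter.n_output_features_)    # 4
print(no_bias.n_output_features_)  # 5

# Each row of powers_ gives the exponents of (x0, x1) for one output column
print(inter.powers_)

# Second sample [2, 3] becomes x0, x1, x0^2, x0*x1, x1^2 = 2, 3, 4, 6, 9
print(no_bias.transform(x)[1])
```

Reading powers_ row by row is a quick way to check which term each output column represents before feeding the result to a model.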
PolynomialFeatures can be used together with Pipeline, StandardScaler, and the model being trained.
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler
import numpy as np
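seed = 1
np.random.seed(seed)  # fix the seed so the noise below is reproducible
x = np.arange(8).reshape(4, 2).astype(float)  # np.float was removed in NumPy 1.24; use the builtin float
x += np.random.rand(4, 2) * 0.02  # add a little noise to the inputs
weight = np.array([1.2, 3.4]).reshape(-1, 1)
y = np.dot(x, weight)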
lr = Pipeline([('ss', StandardScaler()),
               ('poly', PolynomialFeatures(degree=2, include_bias=False, interaction_only=True)),
               ('lr', LinearRegression())
               ])
lr.fit(x, y)
print(y)
print(lr.named_steps['lr'].coef_)  # weights live on the final LinearRegression step, not on the Pipeline itself