Converting polynomial regression to linear regression

1. Introduction

Univariate polynomial regression equation of degree m:

$$\hat{y} = b_0 + b_1 x + b_2 x^2 + \cdots + b_m x^m$$


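Bivariate quadratic polynomial regression equation:

$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_1^2 + b_4 x_2^2 + b_5 x_1 x_2$$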


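Polynomial regression equations with several variables and higher degrees are more complicated, and in practice a univariate polynomial of degree m is usually enough, so they are omitted here.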


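For the univariate polynomial regression equation of degree m, let:

$$x_1 = x,\quad x_2 = x^2,\quad \ldots,\quad x_m = x^m$$

The univariate polynomial of degree m then becomes a linear regression equation in m variables:

$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_m x_m$$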


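Polynomial regression can therefore be solved with the methods of multiple linear regression. Note that in polynomial regression analysis, testing whether a regression coefficient is significant amounts to judging whether the corresponding power term x^i of the independent variable has a significant effect on the dependent variable y.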
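The substitution can be sketched in a few lines of NumPy. This is only an illustration: the helper name `polynomial_design_matrix` and the coefficients 1, 2, -3 are made up for the example and do not come from the article.

```python
import numpy as np

def polynomial_design_matrix(x, m):
    """Return the n x (m+1) matrix whose columns are 1, x, x^2, ..., x^m."""
    x = np.asarray(x, dtype=float)
    # increasing=True orders the columns from x^0 up to x^m
    return np.vander(x, N=m + 1, increasing=True)

# Noise-free samples from y = 1 + 2x - 3x^2 (illustrative coefficients)
x = np.linspace(-1.0, 1.0, 20)
y = 1.0 + 2.0 * x - 3.0 * x**2

# After the substitution the fit is an ordinary linear least-squares problem
X = polynomial_design_matrix(x, m=2)
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)  # close to the coefficients used to generate y
```

Each column of `X` plays the role of one independent variable in the linear model, so any linear least-squares solver can be used unchanged.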


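For the bivariate quadratic polynomial regression equation, let:

$$z_1 = x_1,\quad z_2 = x_2,\quad z_3 = x_1^2,\quad z_4 = x_2^2,\quad z_5 = x_1 x_2$$

The bivariate quadratic polynomial then becomes a linear regression equation in five variables:

$$\hat{y} = b_0 + b_1 z_1 + b_2 z_2 + b_3 z_3 + b_4 z_4 + b_5 z_5$$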
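The bivariate case can be sketched the same way. The example below assumes the five regressors are ordered as x1, x2, x1^2, x2^2, x1*x2; the helper name `quadratic_features` and the values in `true_b` are hypothetical and chosen only for illustration.

```python
import numpy as np

def quadratic_features(x1, x2):
    """Map (x1, x2) to the five regressors x1, x2, x1^2, x2^2, x1*x2."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    return np.column_stack([x1, x2, x1**2, x2**2, x1 * x2])

rng = np.random.RandomState(0)
x1 = rng.uniform(-1, 1, 50)
x2 = rng.uniform(-1, 1, 50)

# Illustrative coefficients b0..b5 (not from the article)
true_b = np.array([0.5, 1.0, -2.0, 0.3, 0.7, -1.5])
Z = quadratic_features(x1, x2)
y = true_b[0] + Z @ true_b[1:]

# Prepend a column of ones for the intercept, then solve the linear problem
A = np.column_stack([np.ones(len(Z)), Z])
b, *_ = np.linalg.lstsq(A, y, rcond=None)
print(b)  # close to true_b because the data are noise-free
```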



2. Least-squares solution of univariate polynomial regression of degree m


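For n samples $(x_i, y_i)$, $i = 1, \ldots, n$ (here the subscript indexes samples), each observation satisfies

$$y_i = b_0 + b_1 x_i + b_2 x_i^2 + \cdots + b_m x_i^m + \varepsilon_i$$

Their relationship can be written with matrices:

$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^m \\ 1 & x_2 & x_2^2 & \cdots & x_2^m \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^m \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \\ \vdots \\ b_m \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}$$

In matrix notation:

$$\mathbf{y} = X\mathbf{b} + \boldsymbol{\varepsilon}$$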


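The derivation is omitted here (see the derivation of the least-squares solution for linear regression; the basic procedure is to take the partial derivative with respect to each parameter, set it to 0, and solve the resulting system of equations). The least-squares solution is:

$$\hat{\mathbf{b}} = (X^{\mathsf{T}} X)^{-1} X^{\mathsf{T}} \mathbf{y}$$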
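As a rough numerical check of the closed-form solution, the sketch below builds the design matrix with columns 1, x, ..., x^m and solves the normal equations with NumPy. The function name `fit_polynomial_normal_equations` and the cubic coefficients are assumptions made for the example.

```python
import numpy as np

def fit_polynomial_normal_equations(x, y, m):
    """Least-squares coefficients b0..bm obtained from (X^T X) b = X^T y."""
    X = np.vander(np.asarray(x, dtype=float), N=m + 1, increasing=True)
    y = np.asarray(y, dtype=float)
    # Solve the linear system instead of forming the inverse explicitly
    return np.linalg.solve(X.T @ X, X.T @ y)

# Noise-free cubic with illustrative coefficients
x = np.linspace(-1, 1, 30)
y = 0.15 - 0.02 * x + 0.03 * x**2 - 0.04 * x**3

b = fit_polynomial_normal_equations(x, y, m=3)
print(b)  # close to [0.15, -0.02, 0.03, -0.04]
```

Solving the system with `np.linalg.solve` gives the same result as multiplying by the inverse but is usually better behaved numerically; for high degrees, where X^T X becomes ill-conditioned, `np.linalg.lstsq` on X directly is a common alternative.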


3. A simple example in Python using the sklearn library

import numpy as np
from sklearn.linear_model import LinearRegression

from sklearn.preprocessing import PolynomialFeatures
import matplotlib.pyplot as plt
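# seeded random generator so that every run produces the same samples
rng = np.random.RandomState(123)
# construct samples: given x, generate y from a cubic polynomial plus noise
def genY(x):
    a0, a1, a2, a3, e = 0.1, -0.02, 0.03, -0.04, 0.05
    yr = a0 + a1*x + a2*(x**2) + a3*(x**3) + e
    # uniform noise in [0, 0.03); rng.rand(1) returns an array of shape (1,)
    y = yr + 0.03*rng.rand(1)
    return y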
# plot
plt.figure()
plt.title('polynomial regression(sklearn)')
plt.xlabel('x')
plt.ylabel('y')
plt.grid(True)

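# 30 evenly spaced sample points in [-1, 1]
x_tup = np.linspace(-1, 1, 30)
y = [genY(a) for a in x_tup]
print(y)
# sklearn expects 2-D arrays of shape (n_samples, n_features)
x = x_tup.reshape(-1,1)
y = np.array(y).reshape(-1,1)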
plt.plot(x, y, 'k.')

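# expand x into the columns 1, x, x^2, x^3, then fit an ordinary linear model
qf = PolynomialFeatures(degree = 3)
qModel = LinearRegression()
qModel.fit(qf.fit_transform(x), y)
print('----')
print(qf.get_params())

# evaluate the fitted curve on [-1, 2], which extends beyond the sampled range
xp = np.linspace(-1, 2, 100)
yp = qModel.predict(qf.transform(xp.reshape(-1, 1)))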

plt.plot(xp, yp, 'r-')
plt.show()

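Note: the fit_transform method of the PolynomialFeatures class expands the input, according to the number of independent variables and the degree (degree), into the independent variables of a linear regression, which LinearRegression then fits. Running the script plots the samples as black dots and the fitted curve as a red line.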
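To see what the expansion produces, the sketch below applies PolynomialFeatures to a single sample. The input values 2 and 3 are arbitrary; with the default include_bias=True the first output column is the constant 1.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# One variable, degree 3: columns are 1, x, x^2, x^3
pf = PolynomialFeatures(degree=3)
one_var = pf.fit_transform(np.array([[2.0]]))
print(one_var)  # values 1, 2, 4, 8

# Two variables, degree 2: columns are 1, x1, x2, x1^2, x1*x2, x2^2
pf2 = PolynomialFeatures(degree=2)
two_var = pf2.fit_transform(np.array([[2.0, 3.0]]))
print(two_var)  # values 1, 2, 3, 4, 6, 9
```

Note that sklearn places the cross term x1*x2 before x2^2, so the column order can differ from a hand-written substitution even though the set of terms is the same.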



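4. Besides solving for the parameters directly with least squares, the parameters can also be trained by minimizing the loss function with gradient descent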

#-*- coding:utf-8 -*-
import numpy as np
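# This script uses the TensorFlow 1.x graph API (placeholder, Session).
# Under TensorFlow 2.x that API is available through the compat.v1 module.
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()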
import matplotlib.pyplot as plt

rng = np.random.RandomState(123)

def genY(x):
    a0, a1, a2, a3, e = 0.1, -0.02, 0.03, -0.04, 0.05
    yr = a0 + a1*x + a2*(x**2) + a3*(x**3) + e
    y = yr + 0.03*rng.rand(1)
    return y
    
plt.figure()
plt.title('polynomial regression(tensorflow)')
plt.xlabel('x')
plt.ylabel('y')
plt.grid(True)

x = np.linspace(-1, 1, 30)
y = [genY(a) for a in x]
x = x.reshape(-1,1)
y = np.array(y).reshape(-1,1)
plt.plot(x, y, 'k.')

X = tf.placeholder('float')
Y = tf.placeholder('float')
W = tf.Variable([0.] * 4)
print(W)

def Model(x, w):
    terms = []
    for i in range(0, 4):
        term = tf.multiply(w[i], tf.pow(x, i))
        terms.append(term)
    rs = tf.add_n(terms)
    return rs
YModel = Model(X, W)
Cost = tf.reduce_sum(tf.square(Y - YModel))
LearnRate = 0.01
train_op = tf.train.GradientDescentOptimizer(LearnRate).minimize(Cost)

with tf.Session() as sess:
    Init = tf.global_variables_initializer()
    sess.run(Init)
    # 100 passes over the data, one gradient step per sample
    for i in range(0, 100):
        for (_x, _y) in zip(x, y):
            sess.run(train_op, feed_dict = {X: _x, Y: _y})
    # fetch the trained coefficients once and reuse them
    w = sess.run(W)
    print(w)

    # evaluate the fitted polynomial on [-1, 2]
    xp = np.linspace(-1, 2, 100)
    yp = 0
    for i in range(0, 4):
        yp += w[i] * np.power(xp, i)

    plt.plot(xp, yp, 'g-')
    plt.show()
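For readers without TensorFlow, the same idea can be sketched in plain NumPy. This is not a port of the script above: it takes full-batch steps on the mean squared error rather than one step per sample, and the function name `fit_polynomial_gd`, the learning rate, and the step count are assumptions chosen for this small example.

```python
import numpy as np

def fit_polynomial_gd(x, y, m, learn_rate=0.1, n_steps=5000):
    """Fit b0..bm by full-batch gradient descent on the mean squared error."""
    X = np.vander(np.asarray(x, dtype=float), N=m + 1, increasing=True)
    y = np.asarray(y, dtype=float)
    w = np.zeros(m + 1)
    for _ in range(n_steps):
        residual = X @ w - y
        # gradient of mean((X w - y)^2) with respect to w
        grad = 2.0 * X.T @ residual / len(y)
        w -= learn_rate * grad
    return w

# Noise-free cubic with illustrative coefficients
x = np.linspace(-1, 1, 30)
y = 0.15 - 0.02 * x + 0.03 * x**2 - 0.04 * x**3

w_gd = fit_polynomial_gd(x, y, m=3)
w_ls, *_ = np.linalg.lstsq(np.vander(x, N=4, increasing=True), y, rcond=None)
print(w_gd)
print(w_ls)  # the two estimates should agree closely
```

Because the loss is a convex quadratic in the coefficients, gradient descent with a small enough learning rate converges to the same coefficients as the least-squares solution; the learning rate and number of steps would need retuning for other data ranges or degrees.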




