Predicting Boston House Prices with Ridge Regression

Latest recommended article published on 2024-06-12 16:10:44

藤方拓海

Category: Machine Learning

Article link: https://blog.csdn.net/qq_39676333/article/details/108074591

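# Derivation of ridge regression
$f(\theta) = \frac{1}{2}\|A\theta-y\|_2^2 + \frac{\lambda}{2}\|\theta\|_2^2 = \frac{1}{2}(A\theta - y)^T(A\theta - y) + \frac{\lambda}{2}\theta^T\theta$
$= \frac{1}{2}\left(\theta^T A^TA\theta - \theta^TA^Ty - y^TA\theta + y^Ty\right) + \frac{\lambda}{2}\theta^T\theta$
Differentiating with respect to $\theta$ and setting the gradient to zero gives
$A^TA\theta - A^Ty+\lambda\theta = 0$
Solving for $\theta$,
$\theta = (A^TA+\lambda I)^{-1}A^Ty$
where $I$ is the identity matrix of the same size as $A^TA$. The regularization coefficient $\lambda$ is kept small here, around 0.01.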
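
The closed-form solution can be checked numerically. The following is a minimal sketch on synthetic data, not part of the original post: the helper names `ridge_closed_form` and `ridge_gradient` are hypothetical, and `np.linalg.solve` is used in place of an explicit matrix inverse. It confirms that the gradient $A^TA\theta - A^Ty + \lambda\theta$ vanishes at the computed $\theta$.

```python
import numpy as np

def ridge_closed_form(A, y, lam=0.01):
    """Solve (A^T A + lam * I) theta = A^T y without forming an explicit inverse."""
    n_features = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n_features), A.T @ y)

def ridge_gradient(A, y, theta, lam=0.01):
    """Gradient of 1/2 * ||A theta - y||^2 + lam/2 * ||theta||^2 with respect to theta."""
    return A.T @ (A @ theta - y) + lam * theta

# Synthetic data (an assumption made purely for illustration)
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 4))
y = rng.normal(size=50)

theta = ridge_closed_form(A, y)
grad_norm = np.linalg.norm(ridge_gradient(A, y, theta))
print(grad_norm)  # should be close to zero, up to floating-point error
```

Solving the linear system directly is generally better conditioned than computing the inverse and multiplying, although for a small problem like this one both approaches give essentially the same result.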

Code:

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt

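# Load the data and split it into training and test sets
# Note: load_boston was removed in scikit-learn 1.2, so this call needs an older version
house = datasets.load_boston()
x = house.data
y = house.target
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.3)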

# Linear regression fitted by ordinary least squares
class LR:
    def fit(self, X, Y):
        X = np.asmatrix(X.copy())
        Y = np.asmatrix(Y).reshape(-1,1)  # column vector
        print(np.shape(X)[1])  # number of columns, including the intercept column
        self.w = (X.T * X).I * X.T * Y  # coefficients from the normal equation
    def predict(self, X):
        X = np.asmatrix(X.copy())
        result = X * self.w
        return np.asarray(result).ravel()

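# Ridge regression
class ridge_LR:
    def fit(self, X, Y):
        # np.asmatrix is used because the np.mat alias was removed in NumPy 2.0
        X = np.asmatrix(X.copy())
        Y = np.asmatrix(Y).reshape(-1,1)  # column vector
        C = np.eye(np.shape(X)[1])  # identity matrix matching X.T * X
        lam = 0.01  # regularization coefficient
        self.v = (X.T * X + lam * C).I * X.T * Y  # closed-form ridge solution
    def predict(self, X):
        X = np.asmatrix(X.copy())
        result = X * self.v
        return np.asarray(result).ravel()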

# Prepend a column of ones to the input matrices so the model includes an intercept
b = np.ones(len(x_train))
c = np.ones(len(x_test))
x_train = np.insert(x_train, 0, values = b, axis = 1)
x_test = np.insert(x_test, 0, values = c, axis = 1)

# Fit the least-squares model and evaluate it on the test set
lr = LR()
lr.fit(x_train, y_train)
y_lr_pred = lr.predict(x_test)
#print(y_lr_pred)
#print(lr.w)  # coefficient vector
error_2 = mean_squared_error(y_test, y_lr_pred)
#print(error_2)

# Fit the ridge model and evaluate it on the test set
ridge_lr = ridge_LR()
ridge_lr.fit(x_train, y_train)
y_ridge_pred = ridge_lr.predict(x_test)  # predict with the ridge model, not lr
print(y_ridge_pred)
print(ridge_lr.v)
error_3 = mean_squared_error(y_test, y_ridge_pred)
print(error_3)
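
As a sanity check, the hand-written ridge solution can be compared with scikit-learn's `Ridge` estimator. The sketch below is an illustration under stated assumptions rather than part of the original script: it uses synthetic data instead of the Boston dataset, and it sets `fit_intercept=False` so that the column of ones is penalized along with the other coefficients, which matches how the code above applies $\lambda I$ to every column.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic regression data (an assumption; it stands in for the Boston features)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 4.0 + 0.1 * rng.normal(size=100)

# Prepend a column of ones, as the script above does for the intercept
X_aug = np.insert(X, 0, 1.0, axis=1)

# Closed-form ridge solution with the same lambda as in the post
lam = 0.01
theta = np.linalg.solve(X_aug.T @ X_aug + lam * np.eye(X_aug.shape[1]), X_aug.T @ y)

# scikit-learn's Ridge minimizes ||y - Xw||^2 + alpha * ||w||^2,
# so alpha plays the role of lambda when no separate intercept is fitted
model = Ridge(alpha=lam, fit_intercept=False, solver="cholesky")
model.fit(X_aug, y)

max_diff = np.max(np.abs(model.coef_ - theta))
print(max_diff)  # the two coefficient vectors should agree closely
```

In practice the intercept is often left unpenalized (the default `fit_intercept=True` in scikit-learn does this), which would give slightly different coefficients from the closed form used in this post.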