Python Implementation of a Multiple Linear Regression Model for Boston Housing Prices

张向南zhangxn

Last modified: 2023-11-05 14:27:24

Column: Neural Network Learning Notes. Tags: linear regression, python, algorithms

First published: 2023-11-05 14:26:21

Article link: https://blog.csdn.net/bgshuxuanzxn/article/details/134229652

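Using gradient descent, a four-variable regression model is built on four factors: the number of rooms, the distance to the city center, whether the area borders the river, and the local pupil-teacher ratio. Construct a matrix X with 404 rows (the number of samples in the training set) and 5 columns. The first four columns are the column vectors of the four independent variables above, and the fifth column is an all-ones column vector (the coefficient of the bias). Construct a matrix $W_{5\times 1}$ whose first four elements are the weights of the four independent variables and whose last element, multiplied by 1, serves as the bias. Over the n training samples, the mean squared error (with the conventional factor of 1/2) is used as the loss function: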

$Loss=\frac{1}{2n}\sum (y-\hat{y})^{2}$

where

$\hat{y}=wx+b$

Let Y denote the column vector of the dependent variable. Then, for the matrices above, we have

$\hat{Y}=XW$

$Loss=\frac{1}{2n}(Y-XW)^{T}(Y-XW)$

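$\frac{\partial Loss}{\partial W}=\frac{1}{n}X^{T}(XW-Y)$

$W'=W-\eta\frac{\partial Loss}{\partial W}$

The delta function in the code below returns $2X^{T}(XW-Y)$, which points in the same direction as this gradient and differs from it only by the constant factor $2n$. That factor is absorbed into the learning rate $\eta$, which is why the recommended learning rates are so small.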
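As a sanity check on the derivation, the following sketch compares the analytic gradient of $\frac{1}{2n}(Y-XW)^{T}(Y-XW)$, which works out to $\frac{1}{n}X^{T}(XW-Y)$, with a finite-difference estimate. The synthetic data, the helper names (loss, grad, numeric_grad), and the step size eta = 0.1 are assumptions made for illustration and are not part of the original script.

```python
import numpy as np

def loss(X, W, Y):
    """Half mean squared error: (1/2n) * (Y - XW)^T (Y - XW)."""
    r = Y - X @ W
    return (r.T @ r).item() / (2 * len(Y))

def grad(X, W, Y):
    """Analytic gradient: (1/n) * X^T (XW - Y)."""
    return X.T @ (X @ W - Y) / len(Y)

def numeric_grad(X, W, Y, eps=1e-6):
    """Central finite differences, perturbing one weight at a time."""
    g = np.zeros_like(W)
    for k in range(W.shape[0]):
        step = np.zeros_like(W)
        step[k, 0] = eps
        g[k, 0] = (loss(X, W + step, Y) - loss(X, W - step, Y)) / (2 * eps)
    return g

rng = np.random.default_rng(0)
n = 50
features = rng.normal(size=(n, 4))          # synthetic stand-ins for the four inputs
X = np.hstack([features, np.ones((n, 1))])  # fifth column of ones carries the bias
Y = rng.normal(size=(n, 1))
W = rng.normal(size=(5, 1))

# Largest disagreement between the analytic and numerical gradients
max_diff = np.max(np.abs(grad(X, W, Y) - numeric_grad(X, W, Y)))
print("max gradient difference:", max_diff)

# One update W' = W - eta * dLoss/dW should lower the loss for a small eta
eta = 0.1
W_new = W - eta * grad(X, W, Y)
print("loss before:", loss(X, W, Y), "loss after:", loss(X, W_new, Y))
```

If the sign of the gradient were flipped, the update would raise the loss instead of lowering it, so the second printout is a quick way to catch that mistake.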

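Because TensorFlow provides a built-in elementwise square function, the code does not use the second form of the loss (the matrix form); it computes the loss with tf.square and tf.reduce_mean instead.

The code is as follows: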
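The elementwise computation and the matrix form should give the same loss value. A minimal check, using NumPy as a stand-in for TensorFlow (np.square and np.mean play the roles of tf.square and tf.reduce_mean) and random data that is purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
X = np.hstack([rng.uniform(size=(n, 4)), np.ones((n, 1))])  # 4 features + bias column
W = rng.uniform(size=(5, 1))
Y = rng.uniform(size=(n, 1))

residual = Y - X @ W

# Elementwise form: square every residual, average, then halve
loss_elementwise = np.mean(np.square(residual)) / 2

# Matrix form: (1/2n) * (Y - XW)^T (Y - XW)
loss_matrix = (residual.T @ residual).item() / (2 * n)

print(loss_elementwise, loss_matrix)
```

The two numbers agree up to floating-point rounding, so choosing one form over the other is a matter of convenience.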

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

def delta(x,w,y):
    #Gradient direction 2*X^T(XW-y); the constant factor is absorbed into the learning rate
    trans=tf.transpose(x)
    sec=tf.matmul(x,w)-y
    delt=2*tf.matmul(trans,sec)
    return delt

def minmax(v,ref):
    #Min-max scale v using the minimum and maximum of ref (a training-set column)
    ref=tf.cast(ref,tf.float32)
    lo=tf.reduce_min(ref)
    hi=tf.reduce_max(ref)
    return tf.divide(v-lo,hi-lo)

boston_housing=tf.keras.datasets.boston_housing
(train_x,train_y),(test_x,test_y)=boston_housing.load_data()
train_y=tf.constant(train_y,dtype=tf.float32)
train_y=tf.reshape(train_y,[-1,1])

#Extract training-set features
t_room=tf.constant(train_x[:,5],dtype=tf.float32) #number of rooms
t_dis=tf.constant(train_x[:,7],dtype=tf.float32) #distance to the city center
t_chas=tf.constant(train_x[:,3],dtype=tf.float32) #whether bordered by the river (0 or 1)
t_ptra=tf.constant(train_x[:,10],dtype=tf.float32) #pupil-teacher ratio
x0=tf.ones(len(t_room),dtype=tf.float32) #all-ones column for the bias

#Min-max normalization of the training set
t_room=minmax(t_room,train_x[:,5])
t_dis=minmax(t_dis,train_x[:,7])
t_ptra=minmax(t_ptra,train_x[:,10])

X=tf.stack((t_room,t_dis,t_chas,t_ptra,x0),axis=1)

tf.random.set_seed(2)
W=tf.random.uniform([5,1],20,60)

msc=[] #loss history, one value per iteration

#Hyperparameters
n=float(input("Enter the learning rate (recommended below 0.00006): ")) #learning rate
times=int(input("Enter the number of iterations (recommended below 100000): ")) #number of iterations

for i in range(1,times+1):
    Y=tf.matmul(X,W)
    t=train_y-Y
    sq=tf.square(t)
    Loss=tf.reduce_mean(sq)/2
    msc.append(float(Loss))
    W=W-n*delta(X,W,train_y)
    #Report progress using integer arithmetic to avoid floating-point remainders
    if (i*10)%times==0:
        print("progress: %d%%" %(i*100//times))
print("finished. Graph is loading...")

#Extract test-set features
test_r=tf.constant(test_x[:,5],dtype=tf.float32)
test_d=tf.constant(test_x[:,7],dtype=tf.float32)
test_c=tf.constant(test_x[:,3],dtype=tf.float32)
test_p=tf.constant(test_x[:,10],dtype=tf.float32)
x0=tf.ones(len(test_r),dtype=tf.float32)

#Min-max normalization of the test set, reusing the training-set minimum and maximum
test_r=minmax(test_r,train_x[:,5])
test_d=minmax(test_d,train_x[:,7])
test_p=minmax(test_p,train_x[:,10])

Xt=tf.stack((test_r,test_d,test_c,test_p,x0),axis=1)
Yt=tf.matmul(Xt,W)

plt.figure()
plt.subplot(2,1,1)
plt.plot(tf.reshape(msc,[-1]))
plt.ylim(10,40)
plt.xlabel("Iteration")
plt.ylabel("Loss")
plt.title("Loss curve")

plt.subplot(2,1,2)
plt.plot(test_y,label="True values")
plt.plot(Yt,label="Predicted values")
plt.xlabel("Sample index")
plt.ylabel("House price")
plt.title("Test results")
plt.legend()

plt.suptitle("Four-variable regression model for house price prediction")
plt.tight_layout(rect=[0,0,0.95,0.95])
plt.show()
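The script rescales the room count, the distance, and the pupil-teacher ratio to the range [0, 1]. A common convention, sketched below with NumPy and made-up numbers, is to learn the minimum and maximum from the training column and apply that same mapping to the test column, so that both sets pass through an identical transformation. The helper names fit_minmax and apply_minmax are hypothetical.

```python
import numpy as np

def fit_minmax(train_col):
    """Return the (min, max) pair learned from a training column."""
    return float(np.min(train_col)), float(np.max(train_col))

def apply_minmax(col, lo, hi):
    """Map values to (col - lo) / (hi - lo)."""
    return (np.asarray(col, dtype=float) - lo) / (hi - lo)

train_rooms = np.array([4.0, 6.0, 8.0])  # made-up training values
test_rooms = np.array([5.0, 9.0])        # made-up test values

lo, hi = fit_minmax(train_rooms)
train_scaled = apply_minmax(train_rooms, lo, hi)  # values 0, 0.5, 1
test_scaled = apply_minmax(test_rooms, lo, hi)    # values 0.25, 1.25

print(train_scaled)
print(test_scaled)
```

Test values may fall outside [0, 1], as the 1.25 here does. That is expected: what matters is that the same value maps to the same scaled number in both sets.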

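With a learning rate of 0.00006 and 30000 iterations, the script produces the following result (figure: loss curve on top, true versus predicted test-set prices below):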
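The upper limit on the learning rate can be related to the curvature of the loss. For the update $W \leftarrow W-\eta\cdot 2X^{T}(XW-Y)$ used by delta, gradient descent on this quadratic loss is stable only when $\eta$ stays below $2/\lambda_{max}(2X^{T}X)$. The sketch below illustrates this on synthetic data. It does not reproduce the Boston numbers, and the data, true_W, and the helper run_gd are assumptions made for illustration.

```python
import numpy as np

def run_gd(X, Y, eta, steps):
    """Run W <- W - eta * 2 X^T (XW - Y) from W = 0 and return the final loss."""
    W = np.zeros((X.shape[1], 1))
    for _ in range(steps):
        W = W - eta * 2 * (X.T @ (X @ W - Y))
    return np.mean(np.square(Y - X @ W)) / 2

rng = np.random.default_rng(2)
n = 404
X = np.hstack([rng.uniform(size=(n, 4)), np.ones((n, 1))])  # features in [0, 1] + bias
true_W = np.array([[3.0], [-2.0], [1.0], [0.5], [4.0]])     # arbitrary weights
Y = X @ true_W + 0.1 * rng.normal(size=(n, 1))

# Largest eigenvalue of the Hessian 2 X^T X sets the stability limit
lam_max = np.linalg.eigvalsh(2 * X.T @ X).max()
eta_limit = 2 / lam_max

initial_loss = run_gd(X, Y, 0.0, 0)                  # loss at W = 0
loss_stable = run_gd(X, Y, 0.5 * eta_limit, 200)     # below the limit: loss shrinks
loss_unstable = run_gd(X, Y, 1.5 * eta_limit, 200)   # above the limit: loss blows up

print("eta limit:", eta_limit)
print("initial:", initial_loss, "stable:", loss_stable, "unstable:", loss_unstable)
```

Applying the same eigenvalue computation to the real design matrix X would give the corresponding limit for the housing data. Unscaled columns with large values raise $\lambda_{max}$ and therefore lower the limit.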
