Boston House Price Prediction in Jupyter with TensorFlow 2 (Multiple Regression Code)

Victor__Zhang

Last modified: 2022-05-05 23:07:15

First published: 2021-12-15 18:56:56

Article link: https://blog.csdn.net/soga235/article/details/121958886

The source data can be downloaded from the following link:

boston_housing_data.csv (CSDN download): https://download.csdn.net/download/soga235/62336856

1. Set up the environment

Prepare a TensorFlow 2 environment and enable inline plotting in the notebook.

import tensorflow as tf 
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
print("tensorflow version:",tf.__version__)

Additional packages to import:

import pandas as pd
from sklearn.utils import shuffle 
from sklearn.preprocessing import scale

2. Prepare the data

Load the Boston housing price data:

df = pd.read_csv("boston_housing_data.csv",header=0)
print(df.describe())

The CSV file must be placed in Jupyter's current working directory.

ds=df.values
print(ds.shape)

The first 12 columns are features; the last column is the target value.

x_data =ds[:,:12]# feature columns
y_data = ds[:,12]# target column (house price)

3. Normalization

# Min-max normalize each feature column to the range [0, 1]
for i in range(12):
    x_data[:,i]=(x_data[:,i]-x_data[:,i].min())/(x_data[:,i].max()-x_data[:,i].min())
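
The loop above applies min-max scaling one column at a time. A vectorized sketch of the same idea in NumPy might look like the following; the helper name `min_max_normalize` and the small sample array are illustrative assumptions, not part of the original notebook. The sketch also maps a constant column to 0, a case in which the loop above would divide by zero.

```python
import numpy as np

def min_max_normalize(x):
    """Scale each column of a 2-D array to [0, 1] (hypothetical helper)."""
    x = np.asarray(x, dtype=np.float64)
    col_min = x.min(axis=0)
    col_range = x.max(axis=0) - col_min
    # Assumption: a constant column is mapped to 0 instead of dividing by zero.
    col_range[col_range == 0] = 1.0
    return (x - col_min) / col_range

# Made-up sample data, only to show the call
sample = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 40.0]])
normalized = min_max_normalize(sample)
print(normalized)
```

Every column of `normalized` starts at 0 and ends at 1, which matches what the loop produces for the 12 feature columns.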

4. Split into training, validation, and test sets

# Training set
train_num =300
valid_num =100
test_num =len(x_data)-train_num-valid_num

x_train = x_data[:train_num]
y_train = y_data[:train_num]
print(x_train)
print('x_train shape =',x_train.shape)

Validation set:

# Validation set
x_valid = x_data[train_num:train_num+valid_num]
y_valid = y_data[train_num:train_num+valid_num]

Test set:

# Test set
x_test = x_data[train_num+valid_num:train_num+valid_num+test_num]
y_test = y_data[train_num+valid_num:train_num+valid_num+test_num]
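
The slices above take the rows in file order. If the rows of the CSV happen to be ordered in some way, one option is to shuffle them once before splitting. The sketch below does this with NumPy on synthetic data; `split_dataset`, the seed, and the array sizes are illustrative assumptions rather than the article's code.

```python
import numpy as np

def split_dataset(x, y, train_num, valid_num, seed=0):
    """Shuffle rows once, then cut into train/valid/test (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    x, y = x[order], y[order]          # features and targets stay aligned
    valid_end = train_num + valid_num
    return ((x[:train_num], y[:train_num]),
            (x[train_num:valid_end], y[train_num:valid_end]),
            (x[valid_end:], y[valid_end:]))

# Synthetic stand-in: 506 rows and 12 features (sizes are assumptions)
x_demo = np.arange(506 * 12, dtype=np.float64).reshape(506, 12)
y_demo = np.arange(506, dtype=np.float64)
(train_x, train_y), (valid_x, valid_y), (test_x, test_y) = split_dataset(
    x_demo, y_demo, train_num=300, valid_num=100)
print(train_x.shape, valid_x.shape, test_x.shape)
```

Using a fixed seed keeps the split reproducible between notebook runs.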

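Cast the data to float32 in preparation for the matrix multiplication later, because the dtype must match that of W and B.

# Cast the features to dtype tf.float32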
x_train = tf.cast(x_train,dtype =tf.float32)
x_valid = tf.cast(x_valid, dtype = tf.float32)
x_test = tf.cast(x_test, dtype = tf.float32)

5. Define the model and the variables to optimize

def model(x,w,b):
    return tf.matmul(x,w) + b

Define the variables to optimize:

# Prepare the variables
W = tf.Variable(tf.random.normal([12,1],mean=0.0, stddev=1.0, dtype=tf.float32))
# W is initialized with random values drawn from a normal distribution
B = tf.Variable(tf.zeros(1),dtype = tf.float32)
print(W)
print(B)

6. Set the hyperparameters and the loss function

# Hyperparameters
training_epochs =50
learning_rate =0.001
batch_size = 10  # number of samples per mini-batch

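Define the loss function:

# Use mean squared error as the loss function
def loss(x,y,w,b):
    # Reshape y into a column so it matches the (N, 1) model output;
    # otherwise the subtraction broadcasts to an (N, N) matrix
    y =tf.reshape(tf.cast(y,dtype=tf.float32),[-1,1])
    err =model(x,w,b)-y
    squared_err =tf.square(err)
    return tf.reduce_mean(squared_err)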
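
One detail worth checking in a loss like this is the shape of the targets. The model output has shape (N, 1), and subtracting a 1-D target of shape (N,) from it broadcasts to an (N, N) matrix, which changes the value of the mean. The NumPy sketch below illustrates the effect with made-up numbers; TensorFlow follows the same broadcasting rules.

```python
import numpy as np

pred = np.array([[1.0], [2.0], [3.0]])    # shape (3, 1), like model(x, w, b)
target = np.array([1.0, 2.0, 3.0])        # shape (3,), like a column sliced from ds

wrong_err = pred - target                  # broadcasts to shape (3, 3)
right_err = pred - target.reshape(-1, 1)   # stays at shape (3, 1)

wrong_mse = np.mean(wrong_err ** 2)
right_mse = np.mean(right_err ** 2)
print(wrong_err.shape, right_err.shape)
print(wrong_mse, right_mse)
```

Here the predictions equal the targets exactly, so the error should be zero; only the version with the reshaped target reports that.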

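7. Minimize the loss with gradient descent

def grad(x,y,w,b):
    with tf.GradientTape() as tape:
        loss_ =loss(x,y,w,b)
    # Return the gradients of the loss with respect to w and b;
    # tape.gradient is called after the with block has finished recording
    return tape.gradient(loss_,[w,b])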
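
tf.GradientTape computes these gradients automatically. For a linear model with a mean squared error loss they also have a closed form: the gradient with respect to w is (2/N)·Xᵀ(Xw + b − y), and the gradient with respect to b is twice the mean error. The following NumPy sketch, with hypothetical helper names and random data, checks that closed form against a finite-difference estimate.

```python
import numpy as np

def mse_loss(x, y, w, b):
    err = x @ w + b - y                    # y has shape (N, 1)
    return np.mean(err ** 2)

def mse_grad(x, y, w, b):
    """Closed-form MSE gradients for a linear model (hypothetical helper)."""
    n = x.shape[0]
    err = x @ w + b - y
    grad_w = 2.0 / n * (x.T @ err)         # shape (features, 1)
    grad_b = 2.0 * np.mean(err, axis=0)    # shape (1,)
    return grad_w, grad_b

# Random data, only for the check
rng = np.random.default_rng(0)
x = rng.normal(size=(20, 3))
y = rng.normal(size=(20, 1))
w = rng.normal(size=(3, 1))
b = np.zeros(1)

grad_w, grad_b = mse_grad(x, y, w, b)

# Forward-difference estimate of the derivative with respect to w[0, 0]
eps = 1e-6
w_shift = w.copy()
w_shift[0, 0] += eps
numeric_dw0 = (mse_loss(x, y, w_shift, b) - mse_loss(x, y, w, b)) / eps
print(grad_w[0, 0], numeric_dw0)
```

The two printed numbers should agree closely, which is the kind of check that can help when debugging a hand-written loss.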

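An optimizer is needed to apply the gradients; here we use the SGD optimizer from tf.keras.

# Choose the optimizer
optimizer = tf.keras.optimizers.SGD(learning_rate)
# Its apply_gradients method updates the variables from (gradient, variable) pairs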

8. Iteratively solve for W and B and display the loss

loss_list_train =[]
loss_list_valid =[]
total_step = int (train_num/batch_size)
for epoch in range(training_epochs):
    for step in range(total_step):
        xs=x_train[step*batch_size:(step+1)*batch_size,:]
        ys=y_train[step*batch_size:(step+1)*batch_size]
        
        grads = grad(xs,ys,W,B) 
        # Compute the gradients of the loss with respect to W and B
        optimizer.apply_gradients(zip(grads,[W,B]))
    loss_train =loss(x_train,y_train,W,B).numpy()
    loss_valid =loss(x_valid,y_valid,W,B).numpy()
    loss_list_train.append(loss_train)
    loss_list_valid.append(loss_valid)
    print("epoch={:3d},step={:3d},train_loss={:.4f},valid_loss={:.4f}".format(epoch+1,step+1,loss_train,loss_valid))
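
The loop above visits the mini-batches in the same order in every epoch. A common variation is to reshuffle the sample order at the start of each epoch. The sketch below shows that variation in plain NumPy with hand-written gradients on synthetic, noise-free data; the data, learning rate, and sizes are illustrative assumptions and not the article's settings.

```python
import numpy as np

rng = np.random.default_rng(42)
n, d = 200, 3
x = rng.uniform(size=(n, d))
true_w = np.array([[2.0], [-1.0], [0.5]])
y = x @ true_w + 3.0                       # synthetic targets, shape (n, 1)

w = np.zeros((d, 1))
b = np.zeros(1)
learning_rate, batch_size, epochs = 0.1, 10, 50
losses = []
for epoch in range(epochs):
    order = rng.permutation(n)             # new sample order every epoch
    for start in range(0, n, batch_size):
        idx = order[start:start + batch_size]
        err = x[idx] @ w + b - y[idx]
        # Plain SGD step using the closed-form MSE gradients
        w -= learning_rate * 2.0 / len(idx) * (x[idx].T @ err)
        b -= learning_rate * 2.0 * err.mean(axis=0)
    losses.append(float(np.mean((x @ w + b - y) ** 2)))
print(losses[0], losses[-1])
```

The per-epoch loss list plays the same role as loss_list_train in the notebook and can be plotted in the same way.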

Plot the results:

# Plot the training and validation loss per epoch
plt.xlabel("Epochs")
plt.ylabel("loss")
plt.plot(loss_list_train,'blue',label="Train_loss")
plt.plot(loss_list_valid,'red',label="Valid_loss")
plt.legend(loc=1)

9. Test the trained parameters on the test data

# Compare the prediction for one randomly chosen test sample with its actual value
test_house_id =np.random.randint(0,test_num)
y=y_test[test_house_id]
y_pred =model(x_test,W,B)[test_house_id]
y_predict =tf.reshape(y_pred,()).numpy()
print("House id",test_house_id,"Actual value",y,"predicted value",y_predict)
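
Checking a single random house gives only a spot check. To summarize the error over a whole set of predictions, a small helper along the following lines could be used; `regression_report` and the numbers passed to it are made up for illustration and are not results from this model.

```python
import numpy as np

def regression_report(y_true, y_pred):
    """Return MSE, RMSE and MAE for two arrays of values (hypothetical helper)."""
    # Flatten both inputs so (N,) targets and (N, 1) predictions line up
    y_true = np.asarray(y_true, dtype=np.float64).reshape(-1)
    y_pred = np.asarray(y_pred, dtype=np.float64).reshape(-1)
    err = y_pred - y_true
    mse = float(np.mean(err ** 2))
    return {"mse": mse, "rmse": mse ** 0.5, "mae": float(np.mean(np.abs(err)))}

# Made-up values, only to show the call
report = regression_report([10.0, 20.0, 30.0], [[11.0], [19.0], [28.0]])
print(report)
```

In the notebook, a call such as regression_report(y_test, model(x_test, W, B).numpy()) should work under the same assumptions, since the helper flattens both inputs before comparing them.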