Multivariate Linear Regression

Summary

Multivariate linear regression works much like the single-variable case; here two features are used as an example. Note that theta is initialized with shape (1, 3), and the features must be scaled (ideally into roughly the (-1, 1) range) so that gradient descent converges quickly. The simplest method is mean normalization: subtract each column's mean and divide by its standard deviation, i.e. x' = (x - mean) / std.
The data points and the fitted predictions are also plotted. With more than two features the predictions generally cannot be visualized; the plot here was just a curiosity-driven experiment to see what the result looks like.
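The scaling step above can be sketched on a few rows of toy data. One detail worth noting: pandas' `data.std()` uses the sample standard deviation (ddof=1), so the NumPy version below passes `ddof=1` to match.

```python
import numpy as np

# Toy feature matrix: house size (sq ft) and bedroom count.
X = np.array([[2104.0, 3.0],
              [1600.0, 3.0],
              [2400.0, 3.0],
              [1416.0, 2.0]])

# Mean normalization: subtract each column's mean, divide by its std.
mu = X.mean(axis=0)
sigma = X.std(axis=0, ddof=1)  # ddof=1 matches pandas' default .std()
X_scaled = (X - mu) / sigma

# Each scaled column now has mean 0 and sample std 1.
print(X_scaled.mean(axis=0))
print(X_scaled.std(axis=0, ddof=1))
```

After scaling, keep `mu` and `sigma` around: any new input must be scaled with the training statistics before prediction.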

Implementation

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
'''
Read the data file
'''
path = "../Doc/ex1data2.txt"
data = pd.read_csv(path, names=['Size', 'Bedroom', 'Price'])
print("Data:\n{}".format(data.head()))

'''
Feature scaling (mean normalization)
'''
data = (data - data.mean()) / data.std()
print("Data after feature scaling:\n{}".format(data.head()))

'''
Initialize the data
'''
data.insert(0, "Ones", 1)  # position, column name, value
cols = data.shape[1]  # data.shape = (47, 4)
X = data.iloc[:, 0:cols-1]
y = data.iloc[:, cols-1:cols]
print("X matrix:\n{}\ny vector:\n{}".format(X.head(), y.head()))
X = np.matrix(X.values)
y = np.matrix(y.values)
theta = np.matrix(np.zeros((1, 3)))  # one parameter per column of X
print(X.shape, y.shape, theta.shape)

'''
Cost function
'''
def computeCost(X, y, theta):
    # J = (1 / 2m) * sum((X * theta.T - y) ** 2)
    inner = np.power((X * theta.T) - y, 2)
    return np.sum(inner) / (2 * len(X))

'''
Batch gradient descent
'''
def gradientDescent(X, y, theta, alpha, iters):  # iters: number of iterations
    temp = np.matrix(np.zeros(theta.shape))
    parameters = int(theta.ravel().shape[1])  # number of parameters
    cost = np.zeros(iters)
    for i in range(iters):
        error = (X * theta.T) - y
        for j in range(parameters):  # j = 0, 1, 2, ...: one update per parameter
            term = np.multiply(error, X[:, j])
            temp[0, j] = theta[0, j] - (alpha / len(X)) * np.sum(term)
        theta = temp  # all parameters are updated simultaneously
        cost[i] = computeCost(X, y, theta)
    return theta, cost
'''
Main run
'''
alpha = 0.01
iters = 1000
g, cost = gradientDescent(X, y, theta, alpha, iters)
print(g, computeCost(X, y, g))
'''
3D plot
'''
fig = plt.figure()
x1 = np.linspace(data.Size.min(), data.Size.max(), 100)
y1 = np.linspace(data.Bedroom.min(), data.Bedroom.max(), 100)
z1 = g[0, 0] + g[0, 1] * x1 + g[0, 2] * y1
ax = fig.add_subplot(111, projection='3d')
ax.scatter(data.Size, data.Bedroom, data.Price)
ax.set_xlabel('Size')
ax.set_ylabel('Bedroom')
ax.set_zlabel('Price')
ax.plot(x1, y1, z1)  # a line across the fitted plane, not the full surface
plt.show()

plt.plot(np.arange(iters), cost, 'b')
plt.xlabel("Iterations")
plt.ylabel("Cost")
plt.title("Cost vs. Iterations")
plt.show()
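As a sanity check on the gradient-descent result, linear regression also has a closed-form solution, the normal equation theta = (X^T X)^(-1) X^T y. Below is a minimal self-contained sketch; the matrix is made-up standardized data mirroring the shape of the X built above, not the real ex1data2.txt values.

```python
import numpy as np

# Made-up standardized design matrix with an intercept column.
X = np.array([[1.0,  0.13, -0.22],
              [1.0, -0.50, -0.22],
              [1.0,  0.50, -0.22],
              [1.0, -0.74, -1.54]])
y = np.array([[0.48], [-0.08], [0.23], [-0.87]])

# Normal equation solved via least squares, which is numerically
# safer than forming an explicit matrix inverse.
theta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(theta.ravel())
```

With enough iterations and a suitable learning rate, batch gradient descent should converge to approximately this same solution.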

Data

2104,3,399900
1600,3,329900
2400,3,369000
1416,2,232000
3000,4,539900
1985,4,299900
1534,3,314900
1427,3,198999
1380,3,212000
1494,3,242500
1940,4,239999
2000,3,347000
1890,3,329999
4478,5,699900
1268,3,259900
2300,4,449900
1320,2,299900
1236,3,199900
2609,4,499998
3031,4,599000
1767,3,252900
1888,2,255000
1604,3,242900
1962,4,259900
3890,3,573900
1100,3,249900
1458,3,464500
2526,3,469000
2200,3,475000
2637,3,299900
1839,2,349900
1000,1,169900
2040,4,314900
3137,3,579900
1811,4,285900
1437,3,249900
1239,3,229900
2132,4,345000
4215,4,549000
2162,4,287000
1664,2,368500
2238,3,329900
2567,4,314000
1200,3,299000
852,2,179900
1852,4,299900
1203,3,239500

Results

Data:
   Size  Bedroom   Price
0  2104        3  399900
1  1600        3  329900
2  2400        3  369000
3  1416        2  232000
4  3000        4  539900
Data after feature scaling:
       Size   Bedroom     Price
0  0.130010 -0.223675  0.475747
1 -0.504190 -0.223675 -0.084074
2  0.502476 -0.223675  0.228626
3 -0.735723 -1.537767 -0.867025
4  1.257476  1.090417  1.595389
X matrix:
   Ones      Size   Bedroom
0     1  0.130010 -0.223675
1     1 -0.504190 -0.223675
2     1  0.502476 -0.223675
3     1 -0.735723 -1.537767
4     1  1.257476  1.090417
y vector:
      Price
0  0.475747
1 -0.084074
2  0.228626
3 -0.867025
4  1.595389

(47, 3) (47, 1) (1, 3)

[[-1.10910099e-16  8.78503652e-01 -4.69166570e-02]] 0.1307033696077189

[Figure: 3D scatter of the data with the fitted regression line]
[Figure: cost vs. iterations curve]
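Because both the features and the target were standardized, the fitted theta lives in scaled space: the intercept is essentially 0, and a prediction must be un-scaled to get a price in dollars. The sketch below uses the parameters printed above; the mean/std values are assumed placeholders, and in practice you would reuse `data.mean()` and `data.std()` saved before scaling.

```python
import numpy as np

theta = np.array([0.0, 0.8785, -0.0469])  # parameters from the run above
# Assumed training statistics (placeholders, not the exact values):
size_mu, size_sigma = 2000.7, 794.7
bed_mu, bed_sigma = 3.17, 0.76
price_mu, price_sigma = 340413.0, 125040.0

size, bedrooms = 1650.0, 3.0
# Scale the new input with the training statistics, add the intercept term.
x = np.array([1.0,
              (size - size_mu) / size_sigma,
              (bedrooms - bed_mu) / bed_sigma])
price = (x @ theta) * price_sigma + price_mu  # undo the target scaling
print(round(price))
```

The un-scaling step at the end is easy to forget: without it the model outputs a standardized price near 0 rather than a dollar amount.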
