Univariate Linear Regression: A Hand-Written Python Implementation

大颗白菜

Article link: https://blog.csdn.net/weixin_46217375/article/details/106735643

Univariate linear regression algorithm

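The two main libraries used here are numpy and matplotlib.
numpy: an open-source numerical computing extension for Python that provides a large library of mathematical functions for array operations.
matplotlib: a library used mainly for plotting.
I cannot explain all of the mathematics behind the algorithm in detail, so I will not attempt to. If math is not your strong suit, it is enough to understand two formulas, the prediction model and the cost function, and to grasp how the algorithm works.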
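
For reference, here is a sketch of the two formulas as they are implemented by `hypo` and `cost` in the code in this post, together with the update step used in `gradient`. The notation is my own transcription of that code: m is the number of samples, α is the learning rate (`alpha`), and X is the data matrix with a leading column of ones.

```latex
% Prediction model (hypothesis)
h_\theta(x) = \theta_0 + \theta_1 x

% Cost function: half the mean squared error over m samples
J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta\big(x^{(i)}\big) - y^{(i)} \right)^2

% Gradient descent update in matrix form, repeated every iteration
\theta := \theta - \frac{\alpha}{m} X^{T} \left( X\theta - Y \right)
```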

Here is my dataset (the code reads it from t1.txt, one x,y pair per line):
6.1101,17.592
5.5277,9.1302
8.5186,13.662
7.0032,11.854
5.8598,6.8233
8.3829,11.886
7.4764,4.3483
8.5781,12
6.4862,6.5987
5.0546,3.8166
5.7107,3.2522
14.164,15.505
5.734,3.1551
8.4084,7.2258
5.6407,0.71618
5.3794,3.5129
6.3654,5.3048
5.1301,0.56077
6.4296,3.6518
7.0708,5.3893
6.1891,3.1386
20.27,21.767
5.4901,4.263
6.3261,5.1875
5.5649,3.0825
18.945,22.638
12.828,13.501
10.957,7.0467
13.176,14.692
22.203,24.147
5.2524,-1.22
6.5894,5.9966
9.2482,12.134
5.8918,1.8495
8.2111,6.5426
7.9334,4.5623
8.0959,4.1164
5.6063,3.3928
12.836,10.117
6.3534,5.4974
5.4069,0.55657
6.8825,3.9115
11.708,5.3854
5.7737,2.4406
7.8247,6.7318
7.0931,1.0463
5.0702,5.1337
5.8014,1.844
11.7,8.0043
5.5416,1.0179
7.5402,6.7504
5.3077,1.8396
7.4239,4.2885
7.6031,4.9981
6.3328,1.4233
6.3589,-1.4211
6.2742,2.4756
5.6397,4.6042
9.3102,3.9624
9.4536,5.4141
8.8254,5.1694
5.1793,-0.74279
21.279,17.929
14.908,12.054
18.959,17.054
7.2182,4.8852
8.2951,5.7442
10.236,7.7754
5.4994,1.0173
20.341,20.992
10.136,6.6799
7.3345,4.0259
6.0062,1.2784
7.2259,3.3411
5.0269,-2.6807
6.5479,0.29678
7.5386,3.8845
5.0365,5.7014
10.274,6.7526
5.1077,2.0576
5.7292,0.47953
5.1884,0.20421
6.3557,0.67861
9.7687,7.5435
6.5159,5.3436
8.5172,4.2415
9.1802,6.7981
6.002,0.92695
5.5204,0.152
5.0594,2.8214
5.7077,1.8451
7.6366,4.2959
5.8707,7.2029
5.3054,1.9869
8.2934,0.14454
13.394,9.0551
5.4369,0.61705
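
To check the file format before creating t1.txt, a few rows can be parsed from an in-memory string. This is only a minimal sketch: it uses the first three rows of the dataset above, and the name `sample_text` is made up for the illustration.

```python
import io

import numpy as np

# First three rows of the dataset above, held in memory instead of t1.txt
sample_text = """6.1101,17.592
5.5277,9.1302
8.5186,13.662
"""

# np.loadtxt accepts a file-like object as well as a file name
data = np.loadtxt(io.StringIO(sample_text), delimiter=',')

x = data[:, :-1]  # every column except the last, kept 2-D: shape (3, 1)
y = data[:, -1]   # the last column as a 1-D array: shape (3,)

print(data.shape, x.shape, y.shape)
```

Slicing with `:-1` keeps `x` two-dimensional, which is why a column of ones can later be stacked next to it directly.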

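# Import the two libraries we need
import numpy as np
import matplotlib.pyplot as plt

# Load the data: the first argument is the file name, delimiter=',' means the values are comma-separated
data=np.loadtxt('t1.txt',delimiter=',')
# Split the data into x (features) and y (targets)
x=data[:,:-1]
y=data[:,-1]

# Prepend a column of ones to x so that theta[0] acts as the intercept
X=np.c_[np.ones(len(x)),x]
# Turn y into a column vector of shape (m, 1)
Y=np.c_[y]
print(X)
print(Y)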

# For this supervised-learning problem we can write the prediction model h as:
#   hθ(x) = θ0 + θ1·x
def hypo(X,theta):
    h=np.dot(X,theta)
    return h

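# Cost function: half the mean squared error between the predictions and the true values.
# Training looks for the theta that makes this cost as small as possible.
def cost(h,Y):
    # Number of samples, taken from Y rather than from the global X
    m=len(Y)
    j=1.0/(2*m)*np.sum(np.square(h-Y))
    return j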

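# Gradient descent: repeatedly evaluate the prediction model and the cost function,
# updating theta each time to move toward the minimum cost so the model fits better.
# In other words, it optimizes the model.
def gradient(X,Y,item=1500,alpha=0.01):
    # item: number of iterations, alpha: learning rate
    m,n=X.shape
    # j_history: stores the cost at every iteration
    j_history=np.zeros(item)
    # The shape of theta follows from X: for the matrix product X·theta,
    # the number of columns of X must equal the number of rows of theta
    theta=np.zeros((n,1))
    for i in range(item):
        h=hypo(X,theta)
        j_history[i]=cost(h,Y)
        # Gradient of the cost with respect to theta
        deltatheta=1.0/m*np.dot(X.T,(h-Y))
        theta-=deltatheta*alpha
    return j_history,theta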
    
if __name__ == '__main__':
    j_history,theta=gradient(X,Y)
    # Plot the cost recorded at each iteration
    plt.figure('Cost curve')
    plt.plot(j_history,color='r')
    plt.xlabel('Iteration')
    plt.ylabel('Cost')
    plt.show()
    print(theta)
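
One way to gain confidence in the update rule is to compare its result with a closed-form least-squares solution. The following is a minimal sketch, not part of the original script: the toy data (points lying exactly on y = 1 + 2x), the function name `gradient_descent`, and the chosen `iterations` and `alpha` values are all assumptions made for the illustration.

```python
import numpy as np


def gradient_descent(X, Y, iterations=1500, alpha=0.01):
    """Batch gradient descent for linear regression, same update rule as in this post."""
    m, n = X.shape
    theta = np.zeros((n, 1))
    for _ in range(iterations):
        h = X @ theta                          # predictions, shape (m, 1)
        theta -= alpha / m * (X.T @ (h - Y))   # step against the gradient
    return theta


# Hypothetical toy data lying exactly on the line y = 1 + 2x
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
X = np.c_[np.ones(len(x)), x]   # column of ones for the intercept
Y = np.c_[1.0 + 2.0 * x]        # targets as a column vector

theta_gd = gradient_descent(X, Y, iterations=5000, alpha=0.05)

# Closed-form least-squares solution of X @ theta = Y for comparison
theta_ls, *_ = np.linalg.lstsq(X, Y, rcond=None)

print(theta_gd.ravel())
print(theta_ls.ravel())
```

If the two results disagree noticeably on your own data, the usual suspects are a learning rate that is too large (the cost curve rises instead of falling) or too few iterations.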

[Figure: cost curve]

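# Predictions of the trained model for every sample
h=hypo(X,theta)
plt.figure('Prediction')
# Blue dots: the original data; red line: the fitted model
plt.scatter(X[:,-1],Y,c='blue')
plt.plot(X[:,-1],h,'r')
plt.show()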
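
Once theta has been learned, predicting for inputs that are not in the dataset follows the same pattern: add the column of ones, then multiply by theta. A minimal sketch, assuming a helper named `predict` (not in the original code) and placeholder theta values that are purely illustrative, not the result of fitting the dataset above:

```python
import numpy as np


def predict(x_new, theta):
    """Return theta0 + theta1 * x for each value in x_new as a column vector."""
    x_new = np.asarray(x_new, dtype=float).ravel()
    X_new = np.c_[np.ones(len(x_new)), x_new]  # same layout as the training X
    return X_new @ theta


# Placeholder parameters for illustration only (theta0 = -3, theta1 = 1)
theta = np.array([[-3.0], [1.0]])

preds = predict([5.0, 10.0], theta)
print(preds.ravel())
```

In the real script you would pass the theta returned by `gradient(X,Y)` instead of the placeholder.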

[Figure: prediction line]
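In my view, univariate linear regression is probably the simplest algorithm in machine learning and a good place to start. Few people write it by hand any more, though; most simply call sklearn. I will cover that in a follow-up post:
Implementing univariate linear regression with sklearn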
