Introduction to Linear Regression with a Python Implementation

Latest recommended article published on 2024-09-28 17:45:10

hello_gogogo

Category: Machine Learning. Tags: Machine Learning

Article link: https://blog.csdn.net/qq_32933503/article/details/78144675

The linear regression model:

With two features, $h(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2$. In general, $h(x) = \sum_{j=0}^m \theta_j x_j = \theta^T x$, where $x_0 = 1$ and $m$ is the number of features.
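The hypothesis can be evaluated in a few lines of NumPy. This is a minimal sketch: the helper name `predict` and the parameter values are made up for illustration, and the constant feature $x_0 = 1$ is prepended inside the helper.

```python
import numpy as np

def predict(theta, features):
    """Evaluate h(x) = theta^T x for each row of `features` (x_0 = 1 is prepended)."""
    features = np.atleast_2d(np.asarray(features, dtype=float))
    X = np.hstack([np.ones((features.shape[0], 1)), features])
    return X @ theta

# Hypothetical parameters theta_0, theta_1, theta_2 (illustration only)
theta = np.array([1.0, 2.0, -0.5])
# One sample with x_1 = 3 and x_2 = 4: h(x) = 1 + 2*3 - 0.5*4
print(predict(theta, [[3.0, 4.0]]))
```

Passing several rows returns one prediction per row, which is the form used by the vectorized formulas later in the article.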

The cost function for a single sample is:

$J(\theta) = \frac{1}{2} (y - h(x))^2$

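Why this cost function:

Write $Y = X\theta + \varepsilon$, that is, $y^{(i)} = \theta^Tx^{(i)} + \varepsilon^{(i)}$ for each sample. Assume the errors $\varepsilon^{(i)}$ are independent and follow a Gaussian distribution with mean 0 and variance $\delta^2$:

$P(\varepsilon^{(i)}) = \frac{1}{\sqrt{2\pi}\delta}\exp\left( -\frac{ (\varepsilon^{(i)})^2 }{ 2\delta^2 } \right)$

Substituting $\varepsilon^{(i)} = y^{(i)} - \theta^Tx^{(i)}$ gives

$\Rightarrow$ $P(y^{(i)}|x^{(i)};\theta) = \frac{1}{\sqrt{2\pi}\delta}\exp\left(- \frac{ (y^{(i)} - \theta^Tx^{(i)} )^2 }{ 2\delta^2 } \right)$ .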

The likelihood function is:

$L(\theta) = \prod_{i=1}^n P(y^{(i)}|x^{(i)};\theta)$

Taking the logarithm of the likelihood function:

$l(\theta) = \log L(\theta) = \sum_{i=1}^n \log\left( \frac{1}{\sqrt{2\pi}\delta}\exp\left(- \frac{ (y^{(i)} - \theta^Tx^{(i)} )^2 }{ 2\delta^2 } \right) \right)$

$= n\log\frac{1}{\sqrt{2\pi}\delta} - \frac{1}{\delta^2} \cdot \frac{1}{2} \sum_{i=1}^n(y^{(i)} - \theta^Tx^{(i)})^2$

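The first term does not depend on $\theta$, so maximizing the likelihood is the same as minimizing $\frac{1}{2} \sum_{i=1}^n(y^{(i)} - \theta^Tx^{(i)})^2$. The cost function is therefore:

$J(\theta) = \frac{1}{2} \sum_{i=1}^n(y^{(i)} - \theta^Tx^{(i)})^2 = \frac{1}{2}(X\theta - Y)^T(X\theta - Y)$ ,

where the rows of $X$ are the sample vectors $(x^{(i)})^T$.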
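The equivalence between the sum form and the matrix form, and the link to the log-likelihood, can be checked numerically. The sketch below reuses the seven data points from the article's Python example; the two parameter vectors `theta_a` and `theta_b` and the choice `delta=1.0` are arbitrary values picked for illustration.

```python
import numpy as np

# Data from the article's example; rows of X are x^(i) with x_0 = 1
x1 = np.array([-1, -2, 0, 0.5, 1.2, 0.9, -0.3])
Y = np.array([1.2, 0.3, 3, 2.65, 3.1, 2.75, 1.5])
X = np.column_stack([np.ones_like(x1), x1])

def cost(theta):
    """Matrix form: 1/2 (X theta - Y)^T (X theta - Y)."""
    r = X @ theta - Y
    return 0.5 * (r @ r)

def cost_as_sum(theta):
    """Sum form: 1/2 sum_i (y_i - theta^T x_i)^2."""
    return 0.5 * sum((y_i - theta @ x_i) ** 2 for x_i, y_i in zip(X, Y))

def log_likelihood(theta, delta=1.0):
    """Gaussian log-likelihood with standard deviation delta (assumed known)."""
    n = len(Y)
    return n * np.log(1.0 / (np.sqrt(2 * np.pi) * delta)) - cost(theta) / delta**2

theta_a = np.array([1.0, 1.0])   # arbitrary parameter vectors for comparison
theta_b = np.array([2.0, 1.0])
print(cost(theta_a), cost_as_sum(theta_a))               # the two forms agree
print(log_likelihood(theta_a), log_likelihood(theta_b))  # lower cost, higher likelihood
```

Whichever parameter vector has the lower cost also has the higher log-likelihood, because the two quantities differ only by a constant and a negative scale factor.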

Optimal parameters in the least-squares sense:

$\frac{\partial J(\theta)}{\partial \theta} = X^TX\theta - X^TY$ . Setting the gradient to zero to find the stationary point,

$\Rightarrow \theta = (X^TX)^{-1}X^TY$
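As a small sketch of the closed-form solution, the snippet below applies it to the data from the article's Python example. It solves the linear system $X^TX\theta = X^TY$ with `np.linalg.solve` instead of forming the inverse explicitly, which is a numerical-stability choice rather than something the article prescribes.

```python
import numpy as np

# Data from the article's example; rows of X are x^(i) with x_0 = 1
x1 = np.array([-1, -2, 0, 0.5, 1.2, 0.9, -0.3])
Y = np.array([1.2, 0.3, 3, 2.65, 3.1, 2.75, 1.5])
X = np.column_stack([np.ones_like(x1), x1])

# Normal equation: solve (X^T X) theta = X^T Y
theta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# The gradient X^T X theta - X^T Y should vanish at the solution
gradient_at_solution = X.T @ (X @ theta_hat - Y)

# Cross-check against NumPy's least-squares solver
theta_lstsq = np.linalg.lstsq(X, Y, rcond=None)[0]
print(theta_hat, theta_lstsq, gradient_at_solution)
```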

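When $X^TX$ is not invertible, or to prevent overfitting, a penalty term can be added, giving Lasso regression (L1 penalty) or ridge regression (L2 penalty). The ridge cost function is:

$J(\theta) = \frac{1}{2} \sum_{i=1}^n(y^{(i)} - \theta^Tx^{(i)})^2 + \frac{1}{2}\lambda\sum_{j=0}^m \theta_j^2$

Here the penalty runs over all $m+1$ components of $\theta$, so it equals $\frac{\lambda}{2}\theta^T\theta$. Taking the gradient with respect to $\theta$:

$\frac{\partial J(\theta)}{\partial \theta} = X^TX\theta - X^TY + \lambda\theta$ ,

$\Rightarrow \theta = (X^TX + \lambda I)^{-1}X^TY$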
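The sketch below illustrates the ridge closed form in a case where $X^TX$ is singular. The article's single feature is deliberately duplicated to force the singularity; the helper name `ridge` and the two $\lambda$ values are assumptions made for illustration.

```python
import numpy as np

x1 = np.array([-1, -2, 0, 0.5, 1.2, 0.9, -0.3])
Y = np.array([1.2, 0.3, 3, 2.65, 3.1, 2.75, 1.5])
# Duplicating the feature column makes X^T X singular on purpose
X = np.column_stack([np.ones_like(x1), x1, x1])

def ridge(X, Y, lam):
    """Solve (X^T X + lam * I) theta = X^T Y."""
    k = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(k), X.T @ Y)

rank = np.linalg.matrix_rank(X.T @ X)   # smaller than the number of columns
theta_small = ridge(X, Y, 0.001)        # weak regularization
theta_large = ridge(X, Y, 10.0)         # strong regularization shrinks theta
print(rank, theta_small, theta_large)
```

With any $\lambda > 0$ the matrix $X^TX + \lambda I$ is positive definite, so the system has a unique solution even though $X^TX$ alone does not.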

Solving for $\theta$ with gradient descent:

$J(\theta) = \frac{1}{2} \sum_{i=1}^n(y^{(i)} - \theta^Tx^{(i)})^2$

For a single sample, the partial derivative with respect to $\theta_j$ is

$\Rightarrow \frac{\partial J(\theta)}{\partial \theta_j} = (y - h(x))\frac{\partial(y-h(x))}{\partial\theta_j} = (h(x) - y)x_j$
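One way to gain confidence in this derivative is to compare it with a finite-difference estimate. The sketch below sums the per-sample gradient $(h(x) - y)x_j$ over the data from the article's Python example; the helper names, the test point `theta` and the step `eps` are illustrative choices.

```python
import numpy as np

x1 = np.array([-1, -2, 0, 0.5, 1.2, 0.9, -0.3])
Y = np.array([1.2, 0.3, 3, 2.65, 3.1, 2.75, 1.5])
X = np.column_stack([np.ones_like(x1), x1])

def cost(theta):
    r = X @ theta - Y
    return 0.5 * (r @ r)

def gradient(theta):
    """Analytic gradient: sum over samples of (h(x) - y) * x_j."""
    return X.T @ (X @ theta - Y)

def numerical_gradient(theta, eps=1e-6):
    """Central finite differences, one coordinate at a time."""
    grad = np.zeros_like(theta)
    for j in range(len(theta)):
        step = np.zeros_like(theta)
        step[j] = eps
        grad[j] = (cost(theta + step) - cost(theta - step)) / (2 * eps)
    return grad

theta = np.array([1.0, 1.0])   # arbitrary test point
print(gradient(theta), numerical_gradient(theta))
```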

Batch gradient descent ($n$ is the number of samples; every update uses all of them):
Repeat until convergence {
$\theta_j := \theta_j - \alpha \sum_{i=1}^n(h(x^{(i)}) - y^{(i)})x_j^{(i)}$ (for every $j$)
}
Stochastic gradient descent ($n$ is the number of samples; every update uses a single sample):
Loop {
for i = 1 to n
$\theta_j := \theta_j - \alpha(h(x^{(i)}) - y^{(i)}) x_j^{(i)}$ (for every $j$)
}
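The stochastic update can be sketched as follows. The function name `sgd`, the random reshuffling of the samples in each pass, and the fixed seed are assumptions added for illustration; the step size and the number of passes follow the values used in the article's code.

```python
import numpy as np

def sgd(X, Y, alpha=0.01, epochs=500, seed=0):
    """Stochastic gradient descent: one parameter update per sample."""
    rng = np.random.default_rng(seed)
    theta = np.ones(X.shape[1])
    for _ in range(epochs):
        # Visit the samples in a fresh random order on every pass (an assumption)
        for i in rng.permutation(len(Y)):
            error = X[i] @ theta - Y[i]           # h(x^(i)) - y^(i)
            theta = theta - alpha * error * X[i]  # updates every theta_j at once
    return theta

x1 = np.array([-1, -2, 0, 0.5, 1.2, 0.9, -0.3])
Y = np.array([1.2, 0.3, 3, 2.65, 3.1, 2.75, 1.5])
X = np.column_stack([np.ones_like(x1), x1])

theta_sgd = sgd(X, Y)
theta_exact = np.linalg.solve(X.T @ X, X.T @ Y)   # least-squares reference
print(theta_sgd, theta_exact)
```

With a constant step size the iterates keep fluctuating around the least-squares solution instead of settling on it exactly, so the comparison should be read as approximate.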
Python implementation:

# -*- coding: utf-8 -*-
import numpy as np
import matplotlib.pyplot as plt

if __name__ == "__main__":
    # Note: in this code m counts samples and n counts parameters,
    # whereas the formulas above use n for the number of samples.
    m = 7                         # m is the number of samples
    n = 2                         # n is the number of features + 1
    x = np.ones((n, m))           # row 0 is the constant feature x_0 = 1
    x[1] = [-1, -2, 0, 0.5, 1.2, 0.9, -0.3]
    y = np.array([1.2, 0.3, 3, 2.65, 3.1, 2.75, 1.5])
    a = 0.01                      # learning rate
    Lamda = 0.001                 # L2 regularization coefficient
    theta = np.ones(n)
    # Batch gradient descent: every update uses all m samples
    for i in range(1000):
        y_hat = theta.dot(x)                       # predictions for all samples
        grad = x.dot(y_hat - y) + Lamda * theta    # gradient of the ridge cost
        theta = theta - a * grad
    # Stochastic gradient descent: every update uses a single sample
    # for i in range(500):
    #     for j in range(m):
    #         y_hat = theta.dot(x[:, j])
    #         theta = theta - a * (y_hat - y[j]) * x[:, j]
    print(theta)
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.scatter(x[1], y, s=30, c='red', marker='s')
    xc = np.arange(-3.0, 3.0, 0.1)           # range of x values to plot
    y_hat = theta[0] + theta[1] * xc         # fitted line computed from the parameters
    ax.plot(xc, y_hat)
    plt.xlabel('X')
    plt.ylabel('Y')
    plt.show()
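One way to sanity-check a gradient-descent result is to compare it with the ridge closed form $\theta = (X^TX + \lambda I)^{-1}X^TY$. The sketch below assumes the vectorized batch update of the regularized cost, with the same data, learning rate and `Lamda` as the script; the variable names `theta_gd` and `theta_ridge` are introduced here for illustration.

```python
import numpy as np

x1 = np.array([-1, -2, 0, 0.5, 1.2, 0.9, -0.3])
Y = np.array([1.2, 0.3, 3, 2.65, 3.1, 2.75, 1.5])
X = np.column_stack([np.ones_like(x1), x1])   # rows are samples here
a, Lamda = 0.01, 0.001

# Regularized batch gradient descent, vectorized over all samples
theta_gd = np.ones(2)
for _ in range(1000):
    theta_gd = theta_gd - a * (X.T @ (X @ theta_gd - Y) + Lamda * theta_gd)

# Ridge closed form for the same cost function
theta_ridge = np.linalg.solve(X.T @ X + Lamda * np.eye(2), X.T @ Y)
print(theta_gd, theta_ridge)
```

If the two vectors disagree noticeably, the usual suspects are a learning rate that is too large, too few iterations, or a gradient that does not match the cost being minimized.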