Linear Regression: Deriving and Implementing the Closed-Form Solution

Formula 1:

$$
w_1^*=\frac{\sum_i y_i\left(x_i-\frac{1}{n}\sum_i x_i\right)}{\sum_i x_i^2-\frac{1}{n}\left(\sum_i x_i\right)^2}
$$
To derive the formula above, start from the following two conditions:
$$
\begin{cases}
w_0^*=\left(\frac{1}{n}\sum_i y_i\right)-w_1^*\left(\frac{1}{n}\sum_i x_i\right)\\
w_1^*=-\sum_i x_i\left(w_0^*-y_i\right)/\sum_i x_i^2
\end{cases}
$$
This amounts to solving a system of two equations, so it is enough to eliminate $w_0^*$. Expanding the second condition gives:
$$
w_1^*=\frac{-w_0^*\sum_i x_i+\sum_i x_i y_i}{\sum_i x_i^2} \tag{1}
$$

Multiplying the first condition by $\frac{\sum_i x_i}{\sum_i x_i^2}$ gives:
$$
\begin{aligned}
\left(\frac{\sum_i x_i}{\sum_i x_i^2}\right) w_0^*
&= \left(\frac{\sum_i x_i}{\sum_i x_i^2}\right)\left(\left(\frac{1}{n}\sum_i y_i\right)-w_1^*\left(\frac{1}{n}\sum_i x_i\right)\right) \\
&= \frac{\left(\sum_i x_i\right)\left(\frac{1}{n}\sum_i y_i\right)}{\sum_i x_i^2}-\frac{w_1^*\,\frac{1}{n}\left(\sum_i x_i\right)^2}{\sum_i x_i^2}
\end{aligned} \tag{2}
$$


Substituting (2) into (1) to eliminate $w_0^*$:
$$
\begin{aligned}
& w_1^*-\frac{w_1^*\,\frac{1}{n}\left(\sum_i x_i\right)^2}{\sum_i x_i^2}+\frac{\frac{1}{n}\sum_i x_i\sum_i y_i}{\sum_i x_i^2}=\frac{\sum_i x_i y_i}{\sum_i x_i^2} \\
\Longrightarrow\ & w_1^*\left(\frac{\sum_i x_i^2-\frac{1}{n}\left(\sum_i x_i\right)^2}{\sum_i x_i^2}\right)=\frac{\sum_i x_i y_i-\frac{1}{n}\sum_i x_i\sum_i y_i}{\sum_i x_i^2} \\
\Longrightarrow\ & w_1^*=\frac{\sum_i y_i\left(x_i-\frac{1}{n}\sum_i x_i\right)}{\sum_i x_i^2-\frac{1}{n}\left(\sum_i x_i\right)^2}=\frac{\sum_i x_i\left(y_i-\frac{1}{n}\sum_i y_i\right)}{\sum_i x_i^2-\frac{1}{n}\left(\sum_i x_i\right)^2}
\end{aligned}
$$
This completes the derivation.
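
As a quick numerical sanity check of Formula 1, the sketch below evaluates both equivalent forms of the numerator on synthetic data and compares the result with `np.polyfit`. The helper name `slope_intercept_closed_form`, the seed, and the synthetic data are illustrative assumptions, not part of the original derivation.

```python
import numpy as np


def slope_intercept_closed_form(x, y):
    """Return (w1, w0) for y ≈ w1 * x + w0 using Formula 1 (hypothetical helper)."""
    n = x.shape[0]
    x_mean = np.sum(x) / n
    y_mean = np.sum(y) / n
    denom = np.sum(x ** 2) - np.sum(x) ** 2 / n
    # Two equivalent forms of the numerator from the last derivation step.
    w1_a = np.sum(y * (x - x_mean)) / denom
    w1_b = np.sum(x * (y - y_mean)) / denom
    assert np.isclose(w1_a, w1_b)
    # Intercept from the first condition: w0 = mean(y) - w1 * mean(x).
    w0 = y_mean - w1_a * x_mean
    return w1_a, w0


rng = np.random.default_rng(0)   # fixed seed, chosen only for reproducibility
x = rng.random(50) * 20
y = 0.5 * x + rng.standard_normal(50)

w1, w0 = slope_intercept_closed_form(x, y)
ref_w1, ref_w0 = np.polyfit(x, y, 1)   # degree-1 least-squares fit as a reference
print(np.allclose([w1, w0], [ref_w1, ref_w0]))
```

`np.polyfit` returns coefficients from the highest degree down, so its first entry is the slope and its second is the intercept.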

Formula 2:

$$
\widehat{\mathrm{w}}=(\mathrm{X}^{\mathrm{T}}\mathrm{X})^{-1}\mathrm{X}^{\mathrm{T}}\mathrm{Y}
$$
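To derive the formula above, start from the following objective:
$$
\operatorname*{arg\,min}_{w_0,w_1} L(\widehat{\mathrm{w}}),\qquad L(\widehat{\mathrm{w}})=\|\mathrm{Y}-\mathrm{X}\widehat{\mathrm{w}}\|^2
$$
Here each row of $\mathrm{X}$ is $(x_i, 1)$ and $\widehat{\mathrm{w}}=(w_1, w_0)^{\mathrm{T}}$, matching the column order used in the code below.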

First expand the squared norm:
$$
\begin{aligned}
\|\mathrm{X}\widehat{\mathrm{w}}-\mathrm{Y}\|^2
&= (\mathrm{X}\widehat{\mathrm{w}}-\mathrm{Y})^{\mathrm{T}}(\mathrm{X}\widehat{\mathrm{w}}-\mathrm{Y}) \\
&= (\widehat{\mathrm{w}}^{\mathrm{T}}\mathrm{X}^{\mathrm{T}}-\mathrm{Y}^{\mathrm{T}})(\mathrm{X}\widehat{\mathrm{w}}-\mathrm{Y}) \\
&= \widehat{\mathrm{w}}^{\mathrm{T}}\mathrm{X}^{\mathrm{T}}\mathrm{X}\widehat{\mathrm{w}}-\widehat{\mathrm{w}}^{\mathrm{T}}\mathrm{X}^{\mathrm{T}}\mathrm{Y}-\mathrm{Y}^{\mathrm{T}}\mathrm{X}\widehat{\mathrm{w}}+\mathrm{Y}^{\mathrm{T}}\mathrm{Y}
\end{aligned}
$$


Differentiate with respect to $\widehat{\mathrm{w}}$ and set the derivative to zero:
$$
\frac{\partial\left(\|\mathrm{X}\widehat{\mathrm{w}}-\mathrm{Y}\|^2\right)}{\partial\widehat{\mathrm{w}}}
=\frac{\partial\left(\widehat{\mathrm{w}}^{\mathrm{T}}\mathrm{X}^{\mathrm{T}}\mathrm{X}\widehat{\mathrm{w}}-\widehat{\mathrm{w}}^{\mathrm{T}}\mathrm{X}^{\mathrm{T}}\mathrm{Y}-\mathrm{Y}^{\mathrm{T}}\mathrm{X}\widehat{\mathrm{w}}+\mathrm{Y}^{\mathrm{T}}\mathrm{Y}\right)}{\partial\widehat{\mathrm{w}}}=0
$$


The following matrix-derivative identities are used:
$$
\frac{\partial\left(\widehat{\mathrm{w}}^{\mathrm{T}}\mathrm{X}^{\mathrm{T}}\mathrm{X}\widehat{\mathrm{w}}\right)}{\partial\widehat{\mathrm{w}}}=2\mathrm{X}^{\mathrm{T}}\mathrm{X}\widehat{\mathrm{w}}
$$

$$
\frac{\partial\left(\widehat{\mathrm{w}}^{\mathrm{T}}\mathrm{X}^{\mathrm{T}}\mathrm{Y}\right)}{\partial\widehat{\mathrm{w}}}=\mathrm{X}^{\mathrm{T}}\mathrm{Y}
$$

$$
\frac{\partial\left(\mathrm{Y}^{\mathrm{T}}\mathrm{X}\widehat{\mathrm{w}}\right)}{\partial\widehat{\mathrm{w}}}=\mathrm{X}^{\mathrm{T}}\mathrm{Y}
$$

$$
\frac{\partial\left(\mathrm{Y}^{\mathrm{T}}\mathrm{Y}\right)}{\partial\widehat{\mathrm{w}}}=0
$$
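
The first three identities can be checked numerically with central finite differences. The sketch below is only an illustration: the random matrices, the seed, and the helper `numerical_gradient` are assumptions introduced here.

```python
import numpy as np

rng = np.random.default_rng(1)      # arbitrary seed for reproducibility
X = rng.standard_normal((6, 2))     # illustrative design matrix
Y = rng.standard_normal(6)
w = rng.standard_normal(2)


def numerical_gradient(f, w, eps=1e-6):
    """Central-difference approximation of the gradient of a scalar function f at w."""
    grad = np.zeros_like(w)
    for k in range(w.size):
        step = np.zeros_like(w)
        step[k] = eps
        grad[k] = (f(w + step) - f(w - step)) / (2 * eps)
    return grad


# d(w^T X^T X w)/dw = 2 X^T X w
quad_ok = np.allclose(numerical_gradient(lambda v: v @ X.T @ X @ v, w),
                      2 * X.T @ X @ w, atol=1e-5)
# d(w^T X^T Y)/dw = X^T Y
lin1_ok = np.allclose(numerical_gradient(lambda v: v @ X.T @ Y, w),
                      X.T @ Y, atol=1e-5)
# d(Y^T X w)/dw = X^T Y
lin2_ok = np.allclose(numerical_gradient(lambda v: Y @ X @ v, w),
                      X.T @ Y, atol=1e-5)
print(quad_ok, lin1_ok, lin2_ok)
```

The fourth identity needs no check because $\mathrm{Y}^{\mathrm{T}}\mathrm{Y}$ does not depend on $\widehat{\mathrm{w}}$.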


Derivation:
$$
\begin{aligned}
& 2\mathrm{X}^{\mathrm{T}}\mathrm{X}\widehat{\mathrm{w}}-\mathrm{X}^{\mathrm{T}}\mathrm{Y}-\mathrm{X}^{\mathrm{T}}\mathrm{Y}=0 \\
\Longrightarrow\ & 2(\mathrm{X}^{\mathrm{T}}\mathrm{X}\widehat{\mathrm{w}}-\mathrm{X}^{\mathrm{T}}\mathrm{Y})=0 \\
\Longrightarrow\ & \mathrm{X}^{\mathrm{T}}\mathrm{X}\widehat{\mathrm{w}}-\mathrm{X}^{\mathrm{T}}\mathrm{Y}=0 \\
\Longrightarrow\ & (\mathrm{X}^{\mathrm{T}}\mathrm{X})\widehat{\mathrm{w}}=\mathrm{X}^{\mathrm{T}}\mathrm{Y} \\
\Longrightarrow\ & \widehat{\mathrm{w}}=(\mathrm{X}^{\mathrm{T}}\mathrm{X})^{-1}\mathrm{X}^{\mathrm{T}}\mathrm{Y}
\end{aligned}
$$
The last step assumes that $\mathrm{X}^{\mathrm{T}}\mathrm{X}$ is invertible.
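
To illustrate the result, the sketch below solves the normal equation $(\mathrm{X}^{\mathrm{T}}\mathrm{X})\widehat{\mathrm{w}}=\mathrm{X}^{\mathrm{T}}\mathrm{Y}$ on synthetic data, compares it with NumPy's least-squares solver, and confirms that the gradient vanishes at the solution. The data, seed, and variable names are assumptions for illustration; `np.linalg.solve` is swapped in for the explicit inverse, which is a common alternative rather than the method used in this post.

```python
import numpy as np

rng = np.random.default_rng(2)                 # arbitrary seed
x = rng.random(40) * 20
Y = 0.5 * x + rng.standard_normal(40)
X = np.column_stack((x, np.ones_like(x)))      # rows are (x_i, 1)

# Solve (X^T X) w = X^T Y directly instead of forming the inverse.
w_normal = np.linalg.solve(X.T @ X, X.T @ Y)

# Reference: NumPy's least-squares solver.
w_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)

# The gradient 2 X^T X w - 2 X^T Y should vanish at the solution.
gradient = 2 * X.T @ X @ w_normal - 2 * X.T @ Y

print(np.allclose(w_normal, w_lstsq), np.allclose(gradient, 0.0, atol=1e-6))
```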

Code implementation (Python)

import numpy as np
import matplotlib.pyplot as plt
import time


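def get_fake_data(num_samples):
    """Generate noisy samples of y = 0.5 * x and draw them as a scatter plot."""
    X = np.random.rand(num_samples) * 20   # x drawn uniformly from [0, 20)
    noise = np.random.randn(num_samples)   # standard normal noise
    y = 0.5 * X + noise
    plt.scatter(X, y)
    return X, y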


def equation1(X_train, y_train):
    """Fit y = w1 * x + w0 with Formula 1; return (W, elapsed seconds)."""
    start_time = time.time()
    num_instances = X_train.shape[0]
    w1 = np.sum(y_train * (X_train - np.sum(X_train) / num_instances)) / \
         (np.sum(X_train ** 2) - (np.sum(X_train) ** 2 / num_instances))
    w0 = np.sum(y_train) / num_instances - w1 * (np.sum(X_train) / num_instances)
    end_time = time.time()
    W = np.array((w1, w0))  # order (w1, w0) matches the columns (x, 1) used in equation2
    return W, (end_time - start_time)


def equation2(X_train, y_train):
    """Fit with Formula 2 (the normal equation); return (W, elapsed seconds)."""
    start_time = time.time()
    ones = np.ones(X_train.shape[0])
    # Design matrix with rows (x_i, 1), so W = (w1, w0)
    X = np.column_stack((X_train.reshape(X_train.shape[0], 1), ones))
    W = (np.linalg.inv((X.T).dot(X)).dot(X.T)).dot(y_train)
    end_time = time.time()
    return W, (end_time - start_time)


def equation1_test(X, y):
    print("Formula 1 : ")
    numInstances = X.shape[0]
    train_test_split = int(numInstances * 0.7)
    X_train, y_train = X[:train_test_split], y[:train_test_split]
    X_test, y_test = X[train_test_split:], y[train_test_split:]
    W, spend_time = equation1(X_train, y_train)

    # Plot the fitted line over the test inputs
    ones = np.ones(X_test.shape[0])
    X_test_aug = np.column_stack((X_test.reshape(X_test.shape[0], 1), ones))
    y_predict = X_test_aug.dot(W)  # matrix-vector product gives w1 * x + w0
    plt.plot(X_test, y_predict, color='#3479f7')

    # Output
    print("Weights : ", W)
    print("Elapsed time : ", spend_time)


def equation2_test(X, y):
    print("Formula 2 : ")
    numInstances = X.shape[0]
    train_test_split = int(numInstances * 0.7)
    X_train, y_train = X[:train_test_split], y[:train_test_split]
    X_test, y_test = X[train_test_split:], y[train_test_split:]
    W, spend_time = equation2(X_train, y_train)

    # Plot the fitted line over the test inputs
    ones = np.ones(X_test.shape[0])
    X_test_aug = np.column_stack((X_test.reshape(X_test.shape[0], 1), ones))
    y_predict = X_test_aug.dot(W)  # matrix-vector product gives w1 * x + w0
    plt.plot(X_test, y_predict, color='#9b59b6')

    # Output
    print("Weights : ", W)
    print("Elapsed time : ", spend_time)


if __name__ == '__main__':
    X, y = get_fake_data(100)
    equation1_test(X, y)
    equation2_test(X, y)
    plt.show()

Output:
Formula 1 : 
Weights :  [0.50045109 0.06868346]
Elapsed time :  0.0
Formula 2 : 
Weights :  [0.50045109 0.06868346]
Elapsed time :  0.37624287605285645
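
A single `time.time()` measurement can be dominated by timer resolution and one-off setup costs, so the run times printed above may not reflect the steady-state cost of either formula. The sketch below repeats each computation and times it with `time.perf_counter`. The helper `average_seconds`, the repeat count, the seed, and the sample size are assumptions for illustration, and no timing figures are claimed here.

```python
import time
import numpy as np


def average_seconds(fn, repeats=1000):
    """Average wall-clock seconds per call of fn over `repeats` calls (hypothetical helper)."""
    fn()                                   # warm-up call, excluded from the timing
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


rng = np.random.default_rng(3)             # arbitrary seed
x = rng.random(70) * 20
y = 0.5 * x + rng.standard_normal(70)
X = np.column_stack((x, np.ones_like(x)))  # rows are (x_i, 1)


def formula1():
    """Scalar closed form (Formula 1), returning (w1, w0)."""
    n = x.shape[0]
    w1 = np.sum(y * (x - np.sum(x) / n)) / (np.sum(x ** 2) - np.sum(x) ** 2 / n)
    w0 = np.sum(y) / n - w1 * np.sum(x) / n
    return np.array((w1, w0))


def formula2():
    """Normal equation (Formula 2), returning (w1, w0)."""
    return np.linalg.inv(X.T @ X) @ X.T @ y


print("Formula 1:", average_seconds(formula1))
print("Formula 2:", average_seconds(formula2))
```

The warm-up call keeps any first-call overhead out of the average, so the two numbers compare the repeated cost of each formula on the same data.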

