# Machine Learning Example: Linear Least Squares via Gradient Descent


## Introduction

This series of posts aims to provide the mathematical foundations for machine learning (and deep learning). The content is deliberately concise, suited to readers revisiting the material who want to study or look things up quickly.


## Derivation

Let $A'_{m,n-1}$ be the matrix of independent-variable data and $b_{m}$ the vector of dependent values, and set $A_{m,n}=[A',\,1]$, i.e. append a column of ones so the intercept is absorbed into the parameters. The goal is to fit a linear model $\hat{y}=Ax$ that minimizes the error between the data and the fitted hyperplane, so we use gradient descent to find the $x$ minimizing

$$f(x)=\frac{1}{2}\|Ax-b\|_{2}^{2}$$

Here we walk through the differentiation in full once; afterwards the result will simply be stated. First expand the objective:

$$f(x)=\frac{1}{2}\Big[\big((a_{11}x_{1}+\dots+a_{1n}x_{n})-b_{1}\big)^{2}+\dots+\big((a_{m1}x_{1}+\dots+a_{mn}x_{n})-b_{m}\big)^{2}\Big]$$

By vector calculus,

$$\nabla_{x}f(x)=\frac{1}{2}\begin{bmatrix} \frac{\partial\big[((a_{11}x_{1}+\dots+a_{1n}x_{n})-b_{1})^{2}+\dots+((a_{m1}x_{1}+\dots+a_{mn}x_{n})-b_{m})^{2}\big]}{\partial x_{1}} \\ \vdots \\ \frac{\partial\big[((a_{11}x_{1}+\dots+a_{1n}x_{n})-b_{1})^{2}+\dots+((a_{m1}x_{1}+\dots+a_{mn}x_{n})-b_{m})^{2}\big]}{\partial x_{n}} \end{bmatrix}
=\frac{1}{2}\begin{bmatrix} 2a_{11}((a_{11}x_{1}+\dots+a_{1n}x_{n})-b_{1})+\dots+2a_{m1}((a_{m1}x_{1}+\dots+a_{mn}x_{n})-b_{m}) \\ \vdots \\ 2a_{1n}((a_{11}x_{1}+\dots+a_{1n}x_{n})-b_{1})+\dots+2a_{mn}((a_{m1}x_{1}+\dots+a_{mn}x_{n})-b_{m}) \end{bmatrix}
=\frac{1}{2}\cdot 2\begin{bmatrix} a_{11} & \dots & a_{m1} \\ \vdots & & \vdots \\ a_{1n} & \dots & a_{mn} \end{bmatrix}\begin{bmatrix} (a_{11}x_{1}+\dots+a_{1n}x_{n})-b_{1} \\ \vdots \\ (a_{m1}x_{1}+\dots+a_{mn}x_{n})-b_{m} \end{bmatrix}$$

Collecting the matrices gives:

$$\nabla_{x}f(x)=A^{T}(Ax-b)=A^{T}Ax-A^{T}b$$
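The identity $\nabla_{x}f(x)=A^{T}(Ax-b)$ can be sanity-checked numerically by comparing it against a central finite-difference approximation of the objective. A minimal sketch with small random $A$, $b$, $x$ (shapes and names here are illustrative, not from the post):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))  # m = 6 samples, n = 3 parameters
b = rng.standard_normal((6, 1))
x = rng.standard_normal((3, 1))

def f(v):
    # f(x) = 1/2 * ||Ax - b||_2^2
    r = A @ v - b
    return 0.5 * float(r.T @ r)

# analytic gradient: A^T (A x - b)
grad = A.T @ (A @ x - b)

# central finite differences, one coordinate of x at a time
h = 1e-6
numeric = np.zeros_like(x)
for i in range(x.shape[0]):
    xp, xm = x.copy(), x.copy()
    xp[i, 0] += h
    xm[i, 0] -= h
    numeric[i, 0] = (f(xp) - f(xm)) / (2 * h)

print(np.max(np.abs(grad - numeric)))  # close to zero (finite-difference error only)
```

The maximum discrepancy should be on the order of the finite-difference error, far below the magnitude of the gradient entries themselves.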


## Code Implementation

```python
import numpy as np
from random import randint
from functools import reduce
from sklearn.metrics import mean_squared_error


class GLSModel:
    def __init__(self):
        self.x = None

    def fit(self, x, y, e, epochs):
        """
        Fit the training data by gradient-descent updates.
        :param x: independent-variable data
        :param y: dependent-variable data
        :param e: learning rate
        :param epochs: number of iterations
        """
        if x.shape[0] != y.shape[0]:
            raise ValueError("number of samples in x must equal that in y")
        A = np.concatenate((x, np.ones((x.shape[0], 1))), axis=1)
        if self.x is None:
            self.x = np.random.random((A.shape[1], 1))
        for _ in range(epochs):
            # gradient step: x <- x - e * (A^T A x - A^T y)
            self.x -= e * (reduce(np.matmul, (A.T, A, self.x)) - np.matmul(A.T, y))

    def predict(self, x):
        """
        Predict on new data.
        :param x: data to predict on
        :return: predictions
        """
        return np.matmul(np.concatenate((x, np.ones((x.shape[0], 1))), axis=1), self.x)

    def evaluate(self, x, y):
        y_ = self.predict(x)
        return mean_squared_error(y, y_)


train_x, train_y = [], []
for _ in range(1000):
    # 2 independent variables
    train_xi = [randint(-100, 100) for _ in range(2)]
    # dependent variable
    train_yi = 3 * train_xi[0] + 5 * train_xi[1] + 4
    train_x.append(train_xi)
    train_y.append([train_yi])
train_x, train_y = np.array(train_x), np.array(train_y)
model = GLSModel()
model.fit(train_x, train_y, 2e-7, 100)
print(f'mse: {model.evaluate(train_x, train_y)}')
# mse: 9.080036995109756
test_x = [randint(-100, 100) for _ in range(2)]
test_y = 3 * test_x[0] + 5 * test_x[1] + 4
print(f'real: {test_y}, pred: {model.predict(np.array([test_x]))[0][0]}')
# real: 13, pred: 10.01059089304455
```

Readers are encouraged to tune the hyperparameters themselves to obtain a better fit.
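Because the objective is convex and quadratic, the gradient-descent solution can also be compared against the closed-form minimizer of the normal equations $A^{T}Ax=A^{T}b$. A minimal sketch, regenerating the same kind of synthetic data as above (the fixed seed is an addition for reproducibility) and solving with `np.linalg.lstsq`:

```python
import numpy as np
from random import randint, seed

seed(0)  # fixed seed, not part of the original example
train_x = np.array([[randint(-100, 100) for _ in range(2)] for _ in range(1000)])
train_y = 3 * train_x[:, [0]] + 5 * train_x[:, [1]] + 4

# append the bias column, as in GLSModel.fit
A = np.concatenate((train_x, np.ones((train_x.shape[0], 1))), axis=1)
# least-squares solution of A x ≈ y, equivalent to solving A^T A x = A^T y
coef, *_ = np.linalg.lstsq(A, train_y, rcond=None)
print(coef.ravel())  # recovers the generating coefficients [3. 5. 4.]
```

Since this synthetic data is exactly linear, the closed form recovers the generating coefficients essentially to machine precision; gradient descent approaches the same solution as the learning rate and epoch count are tuned.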
