Introduction
This blog series aims to provide the mathematical foundations of machine learning (deep learning). The content is kept deliberately concise, suited to readers revisiting the material who want to study or look things up quickly.
Derivation
Let $A'_{m,n-1}$ be the independent-variable data matrix and $b_{m}$ the dependent-variable vector, and set $A_{m,n}=[A',\mathbf{1}]$, appending a column of ones so the model carries an intercept term. The goal is to find a coefficient vector $x$ whose linear model $Ax$ minimizes the error against $b$, so we use gradient descent to find the $x$ that minimizes

$$f(x)=\frac{1}{2}\lVert Ax-b\rVert_{2}^{2}.$$

This post walks through the differentiation in full once; afterwards the result will simply be quoted. First, expand the objective:

$$f(x)=\frac{1}{2}\Big[\big((a_{11}x_{1}+\dots+a_{1n}x_{n})-b_{1}\big)^{2}+\dots+\big((a_{m1}x_{1}+\dots+a_{mn}x_{n})-b_{m}\big)^{2}\Big].$$

By vector calculus, the gradient stacks the partial derivatives with respect to each component of $x$:

$$\nabla_{x}f(x)=\frac{1}{2}\begin{bmatrix}\frac{\partial\big[((a_{11}x_{1}+\dots+a_{1n}x_{n})-b_{1})^{2}+\dots+((a_{m1}x_{1}+\dots+a_{mn}x_{n})-b_{m})^{2}\big]}{\partial x_{1}}\\ \vdots \\ \frac{\partial\big[((a_{11}x_{1}+\dots+a_{1n}x_{n})-b_{1})^{2}+\dots+((a_{m1}x_{1}+\dots+a_{mn}x_{n})-b_{m})^{2}\big]}{\partial x_{n}}\end{bmatrix}=\frac{1}{2}\begin{bmatrix}2a_{11}\big((a_{11}x_{1}+\dots+a_{1n}x_{n})-b_{1}\big)+\dots+2a_{m1}\big((a_{m1}x_{1}+\dots+a_{mn}x_{n})-b_{m}\big)\\ \vdots \\ 2a_{1n}\big((a_{11}x_{1}+\dots+a_{1n}x_{n})-b_{1}\big)+\dots+2a_{mn}\big((a_{m1}x_{1}+\dots+a_{mn}x_{n})-b_{m}\big)\end{bmatrix}$$

$$=\frac{1}{2}\cdot 2\begin{bmatrix}a_{11} & \dots & a_{m1}\\ \vdots & & \vdots \\ a_{1n} & \dots & a_{mn}\end{bmatrix}\begin{bmatrix}(a_{11}x_{1}+\dots+a_{1n}x_{n})-b_{1}\\ \vdots \\ (a_{m1}x_{1}+\dots+a_{mn}x_{n})-b_{m}\end{bmatrix}.$$

Collecting terms gives

$$\nabla_{x}f(x)=A^{T}(Ax-b)=A^{T}Ax-A^{T}b.$$

Gradient descent then iterates $x \leftarrow x-\eta\,\nabla_{x}f(x)$ with learning rate $\eta$, which is exactly the update the code below implements.
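As a sanity check on the derivation, the analytic gradient can be compared against a finite-difference approximation of $f$. The sketch below is illustrative and not part of the original post; the helper names analytic_grad and numeric_grad are made up for this check, and only NumPy is assumed.

import numpy as np

def analytic_grad(A, x, b):
    # Gradient derived above: ∇f(x) = Aᵀ(Ax − b)
    return A.T @ (A @ x - b)

def numeric_grad(A, x, b, h=1e-6):
    # Central finite differences of f(x) = ½‖Ax − b‖²₂
    f = lambda v: 0.5 * np.sum((A @ v - b) ** 2)
    g = np.zeros_like(x)
    for i in range(x.size):
        d = np.zeros_like(x)
        d[i] = h
        g[i] = (f(x + d) - f(x - d)) / (2 * h)
    return g

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 3))
x = rng.normal(size=3)
b = rng.normal(size=5)
print(np.allclose(analytic_grad(A, x, b), numeric_grad(A, x, b)))  # expected: True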
Code Implementation
import numpy as np
from random import randint
from functools import reduce
from sklearn.metrics import mean_squared_error


class GLSModel:
    def __init__(self):
        self.x = None

    def fit(self, x, y, e, epochs):
        """
        Fit on training data via gradient-descent updates.
        :param x: independent-variable data
        :param y: dependent-variable data
        :param e: learning rate
        :param epochs: number of iterations
        """
        if x.shape[0] != y.shape[0]:
            raise ValueError("x and y must contain the same number of samples")
        # Append a column of ones so the model learns an intercept term
        A = np.concatenate((x, np.ones((x.shape[0], 1))), axis=1)
        if self.x is None:
            self.x = np.random.random((A.shape[1], 1))
        for _ in range(epochs):
            # Gradient step: x ← x − e · (AᵀAx − Aᵀb)
            self.x -= e * (reduce(np.matmul, (A.T, A, self.x)) - np.matmul(A.T, y))

    def predict(self, x):
        """
        Predict on new data.
        :param x: data to predict on
        :return: predictions
        """
        return np.matmul(np.concatenate((x, np.ones((x.shape[0], 1))), axis=1), self.x)

    def evaluate(self, x, y):
        y_ = self.predict(x)
        return mean_squared_error(y, y_)
train_x, train_y = [], []
for _ in range(1000):
    # 2 independent variables
    train_xi = [randint(-100, 100) for _ in range(2)]
    # dependent variable
    train_yi = 3 * train_xi[0] + 5 * train_xi[1] + 4
    train_x.append(train_xi)
    train_y.append([train_yi])
train_x, train_y = np.array(train_x), np.array(train_y)

model = GLSModel()
model.fit(train_x, train_y, 2e-7, 100)
print(f'mse: {model.evaluate(train_x, train_y)}')
# mse: 9.080036995109756

test_x = [randint(-100, 100) for _ in range(2)]
test_y = 3 * test_x[0] + 5 * test_x[1] + 4
print(f'real: {test_y}, pred: {model.predict(np.array([test_x]))[0][0]}')
# real: 13, pred: 10.01059089304455
Readers are encouraged to tune the hyperparameters (learning rate and epoch count) themselves to obtain a better fit.
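For reference, setting the gradient derived above to zero yields the normal equations $A^{T}Ax=A^{T}b$, so this problem also has a closed-form solution. A minimal sketch for cross-checking the gradient-descent coefficients, assuming the train_x and train_y arrays from the script above are still in scope:

# Closed-form least-squares baseline via NumPy
A = np.concatenate((train_x, np.ones((train_x.shape[0], 1))), axis=1)
x_star, *_ = np.linalg.lstsq(A, train_y, rcond=None)
print(x_star.ravel())  # the data was generated with coefficients [3, 5, 4]

Since the training data here is noiseless, the closed-form coefficients should recover the generating values almost exactly, which makes them a useful target when tuning the learning rate and epoch count.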