Basic Algorithms: Linear Regression

架构菜芽

Last modified: 2022-03-06 11:45:04


Category: Machine Learning - Algorithm Summary. Tags: machine learning, linear algebra, algorithms

First published: 2021-05-14 17:47:43

Article link: https://blog.csdn.net/weixin_41175904/article/details/116799742


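1. Overview of Linear Regression

1.1 What Is Regression Analysis

Regression analysis is a predictive modeling technique that studies the relationship between a dependent variable (the target) and one or more independent variables (the predictors). It is commonly used for forecasting, for time series modeling, and for exploring how variables relate to one another (a fitted relationship can suggest a causal link but does not prove one). A line or curve is fitted to the data points so that the total distance between the curve and the points is as small as possible.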

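1.2 Linear Regression

Linear regression is one kind of regression problem. It assumes that the target value depends linearly on the features, that is, the target satisfies a first-degree equation in the features. A loss function is constructed, and the parameters w and b are chosen as the values that minimize it. For a single feature the model is usually written as:

$$\hat{y} = wx + b$$

Here $\hat{y}$ is the predicted value. The independent variable x and the dependent variable y are known for the existing samples, and the goal is to predict the y that corresponds to a new x. To build this functional relationship, we use the known data points to solve for the two parameters w and b of the linear model.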

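1.3 Objective / Loss Function

To find the best parameters we need a criterion for measuring how good a result is. We therefore define a quantitative objective function that the computer can keep improving during the solving process.

Any fitted model ultimately produces a set of predicted values $\hat{y}$. Comparing them with the known true values y over n rows of data, the loss function can be defined as:

$$L = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2$$

This is the average squared distance between the predicted and true values, known in statistics as the mean squared error (MSE). Substituting $\hat{y}_i = wx_i + b$ into the loss function, and treating the parameters w and b as the independent variables of L, gives:

$$L(w, b) = \frac{1}{n}\sum_{i=1}^{n}\left(wx_i + b - y_i\right)^2$$

The task is now to find the values of w and b that minimize L, so the core optimization objective is:

$$(w^*, b^*) = \arg\min_{w,\,b}\ \frac{1}{n}\sum_{i=1}^{n}\left(wx_i + b - y_i\right)^2$$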
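To make the loss concrete, the following minimal sketch evaluates the mean squared error of two candidate lines on the five sample points used later in the code section. The function name `mse_loss` and the two candidate parameter pairs are chosen here purely for illustration.

```python
def mse_loss(w, b, xs, ys):
    """Mean squared error of the line y_hat = w * x + b on the samples (xs, ys)."""
    n = len(xs)
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n


# Sample points reused from the code section of this article.
xs = [4, 8, 5, 10, 12]
ys = [20, 50, 30, 70, 60]

loss_a = mse_loss(0.0, 0.0, xs, ys)  # every prediction is 0
loss_b = mse_loss(5.0, 5.0, xs, ys)  # a line that passes closer to the data
print(loss_a, loss_b)
```

The second line gives a much smaller loss than the first, which is exactly the sense in which one pair (w, b) is "better" than another.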

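2. Linear Regression Formulas

Derivation for simple (one-variable) regression

The goal is to minimize the sum of squared differences between the predicted and true values over all n samples (this is n times the mean squared error, so it has the same minimizer):

$$E(w, b) = \sum_{i=1}^{n}\left(y_i - wx_i - b\right)^2$$

Take the partial derivatives with respect to w and b:

$$\frac{\partial E}{\partial w} = 2\left(w\sum_{i=1}^{n}x_i^2 - \sum_{i=1}^{n}(y_i - b)\,x_i\right)$$

$$\frac{\partial E}{\partial b} = 2\left(nb - \sum_{i=1}^{n}(y_i - wx_i)\right)$$

Setting both derivatives to zero gives the optimal w and b. The value of b follows directly:

$$b = \frac{1}{n}\sum_{i=1}^{n}(y_i - wx_i) = \bar{y} - w\bar{x}$$

For w, setting $\partial E / \partial w = 0$ gives:

$$w\sum_{i=1}^{n}x_i^2 = \sum_{i=1}^{n}x_i y_i - b\sum_{i=1}^{n}x_i$$

Substituting the value of b:

$$w\sum_{i=1}^{n}x_i^2 = \sum_{i=1}^{n}x_i y_i - \bar{y}\sum_{i=1}^{n}x_i + \frac{w}{n}\left(\sum_{i=1}^{n}x_i\right)^2$$

Moving the rightmost (squared) term to the left-hand side, and using $\bar{y}\sum_i x_i = \bar{x}\sum_i y_i$, gives:

$$w\left(\sum_{i=1}^{n}x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n}x_i\right)^2\right) = \sum_{i=1}^{n}y_i\,(x_i - \bar{x})$$

So the value of w is:

$$w = \frac{\sum_{i=1}^{n}y_i\,(x_i - \bar{x})}{\sum_{i=1}^{n}x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n}x_i\right)^2}$$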
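The closed-form solution for w and b can be evaluated directly, without any iteration. The sketch below applies it to the five sample points from the code section; the function name `fit_simple_linear_regression` is an illustrative choice, not something defined elsewhere in the article.

```python
def fit_simple_linear_regression(xs, ys):
    """Closed-form least-squares fit of y = w * x + b for one feature."""
    n = len(xs)
    x_mean = sum(xs) / n
    # w = sum(y_i * (x_i - x_mean)) / (sum(x_i^2) - (sum(x_i))^2 / n)
    numerator = sum(y * (x - x_mean) for x, y in zip(xs, ys))
    denominator = sum(x * x for x in xs) - sum(xs) ** 2 / n
    w = numerator / denominator
    # b = (1 / n) * sum(y_i - w * x_i)
    b = sum(y - w * x for x, y in zip(xs, ys)) / n
    return w, b


xs = [4, 8, 5, 10, 12]
ys = [20, 50, 30, 70, 60]
w, b = fit_simple_linear_regression(xs, ys)
print(w, b)        # roughly 5.714 and 1.429 for this data
print(w * 15 + b)  # prediction for x = 15
```

For this data the exact values are w = 40/7 and b = 10/7, which gives a reference point for judging how close the iterative methods in the code section get.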
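Derivation for multiple regression

Suppose the data follow a linear equation whose true parameters $\theta^*$ are unknown:

$$y = \theta_0^* + \theta_1^* x_1 + \cdots + \theta_d^* x_d$$

We want to find a set of estimates $\theta$ that comes as close as possible to the true $\theta^*$ in the equation above. Because the true $\theta^*$ is unknown, the two cannot be compared directly, so we introduce a loss function to measure how good the estimate $\theta$ is. Dividing by N, the number of samples, keeps the later simplification convenient:

$$J(\theta) = \frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}^{(i)} - y^{(i)}\right)^2$$

If we set $x_0 = 1$ and write $\theta = (\theta_0, \theta_1, \ldots, \theta_d)^T$ and $x = (x_0, x_1, \ldots, x_d)^T$, the prediction is $\hat{y} = \theta^T x$, so the loss function can be rewritten as:

$$J(\theta) = \frac{1}{N}\sum_{i=1}^{N}\left(\theta^T x^{(i)} - y^{(i)}\right)^2$$

By the chain rule, the partial derivative of J with respect to $\theta_j$ is:

$$\frac{\partial J}{\partial \theta_j} = \frac{1}{N}\sum_{i=1}^{N} 2\left(\theta^T x^{(i)} - y^{(i)}\right)\frac{\partial}{\partial \theta_j}\left(\theta^T x^{(i)}\right)$$

Since $\partial(\theta^T x^{(i)}) / \partial\theta_j = x_j^{(i)}$, this becomes:

$$\frac{\partial J}{\partial \theta_j} = \frac{2}{N}\sum_{i=1}^{N}\left(\theta^T x^{(i)} - y^{(i)}\right)x_j^{(i)}$$

Computing all of the partial derivatives in this way gives the gradient at that point:

$$\nabla J(\theta) = \left(\frac{\partial J}{\partial \theta_0}, \frac{\partial J}{\partial \theta_1}, \ldots, \frac{\partial J}{\partial \theta_d}\right)^T$$

If the current estimate is $\theta^{(t)}$ and the learning rate is η, the estimate at the next iteration is:

$$\theta^{(t+1)} = \theta^{(t)} - \eta\,\nabla J\left(\theta^{(t)}\right)$$

Some texts divide by 2N instead of N, so that the factor 2 produced by differentiation cancels and the coefficient in front of the sum simplifies to 1/N.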
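The update rule can be written compactly with NumPy. The following is a minimal sketch that uses the 1/(2N) convention mentioned above, so the gradient carries no factor 2. The function name `gradient_descent`, the step size, the iteration count, and the small noise-free data set are all assumptions made for this illustration.

```python
import numpy as np


def gradient_descent(X, y, eta=0.01, n_iters=20000):
    """Batch gradient descent for J(theta) = 1/(2N) * sum((X @ theta - y) ** 2)."""
    N, d = X.shape
    theta = np.zeros(d)
    for _ in range(n_iters):
        grad = X.T @ (X @ theta - y) / N  # gradient of J at the current theta
        theta = theta - eta * grad        # step against the gradient
    return theta


# Made-up, noise-free data generated from y = 1 + 2 * x1 - 3 * x2.
# The first column of ones plays the role of x0 = 1.
X = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [1.0, 3.0, 2.0],
              [1.0, 4.0, 0.0]])
true_theta = np.array([1.0, 2.0, -3.0])
y = X @ true_theta

theta_hat = gradient_descent(X, y)
print(theta_hat)
```

Because the data contain no noise, the estimate should approach the parameters used to generate them. On real data the step size and number of iterations would need tuning.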

3. Linear Regression Code

Simple linear regression

"""
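Batch gradient descent (one variable)
y = theta0 + theta1 * X
"""

# Samples
X = [4, 8, 5, 10, 12]
y = [20, 50, 30, 70, 60]
# Initial parameters
theta0 = theta1 = 0
# Learning rate (step size)
alpha = 0.0001
# Iteration counter
cnt = 0
# Mean squared error of the previous iteration
error0 = 0
# Stop when the error changes by less than this threshold between iterations
threshold = 0.0000001
# Number of samples
m = len(X)
while True:
    # diff[0] and diff[1] accumulate the negative gradient for theta0 and theta1
    diff = [0, 0]
    for i in range(m):
        diff[0] += y[i] - (theta0 + theta1 * X[i])
        diff[1] += (y[i] - (theta0 + theta1 * X[i])) * X[i]
    theta0 = theta0 + alpha / m * diff[0]
    theta1 = theta1 + alpha / m * diff[1]

    # Mean squared error; reset on every iteration so earlier values do not leak in
    error1 = 0
    for i in range(m):
        error1 += (y[i] - (theta0 + theta1 * X[i])) ** 2
    error1 /= m
    if abs(error1 - error0) < threshold:
        break
    else:
        error0 = error1
    cnt += 1
print(theta0, theta1, cnt)


def predict(theta0, theta1, x_test):
    return theta0 + theta1 * x_test


print(predict(theta0, theta1, 15))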


"""
Stochastic gradient descent (one variable)
y = theta0 + theta1 * X
"""

# Samples
X = [4, 8, 5, 10, 12]
y = [20, 50, 30, 70, 60]
# Initial parameters
theta0 = theta1 = 0
# Learning rate (step size)
alpha = 0.0001
# Counter of passes over the data
cnt = 0
# Mean squared error of the previous pass
error0 = 0
# Stop when the error changes by less than this threshold between passes
threshold = 0.0000001
# Number of samples
m = len(X)
while True:
    # Update the parameters after every single sample.
    # For simplicity the samples are visited in a fixed order instead of being shuffled.
    for i in range(m):
        residual = y[i] - (theta0 + theta1 * X[i])
        theta0 = theta0 + alpha * residual
        theta1 = theta1 + alpha * residual * X[i]

    # Mean squared error; reset on every pass so earlier values do not leak in
    error1 = 0
    for i in range(m):
        error1 += (y[i] - (theta0 + theta1 * X[i])) ** 2
    error1 /= m
    if abs(error1 - error0) < threshold:
        break
    else:
        error0 = error1
    cnt += 1
print(theta0, theta1, cnt)


def predict(theta0, theta1, x_test):
    return theta0 + theta1 * x_test


print(predict(theta0, theta1, 15))

"""
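Mini-batch gradient descent (one variable)
y = theta0 + theta1 * X
"""
# Samples
X = [4, 8, 5, 10, 12]
y = [20, 50, 30, 70, 60]
# Initial parameters
theta0 = theta1 = 0
# Learning rate (step size)
alpha = 0.0001
# Counter of passes over the data
cnt = 0
# Mean squared error of the previous pass
error0 = 0
# Stop when the error changes by less than this threshold between passes
threshold = 0.0000001
# Number of samples per mini-batch
batch_size = 2
# Number of samples
m = len(X)
while True:
    # Walk through the samples in consecutive batches and update once per batch
    for start in range(0, m, batch_size):
        batch = range(start, min(start + batch_size, m))
        # diff[0] and diff[1] accumulate the negative gradient over the batch
        diff = [0, 0]
        for i in batch:
            diff[0] += y[i] - (theta0 + theta1 * X[i])
            diff[1] += (y[i] - (theta0 + theta1 * X[i])) * X[i]
        theta0 = theta0 + alpha / len(batch) * diff[0]
        theta1 = theta1 + alpha / len(batch) * diff[1]

    # Mean squared error; reset on every pass so earlier values do not leak in
    error1 = 0
    for i in range(m):
        error1 += (y[i] - (theta0 + theta1 * X[i])) ** 2
    error1 /= m
    if abs(error1 - error0) < threshold:
        break
    else:
        error0 = error1
    cnt += 1
print(theta0, theta1, cnt)


def predict(theta0, theta1, x_test):
    return theta0 + theta1 * x_test


print(predict(theta0, theta1, 15))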

Multiple linear regression

"""
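Batch gradient descent (multiple variables)
y = theta0 + theta1 * x1 + theta2 * x2
"""

# Samples; the leading 1 in each row is the constant feature that multiplies theta0
X = [[1, 0, 3], [1, 1, 3], [1, 2, 3], [1, 3, 2], [1, 4, 4]]
y = [95.364, 97.217205, 75.195834, 60.105519, 49.342380]
# Initial parameters
theta0 = theta1 = theta2 = 0
# Learning rate (step size)
alpha = 0.0001
# Iteration counter
cnt = 0
# Mean squared error of the previous iteration
error0 = 0
# Stop when the error changes by less than this threshold between iterations
threshold = 0.0000001
# Number of samples
m = len(X)

while True:
    # diff[j] accumulates the negative gradient for theta_j
    diff = [0, 0, 0]
    for i in range(m):
        # The residual must be fully parenthesized before multiplying by the feature
        residual = y[i] - (theta0 + theta1 * X[i][1] + theta2 * X[i][2])
        diff[0] += residual * X[i][0]
        diff[1] += residual * X[i][1]
        diff[2] += residual * X[i][2]
    theta0 = theta0 + alpha / m * diff[0]
    theta1 = theta1 + alpha / m * diff[1]
    theta2 = theta2 + alpha / m * diff[2]

    # Mean squared error; reset on every iteration so earlier values do not leak in
    error1 = 0
    for i in range(m):
        error1 += (y[i] - (theta0 + theta1 * X[i][1] + theta2 * X[i][2])) ** 2
    error1 /= m
    if abs(error1 - error0) < threshold:
        break
    else:
        error0 = error1
    cnt += 1
print(theta0, theta1, theta2, cnt)

# def predict(theta0, theta1, theta2, x1, x2):
#     return theta0 + theta1 * x1 + theta2 * x2
# print(predict(theta0, theta1, theta2, 5, 3))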
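Because the gradient descent loop above uses a small step size and stops as soon as the error barely changes, it may halt before reaching the exact minimum. As a reference point, the least-squares solution for the same data can be computed directly. This sketch uses `numpy.linalg.lstsq`, which is a substitute technique and not the method used in the article's own code.

```python
import numpy as np

# Same samples as in the gradient descent code; the first column is the constant 1.
X = np.array([[1, 0, 3], [1, 1, 3], [1, 2, 3], [1, 3, 2], [1, 4, 4]], dtype=float)
y = np.array([95.364, 97.217205, 75.195834, 60.105519, 49.342380])

# Least-squares solution of X @ theta ~= y.
theta, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)
print(theta)

# Mean squared error at the least-squares solution, for comparison with
# the error reached by the iterative code.
mse = np.mean((X @ theta - y) ** 2)
print(mse)
```

At the least-squares solution the gradient `X.T @ (X @ theta - y)` is zero up to rounding, so no gradient descent run on this data can reach a lower mean squared error.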

Logistic regression

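# P(y = 1 | x) = sigmoid(theta0 + theta1 * x1 + theta2 * x2)

import numpy as np


# Sigmoid function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))


# Predict class 1 when the estimated probability exceeds 0.5
def predict(x_test, weights):
    if sigmoid(weights.T @ x_test) > 0.5:
        return 1
    else:
        return 0


def weights(x_train, y_train):
    # m samples, n parameters (the first column of x_train is the constant 1)
    m, n = x_train.shape
    # Random initial theta
    theta = np.random.rand(n)
    # Learning rate
    alpha = 0.001

    # Counter of passes over the data
    cnt = 0
    # Maximum number of passes
    max_iter = 50000

    # Stop once every component of the mean update direction is below this value
    threshold = 0.01

    while cnt < max_iter:
        cnt += 1
        diff_sum = np.zeros(n)
        for i in range(m):
            # Per-sample update (stochastic gradient step)
            diff = (y_train[i] - sigmoid(theta.T @ x_train[i])) * x_train[i]
            theta = theta + alpha * diff
            diff_sum += diff
        # Judge convergence on the whole pass, not only on the last sample
        if (abs(diff_sum / m) < threshold).all():
            break
    print(cnt)
    return theta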


if __name__ == "__main__":
    x_train = np.array([[1, 2.697, 6.254],
                        [1, 1.872, 2.014],
                        [1, 2.312, 0.812],
                        [1, 1.983, 4.990],
                        [1, 0.932, 3.920],
                        [1, 1.321, 5.583],
                        [1, 2.215, 1.560],
                        [1, 1.659, 2.932],
                        [1, 0.865, 7.362],
                        [1, 1.685, 4.763],
                        [1, 1.786, 2.523]
                        ])

    # y_train[i] is the class label of sample x_train[i]
    y_train = np.array([1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1])
    weights_value = weights(x_train, y_train)
    print(weights_value)
    for i in range(len(y_train)):
        print(predict(x_train[i], weights_value))
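The update `theta + alpha * (y - sigmoid(theta @ x)) * x` used in the training loop is a step along the negative gradient of the cross-entropy (log) loss for one sample. The sketch below states that loss and its gradient explicitly and checks the analytic gradient against finite differences. The function names `log_loss` and `log_loss_gradient`, the tiny data set, and the test point `theta` are made up for this illustration.

```python
import numpy as np


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def log_loss(theta, X, y):
    """Average cross-entropy of the logistic model on (X, y)."""
    p = sigmoid(X @ theta)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))


def log_loss_gradient(theta, X, y):
    """Analytic gradient of log_loss with respect to theta."""
    return X.T @ (sigmoid(X @ theta) - y) / len(y)


# Tiny made-up data set; the first column of ones is the bias feature.
X = np.array([[1.0, 0.5, 1.0],
              [1.0, 1.5, 0.2],
              [1.0, 2.0, 2.5],
              [1.0, 0.1, 3.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
theta = np.array([0.1, -0.2, 0.3])

# Central finite differences as an independent check of the analytic gradient.
eps = 1e-6
numeric = np.array([
    (log_loss(theta + eps * e, X, y) - log_loss(theta - eps * e, X, y)) / (2 * eps)
    for e in np.eye(3)
])
analytic = log_loss_gradient(theta, X, y)
print(analytic)
print(numeric)
```

The two printed vectors should agree to several decimal places. Tracking `log_loss` over training passes is also a more direct convergence signal than the size of an individual update.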