Gradient Descent and Linear Regression: A Comprehensive Guide

Jr_l

Last modified 2024-07-22 13:21:57


First published 2024-07-22 13:19:42

Article link: https://blog.csdn.net/LS_Ai/article/details/140606876


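Introduction

Gradient descent is a widely used optimization algorithm for minimizing the cost function in many machine learning methods, including linear regression. This article gives an introductory explanation of gradient descent and its basic principle, then demonstrates it with a practical example that fits a linear model to data in Python.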

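How Gradient Descent Works

Gradient descent searches for a minimum of a function by repeatedly moving in the direction of steepest descent, that is, opposite to the gradient. At each iteration the algorithm updates the parameters so that the value of the cost function decreases.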

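Key Concepts

Learning rate (α): determines the step size of each iteration. A learning rate that is too small makes convergence slow; one that is too large can make the algorithm overshoot the minimum or even diverge.
Cost function (J): measures the difference between the model's predictions and the actual data. In linear regression, the cost function is usually the mean squared error (MSE).
Gradient (∇J): the vector of partial derivatives of the cost function with respect to each parameter. It points in the direction in which the function increases fastest.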
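
The effect of the learning rate is easy to see on a toy problem. The sketch below is illustrative only: the function f(x) = x**2, the helper name `run`, and the three step sizes are assumptions chosen for the demonstration, not part of the article's dataset.

```python
def run(step_size, steps=20, x0=1.0):
    """Run gradient descent on f(x) = x**2 and return the final x."""
    x = x0
    for _ in range(steps):
        # The gradient of x**2 is 2x; step against it
        x = x - step_size * 2.0 * x
    return x

too_small = run(0.001)   # barely moves away from the starting point
reasonable = run(0.1)    # shrinks steadily toward the minimum at 0
too_large = run(1.1)     # every step overshoots, so |x| grows

print(too_small, reasonable, too_large)
```

On this function each update multiplies x by (1 - 2 * step_size), so the run only converges when that factor has absolute value below 1.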

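Core Formula

The parameter update rule in gradient descent is:

$$\theta_{i+1} = \theta_i - \alpha \nabla J(\theta_i)$$

where:

$\theta_i$ is the parameter vector at iteration $i$,
$\alpha$ is the learning rate,
$\nabla J(\theta_i)$ is the gradient of the cost function at $\theta_i$.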
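
For the linear regression examples in this article, the cost is the mean squared error, and its gradient can be written out explicitly. The lines below are the standard result, included to connect the update rule to the expression `2/m * X.T.dot(X.dot(theta) - y)` used in the code. As notation assumed here, $X$ is the $m \times n$ design matrix (with a leading column of ones), $y$ is the $m \times 1$ target vector, and $k$ indexes samples so that it does not clash with the iteration index $i$.

```latex
J(\theta) = \frac{1}{m} \sum_{k=1}^{m} \left( x_k^\top \theta - y_k \right)^2
          = \frac{1}{m} \lVert X\theta - y \rVert^2

\nabla J(\theta) = \frac{2}{m} X^\top (X\theta - y)

\theta_{i+1} = \theta_i - \frac{2\alpha}{m} X^\top (X\theta_i - y)
```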

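Python Example: Fitting a Linear Model

Let us demonstrate gradient descent in Python by fitting a linear model to a dataset. We will generate a random linear dataset, apply gradient descent, and visualize the results.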

Step 1: Generate random data

import numpy as np
import matplotlib.pyplot as plt

# Fix the random seed so the results are reproducible
np.random.seed(0)

# Generate random data: y = 4 + 3x plus Gaussian noise
X = 2 * np.random.rand(1000, 1)
y = 4 + 3 * X + np.random.randn(1000, 1)

# Plot the original data
plt.figure(figsize=(10, 6))
plt.scatter(X, y, c='b', label='Original data')
plt.xlabel('X')
plt.ylabel('y')
plt.title('Linear regression with gradient descent')
plt.legend()
plt.grid(True)
plt.show()

Step 2: Add the bias term to the data

# Add the bias term (x0 = 1) as the first column of the dataset
X_b = np.c_[np.ones((len(X), 1)), X]

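Step 3: Define the gradient descent function

def gradient_descent(X, y, theta, learning_rate=0.01, iterations=100):
    """Minimize the MSE cost with batch gradient descent.

    Returns the final theta and a history dict with the cost after each update.
    """
    m = len(y)
    history = {'cost': []}

    for _ in range(iterations):
        # Gradient of the MSE cost with respect to theta
        gradients = 2/m * X.T.dot(X.dot(theta) - y)
        theta = theta - learning_rate * gradients
        # Record the cost at the updated parameters
        cost = np.mean((X.dot(theta) - y) ** 2)
        history['cost'].append(cost)

    return theta, history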
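
One way to sanity-check the gradient expression used in the function above is to compare it with a finite-difference estimate. The following is a minimal sketch under stated assumptions: the small dataset, the helper `mse`, and all variable names ending in `_check` are hypothetical and exist only for this check.

```python
import numpy as np

rng = np.random.default_rng(0)
# Small design matrix with a bias column, plus noisy linear targets
X_check = np.c_[np.ones((50, 1)), rng.random((50, 1))]
y_check = 4 + 3 * X_check[:, [1]] + rng.normal(size=(50, 1))
theta_check = rng.normal(size=(2, 1))

def mse(theta):
    """Mean squared error of the linear model on the check data."""
    return np.mean((X_check.dot(theta) - y_check) ** 2)

# Analytic gradient: the same expression as in gradient_descent
analytic = 2 / len(y_check) * X_check.T.dot(X_check.dot(theta_check) - y_check)

# Central finite differences, one parameter at a time
eps = 1e-6
numeric = np.zeros_like(theta_check)
for j in range(theta_check.shape[0]):
    shift = np.zeros_like(theta_check)
    shift[j] = eps
    numeric[j] = (mse(theta_check + shift) - mse(theta_check - shift)) / (2 * eps)

# Largest disagreement between the two estimates
print(np.max(np.abs(analytic - numeric)))
```

If the two estimates disagree by more than rounding error, the analytic gradient (or the cost it is supposed to differentiate) likely contains a mistake.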

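Step 4: Initialize the parameters and run gradient descent

# Initialize the parameter vector theta (bias and slope) at random
theta_initial = np.random.randn(2, 1)

# Run gradient descent
theta_best, history = gradient_descent(X_b, y, theta_initial, learning_rate=0.1, iterations=1000)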
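
Because linear regression also has a closed-form least-squares solution, it can serve as a reference for the gradient descent result. The sketch below regenerates the same kind of data so that it runs on its own, and writes the update inline rather than calling `gradient_descent`; the names `theta_exact` and `theta` are introduced here for illustration.

```python
import numpy as np

np.random.seed(0)
X = 2 * np.random.rand(1000, 1)
y = 4 + 3 * X + np.random.randn(1000, 1)
X_b = np.c_[np.ones((len(X), 1)), X]

# Closed-form least-squares solution, used only as a reference
theta_exact, *_ = np.linalg.lstsq(X_b, y, rcond=None)

# The same batch gradient descent update, written inline
theta = np.random.randn(2, 1)
for _ in range(1000):
    theta = theta - 0.1 * (2 / len(y)) * X_b.T.dot(X_b.dot(theta) - y)

print("least squares:   ", theta_exact.ravel())
print("gradient descent:", theta.ravel())
```

With this learning rate and iteration count the two parameter vectors should agree closely, and both should land near the generating values 4 and 3, up to the noise in the data.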

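Step 5: Plot the cost against the number of iterations

# Plot how the cost changes over the iterations
plt.figure(figsize=(10, 6))
plt.plot(history['cost'], c='r')
plt.xlabel('Iteration')
plt.ylabel('Cost')
plt.title('Cost function versus number of iterations')
plt.grid(True)
plt.show()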

Step 6: Plot the fitted line and the original data

# Plot the fitted line together with the original data
plt.figure(figsize=(10, 6))
plt.scatter(X, y, c='b', label='Original data')
plt.plot(X, X_b.dot(theta_best), c='r', label='Fitted line')
plt.xlabel('X')
plt.ylabel('y')
plt.title('Linear regression with fitted line')
plt.legend()
plt.grid(True)
plt.show()

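Case Study: Predicting House Prices

We will use a simulated house price dataset with several features that affect the price, such as house area, number of rooms, and building age. We will fit a linear regression model with gradient descent to predict house prices.

Case description

Suppose we have a dataset that records the features and prices of houses in some city. The features are:

House area (square feet; the simulated values range from 1.5 to 4.0, so they are best read as a scaled unit such as thousands of square feet)
Number of rooms
Building age (years)

The goal is to predict the price of a house from these features.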

Data generation

First, we generate a simulated dataset:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler

# Fix the random seed so the results are reproducible
np.random.seed(42)

# Generate random data
m = 1000  # number of samples
X1 = 2.5 * np.random.rand(m, 1) + 1.5  # house area (between 1.5 and 4.0)
X2 = np.random.randint(1, 5, size=(m, 1))  # number of rooms (1 to 4)
X3 = np.random.randint(1, 100, size=(m, 1))  # building age in years (1 to 99; the upper bound is exclusive)

# Assume the true price formula is: price = 10000 + 50000 * area + 30000 * rooms - 200 * age + random noise
true_theta = np.array([[10000], [50000], [30000], [-200]])  # true parameters on the unscaled features, bias first
y = 10000 + 50000 * X1 + 30000 * X2 - 200 * X3 + np.random.randn(m, 1) * 10000

# Combine all features into one matrix
X = np.c_[X1, X2, X3]

# Feature scaling (zero mean, unit variance per column)
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Add the bias term x0 = 1 to the data
X_b = np.c_[np.ones((m, 1)), X_scaled]

# Print the first 5 rows
print("First 5 rows of features:")
print(X_b[:5])
print("First 5 prices:")
print(y[:5])
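
The three features live on very different ranges (roughly 1.5 to 4, 1 to 4, and 1 to 99), which tends to make a single learning rate work poorly, so the data is standardized before gradient descent. The sketch below shows what that standardization amounts to using NumPy alone. It is an illustration under assumptions: the array `features` is a hypothetical stand-in for `X`, and the use of the population standard deviation (ddof=0) reflects what `StandardScaler` does by default as far as its documentation describes.

```python
import numpy as np

rng = np.random.default_rng(42)
features = np.c_[
    2.5 * rng.random((1000, 1)) + 1.5,       # area-like column
    rng.integers(1, 5, size=(1000, 1)),      # room-count-like column
    rng.integers(1, 100, size=(1000, 1)),    # age-like column
]

col_mean = features.mean(axis=0)
col_std = features.std(axis=0)               # population std (ddof=0)

# Subtract each column's mean and divide by its standard deviation
features_scaled = (features - col_mean) / col_std

print(features_scaled.mean(axis=0))          # each entry close to 0
print(features_scaled.std(axis=0))           # each entry close to 1
```

The same `col_mean` and `col_std` must be reused on any new sample at prediction time, which is the role `scaler.transform` plays in the article's code.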

Defining the gradient descent function

def gradient_descent_multi(X, y, theta, learning_rate=0.001, iterations=1000):
    """Batch gradient descent on the MSE cost for several features."""
    m = len(y)
    history = {'cost': []}

    for iteration in range(iterations):
        # Gradient of the MSE cost with respect to theta
        gradients = 2/m * X.T.dot(X.dot(theta) - y)
        theta = theta - learning_rate * gradients
        cost = np.mean((X.dot(theta) - y) ** 2)
        history['cost'].append(cost)

        # Print the cost every 100 iterations
        if iteration % 100 == 0:
            print(f"Iteration: {iteration}, cost: {cost:.2f}")

    return theta, history

Initializing the parameters and running gradient descent

# Initialize the parameter vector theta at random
theta_initial = np.random.randn(4, 1)  # includes the bias term

# Run gradient descent
learning_rate = 0.001
iterations = 2000
theta_best, history = gradient_descent_multi(X_b, y, theta_initial, learning_rate, iterations)

Plotting the cost against the number of iterations

# Plot how the cost changes over the iterations
plt.figure(figsize=(10, 6))
plt.plot(history['cost'], c='r')
plt.xlabel('The number of iterations')
plt.ylabel('cost')
plt.title('The change in the cost function with the number of iterations')
plt.grid(True)
plt.show()

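Comparing the true and fitted parameters

# Note: theta_best was fitted on standardized features, so apart from sign and
# relative size its entries are not on the same scale as true_theta.
print("True parameters (original feature scale):\n", true_theta)
print("Fitted parameters (standardized feature scale):\n", theta_best)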
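
Since the model is fitted on standardized features, its parameters have to be mapped back to the original feature scale before they can be compared with the generating formula. The sketch below shows one way to do that. It is NumPy-only so that it runs on its own, and it rests on several assumptions: `X_raw`, `y_raw`, `mu`, and `sigma` are hypothetical names, the data is regenerated with a different random generator than the article uses, and a least-squares fit stands in for `theta_best`. With the article's `scaler`, the attributes `scaler.mean_` and `scaler.scale_` would play the roles of `mu` and `sigma`.

```python
import numpy as np

rng = np.random.default_rng(42)
m = 1000
X_raw = np.c_[
    2.5 * rng.random((m, 1)) + 1.5,          # area
    rng.integers(1, 5, size=(m, 1)),         # rooms
    rng.integers(1, 100, size=(m, 1)),       # age
]
y_raw = (10000 + 50000 * X_raw[:, [0]] + 30000 * X_raw[:, [1]]
         - 200 * X_raw[:, [2]] + rng.normal(size=(m, 1)) * 10000)

# Standardize the features and add the bias column
mu, sigma = X_raw.mean(axis=0), X_raw.std(axis=0)
X_std_b = np.c_[np.ones((m, 1)), (X_raw - mu) / sigma]

# Stand-in for theta_best: least-squares fit in the standardized space
theta_scaled, *_ = np.linalg.lstsq(X_std_b, y_raw, rcond=None)

# Undo the scaling: slopes divide by sigma, the intercept absorbs the mean shift
slopes = theta_scaled[1:, 0] / sigma
intercept = theta_scaled[0, 0] - np.sum(slopes * mu)

print("intercept:", intercept)
print("slopes:   ", slopes)
```

The conversion follows from substituting (x - mu) / sigma into the fitted model and collecting terms. The recovered values only approximate the generating parameters, both because of the noise and, for gradient descent, because the fit may not have fully converged.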

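Predicting house prices

We can use the fitted parameters to predict the price of a new house:

# New house features: area = 2.0 (same units as X1), rooms = 3, building age = 20 years
X_new = np.array([[2.0, 3, 20]])
# Apply the same feature scaling that was fitted on the training data
X_new_scaled = scaler.transform(X_new)
# Add the bias term x0 = 1
X_new_b = np.c_[np.ones((1, 1)), X_new_scaled]
y_predict = X_new_b.dot(theta_best)
print("Predicted house price:", y_predict)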