Getting Started with Machine Learning | 5. Gradient Descent


小哲嗨数

Category: Machine Learning. Tags: machine learning, python

Article link: https://blog.csdn.net/ganzheyu/article/details/105046124


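This article walks through gradient descent in machine learning: minimizing a quadratic function with gradient descent, implementing it for a linear regression model (loop-based and vectorized), the batch, stochastic, and mini-batch variants and how they compare, and how to debug a gradient. It also discusses the trade-offs between gradient descent and the normal equation solution.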


import numpy as np
import matplotlib.pyplot as plt

%matplotlib
%matplotlib inline
%config InlineBackend.figure_format = 'retina'

Using matplotlib backend: MacOSX

Minimizing a quadratic function with gradient descent

$y = (x-2.5)^2 -1$

x = np.linspace(-1,6,200)
y = (x - 2.5)**2 - 1

plt.plot(x, y)
plt.show()

[Figure: plot of y = (x - 2.5)^2 - 1 for x in [-1, 6] (output_2_0.png)]

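def j(theta):
    """Quadratic objective: (theta - 2.5)^2 - 1."""
    try:
        return (theta - 2.5) ** 2 - 1
    except OverflowError:
        # A Python float raised to a power can overflow when theta diverges
        return float('inf')

def dj(theta):
    """Derivative of j with respect to theta."""
    return 2 * (theta - 2.5)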
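
Before relying on a hand-derived derivative, it can help to compare it against a numerical estimate. The snippet below is a minimal sketch of that check using a central difference. The helper name `dj_numeric` and the step size `h` are assumptions introduced for illustration, and `j` is redefined here without the overflow guard so the block runs on its own.

```python
def j(theta):
    """Quadratic objective from the text: (theta - 2.5)^2 - 1."""
    return (theta - 2.5) ** 2 - 1


def dj(theta):
    """Analytic derivative of j."""
    return 2 * (theta - 2.5)


def dj_numeric(theta, h=1e-6):
    """Central-difference estimate of the derivative (hypothetical helper)."""
    return (j(theta + h) - j(theta - h)) / (2 * h)


# Compare the two derivatives at a few points across the plotted range
for theta in (-1.0, 0.0, 2.5, 6.0):
    print(theta, dj(theta), dj_numeric(theta))
```

If the two columns disagree by more than rounding error, the analytic derivative is the first place to look.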

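def gradient_descent(theta=0.0, eta=0.01, epsilon=1e-8, max_iters=10000):
    """
    Minimize j by gradient descent, then print the result and plot the path.

    theta: initial parameter value
    eta: learning rate
    epsilon: convergence tolerance; stop once j changes by less than this between steps
    max_iters: maximum number of iterations
    """
    theta_history = [theta]
    while max_iters > 0:
        gradient = dj(theta)
        last_theta = theta
        theta = theta - eta * gradient
        theta_history.append(theta)

        # Stop once the objective has effectively stopped changing
        if abs(j(theta) - j(last_theta)) < epsilon:
            break
        max_iters -= 1
    print('theta: ', theta)
    print('min j(theta): ', j(theta))
    print('theta_history length: ', len(theta_history))
    # Draw the curve (global x, y from above) and the descent path in red
    plt.plot(x, y)
    plt.plot(np.array(theta_history), j(np.array(theta_history)), color="r", marker='+')
    plt.show()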

# Use the default parameters
gradient_descent()

theta:  2.4995140741236224
min j(theta):  -0.9999997638760426
theta_history length:  424

[Figure: descent path for the default parameters (eta=0.01) plotted on the curve]

# With a small learning rate eta, each step is tiny, so the loop needs many iterations to approach the minimum. A common default is 0.01.
gradient_descent(eta=0.001)

theta:  2.4984243400819484
min j(theta):  -0.9999975172958226
theta_history length:  3682

[Figure: descent path for eta=0.001 plotted on the curve (output_5_1.png)]

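# With a larger eta, each step overshoots the minimum: theta jumps to the right side of the curve, then back and forth across it while still converging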
gradient_descent(eta=0.8)

theta:  2.500054842376601
min j(theta):  -0.9999999969923137
theta_history length:  22

[Figure: descent path for eta=0.8 plotted on the curve]

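# With a very large eta such as 1.5, every step lands farther from the minimum and the stopping condition is never met, so the number of iterations is capped at 500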
gradient_descent(eta=1.5, max_iters=500)

theta:  -8.183476519740352e+150
min j(theta):  6.696928794914166e+301
theta_history length:  501

[Figure: diverging path for eta=1.5 (output_7_1.png)]
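
The four runs above can be explained by one observation. Substituting the derivative 2(theta - 2.5) into the update gives theta_new - 2.5 = (1 - 2*eta) * (theta - 2.5), so each step multiplies the distance to the minimum by the factor 1 - 2*eta. The sketch below checks this closed form against a plain iteration. The function names `closed_form_theta` and `iterate` are assumptions introduced for illustration, and the analysis applies only to this particular quadratic.

```python
def closed_form_theta(k, eta, theta0=0.0, minimum=2.5):
    """theta after k updates, from the per-step factor (1 - 2 * eta)."""
    return minimum + (1 - 2 * eta) ** k * (theta0 - minimum)


def iterate(k, eta, theta0=0.0):
    """Run k plain gradient-descent updates on j(theta) = (theta - 2.5)^2 - 1."""
    theta = theta0
    for _ in range(k):
        theta = theta - eta * 2 * (theta - 2.5)
    return theta


# |1 - 2 * eta| < 1 means the distance shrinks each step; a negative factor
# means theta crosses to the other side of the minimum on every update.
for eta in (0.001, 0.01, 0.8, 1.5):
    factor = 1 - 2 * eta
    print(eta, factor, "converges" if abs(factor) < 1 else "diverges")

# theta_history lengths of 22 and 501 above correspond to 21 and 500 updates
print(closed_form_theta(21, 0.8), iterate(21, 0.8))
print(closed_form_theta(500, 1.5), iterate(500, 1.5))
```

For this function the iteration converges only for 0 < eta < 1. With eta = 0.8 the factor is -0.6, which matches the back-and-forth path, and with eta = 1.5 the factor is -2, so the distance doubles on every step.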

Gradient descent for a linear regression model

Loss function:

$J(\theta) = \frac{1}{m} \sum_{i=1}^{m}\left(y^{(i)}-\hat{y}^{(i)}\right)^{2}$

$\nabla J(\boldsymbol{\theta})=\left(\begin{array}{c} \partial J / \partial \theta_{0} \\ \partial J / \partial \theta_{1} \\ \partial J / \partial \theta_{2} \\ \cdots \\ \partial J / \partial \theta_{n} \end{array}\right)=\frac{2}{m} \cdot\left(\begin{array}{c} \sum_{i=1}^{m}\left(X_{b}^{(i)} \theta-y^{(i)}\right) \\ \sum_{i=1}^{m}\left(X_{b}^{(i)} \theta-y^{(i)}\right) \cdot X_{1}^{(i)} \\ \sum_{i=1}^{m}\left(X_{b}^{(i)} \theta-y^{(i)}\right) \cdot X_{2}^{(i)} \\ \cdots \\ \sum_{i=1}^{m}\left(X_{b}^{(i)} \theta-y^{(i)}\right) \cdot X_{n}^{(i)} \end{array}\right)$
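
The gradient above can be computed component by component with loops, or in one matrix expression, since the stacked sums equal (2/m) * X_b^T (X_b theta - y). The following is a minimal sketch of both forms, assuming the prediction is X_b @ theta with a leading column of ones in X_b. The function names and the synthetic data are assumptions introduced for illustration, not the article's own implementation.

```python
import numpy as np


def loss(theta, X_b, y):
    """Mean squared error J(theta) for predictions X_b @ theta."""
    residual = y - X_b @ theta
    return residual @ residual / len(y)


def gradient_loop(theta, X_b, y):
    """Gradient built one component at a time, mirroring the sums in the formula."""
    m, n_params = X_b.shape
    grad = np.zeros(n_params)
    for col in range(n_params):
        for i in range(m):
            grad[col] += (X_b[i] @ theta - y[i]) * X_b[i, col]
    return 2 / m * grad


def gradient_vectorized(theta, X_b, y):
    """Same gradient as a single matrix expression."""
    return 2 / len(y) * (X_b.T @ (X_b @ theta - y))


# Synthetic data (made up for this sketch): 50 samples, 2 features
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
X_b = np.hstack([np.ones((50, 1)), X])  # column of ones for theta_0
y = X_b @ np.array([4.0, 3.0, -2.0]) + rng.normal(scale=0.1, size=50)

theta = np.zeros(3)
print(np.allclose(gradient_loop(theta, X_b, y), gradient_vectorized(theta, X_b, y)))
```

The column of ones makes the first component reduce to the plain sum of residuals, which is the first row of the formula. The vectorized form avoids the Python-level double loop, which usually matters once m grows.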
