[Python3] Machine Learning: Hand-Written Gradient Descent with Plots of theta and loss

kikook

Article link: https://blog.csdn.net/chenhanxuan1999/article/details/104832476

As the title says, this post implements gradient descent by hand in Python 3 and visualizes the process.

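Parameter settings:

initial_theta=0 (arbitrary starting point; 0 is used here)

eta=0.05 (step size, i.e. the learning rate)

n_iters=1000 (maximum number of iterations)

epslion=1e-8 (tolerance; the loop exits early once the loss changes by less than this between two consecutive iterations)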
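
To make the role of each parameter concrete, here is a minimal sketch of a single update step and of the early-exit test. The helper names `gd_step` and `should_stop` are hypothetical and exist only for this illustration; the quadratic loss and its gradient are the ones used later in this post.

```python
def gd_step(theta, grad, eta=0.05):
    """One gradient descent update: move theta against the gradient by step size eta."""
    return theta - eta * grad(theta)


def should_stop(loss, new_theta, old_theta, epslion=1e-8):
    """Early-exit test: True once the loss changes by less than epslion between two steps."""
    return abs(loss(new_theta) - loss(old_theta)) < epslion


# Quadratic example: loss(theta) = (theta - 3) ** 2, whose gradient is 2 * theta - 6
loss = lambda theta: (theta - 3) ** 2
grad = lambda theta: 2 * theta - 6

theta0 = 0.0                    # initial_theta
theta1 = gd_step(theta0, grad)  # 0 - 0.05 * (-6), i.e. roughly 0.3
print(theta1, should_stop(loss, theta1, theta0))
```

After the first step the loss still changes by far more than `epslion`, so the early-exit test does not fire yet.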

1. Results

1.1 Gradient descent on a linear function

Loss function: $loss\left ( \theta \right ) = 2 * (\theta - 3)$

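As can be seen, the loss is still decreasing when the maximum of 1000 iterations is reached. A linear function has no minimum, so gradient descent cannot converge on it.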
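
The reason is that the gradient of a linear loss is the constant 2, so every step shifts theta by eta * 2 = 0.1 and lowers the loss by 0.2, which never drops below epslion. A minimal sketch of this behavior, assuming the linear loss `2 * (theta - 3)` from the commented-out lines of the code in section 2 (the function name `run_linear` is hypothetical):

```python
def run_linear(initial_theta=0.0, eta=0.05, n_iters=1000, epslion=1e-8):
    """Gradient descent on a linear loss; returns the final theta and the number of updates."""
    loss = lambda theta: 2 * (theta - 3)  # linear loss, unbounded below
    grad = lambda theta: 2.0              # constant gradient, independent of theta

    theta = initial_theta
    steps = 0
    for _ in range(n_iters):
        new_theta = theta - eta * grad(theta)
        steps += 1
        converged = abs(loss(new_theta) - loss(theta)) < epslion
        theta = new_theta
        if converged:  # never true here: the loss drops by about 0.2 per step
            break
    return theta, steps


final_theta, steps = run_linear()
print(steps, final_theta)  # all 1000 iterations are used; theta ends near -100
```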

1.2 Gradient descent on a quadratic function

Loss function: $loss\left ( \theta \right ) = (\theta - 3)^{2}$

As can be seen, with epslion=1e-8 the required precision is reached and the iteration converges after more than 80 but fewer than 100 iterations.
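
The iteration count can be checked without plotting. With eta=0.05 the update becomes theta ← 0.9 * theta + 0.3, so the distance to the minimum at theta = 3 shrinks by a factor of 0.9 per step and the loss by a factor of 0.81. The sketch below counts the updates until the early-exit test fires; the function name `count_iterations` is hypothetical.

```python
def count_iterations(initial_theta=0.0, eta=0.05, n_iters=1000, epslion=1e-8):
    """Run gradient descent on (theta - 3) ** 2; return (updates performed, final theta)."""
    loss = lambda theta: (theta - 3) ** 2
    grad = lambda theta: 2 * theta - 6

    theta = initial_theta
    for i in range(n_iters):
        new_theta = theta - eta * grad(theta)
        if abs(loss(new_theta) - loss(theta)) < epslion:
            return i + 1, new_theta  # loss change fell below the tolerance
        theta = new_theta
    return n_iters, theta  # tolerance never reached


n_updates, final_theta = count_iterations()
print(n_updates, final_theta)
```

Because the exit test looks at the change in loss rather than at theta itself, the returned theta is close to 3 but not exactly 3.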

2. Code

import numpy as np
import warnings
import matplotlib.pyplot as plt

warnings.filterwarnings("ignore")


def draw_process(x_start, x_end, func=None, y_list=None):
    # Generate x_end - x_start + 1 evenly spaced values from x_start to x_end (one per recorded theta)
    plot_x_iter_times = np.linspace(x_start, x_end, x_end - x_start + 1)

    if func is not None and y_list is not None:
        plot_y_loss = [func(ele) for ele in y_list]  # compute the loss for each theta in the list
        plot_y_theta = y_list  # y_list is theta_list itself; plot how theta changes over the iterations
    else:
        return

    plt.xlabel('Iteration Times')  # x-axis label
    plt.ylabel('Loss && Theta')  # y-axis label
    plt.title('Gradient Descent Visualization')  # plot title

    plt.plot(plot_x_iter_times, plot_y_loss, color="blue", linewidth=2.5, linestyle="-", label="Loss")
    plt.plot(plot_x_iter_times, plot_y_theta, color="red", linewidth=2.5, linestyle="-", label="Theta")
    plt.legend(loc='upper left')   # place the legend in the upper-left corner

    plt.show()


def gradient_descent(initial_theta, eta=0.05, n_iters=1000, epslion=1e-8):
    '''
    Gradient descent
    :param initial_theta: initial parameter value, float
    :param eta: learning rate, float
    :param n_iters: maximum number of iterations, int
    :param epslion: tolerance on the change in loss, float
    :return: tuple (final theta, list of theta values per iteration, loss function)
    '''

    #   Add the implementation here   #
    # ********** Begin *********#
    def cal_gradient(theta):
        return 2 * theta - 6  # for the quadratic loss, the gradient is linear in theta
        # return 2            # for the linear loss, the gradient is the constant 2, independent of theta

    def loss_func(theta):  # loss function; theta is optimized to minimize it
        return (theta - 3) ** 2  # quadratic loss: loss(theta) = (theta - 3) ** 2
        # return 2 * (theta - 3)  # linear loss: loss(theta) = 2 * (theta - 3)

    cur_theta = initial_theta  # current value of theta
    latest_theta = cur_theta  # theta from the previous iteration, used for the epslion early-exit test

    theta_list = [latest_theta, ]  # theta at each iteration, kept for plotting theta && loss afterwards

    for i in range(n_iters):  # at most n_iters iterations
        #  update the parameter
        cur_theta = cur_theta - eta * cal_gradient(cur_theta)  # gradient descent step
        if abs(loss_func(cur_theta) - loss_func(latest_theta)) < epslion:  # loss barely changes, exit early
            break
        latest_theta = cur_theta  # store the current theta for the next iteration
        theta_list.append(latest_theta)
    # ********** End **********#

    return cur_theta, theta_list, loss_func


def main():
    final_theta, theta_list, loss_func = gradient_descent(0, eta=0.05, n_iters=1000, epslion=1e-8)
    print("final theta = {}".format(final_theta))
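    iter_times = len(theta_list)  # number of recorded theta values (the initial value plus one per completed iteration)
    draw_process(0, iter_times - 1, func=loss_func, y_list=theta_list)  # plot how theta && loss change during the iterations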


if __name__ == "__main__":
    main()
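
Since `cal_gradient` is derived by hand, a quick finite-difference check can catch a mismatch between the loss and its gradient, for instance after switching between the linear and quadratic variants. The following is a minimal sketch using a central difference. The helper name `numerical_gradient`, the step `h=1e-6`, and the chosen test points are assumptions for illustration and not part of the code above.

```python
def numerical_gradient(f, theta, h=1e-6):
    """Approximate f'(theta) with a central difference of half-width h."""
    return (f(theta + h) - f(theta - h)) / (2 * h)


def loss_func(theta):
    return (theta - 3) ** 2  # quadratic loss from section 1.2


def cal_gradient(theta):
    return 2 * theta - 6  # hand-derived gradient of the quadratic loss


# Compare the hand-derived gradient with the numerical one at a few arbitrary points
for theta in (-2.0, 0.0, 3.0, 5.5):
    approx = numerical_gradient(loss_func, theta)
    exact = cal_gradient(theta)
    assert abs(approx - exact) < 1e-4, (theta, approx, exact)
print("gradient check passed")
```

If the loss is switched to the linear variant while the gradient is left unchanged, the assertion fails, which flags the inconsistency before any plot is drawn.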