Dive into Deep Learning (PyTorch Edition), 2.4 Calculus

AncilunKiang

Last modified 2023-06-08 22:29:29


First published 2023-06-08 22:17:09

Article link: https://blog.csdn.net/qq_43941037/article/details/131117003


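This post introduces derivatives and differentiation, uses Python code to compute numerical derivative values, and visualizes the result with matplotlib. It then covers partial derivatives and gradients and their use with multivariate functions, closes with the chain rule, and works through the related exercises.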


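2.4.1 Derivatives and Differentiation

The definition of the derivative needs no introduction here. In code, we can approximate the derivative of $f(x)=3x^2-4x$ at $x = 1$ by evaluating the difference quotient for progressively smaller $h$.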

%matplotlib inline
import numpy as np
from matplotlib_inline import backend_inline
from d2l import torch as d2l

def f(x):  # Define f(x)
    return 3 * x ** 2 - 4 * x

def numerical_lim(f, x, h):  # Forward difference quotient approximating f'(x)
    return (f(x + h) - f(x)) / h

h = 0.1
for i in range(5):
    print(f'h={h:.5f}, numerical limit={numerical_lim(f, 1, h):.5f}')
    h *= 0.1

h=0.10000, numerical limit=2.30000
h=0.01000, numerical limit=2.03000
h=0.00100, numerical limit=2.00300
h=0.00010, numerical limit=2.00030
h=0.00001, numerical limit=2.00003
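
For this quadratic the forward difference quotient equals $6x-4+3h$ exactly, which is why the printed values overshoot the true derivative $f'(1)=2$ by $3h$. The sketch below is not from the book: it compares the forward difference with a central difference, which cancels the first-order error term. The names `f_prime`, `forward_diff`, and `central_diff` are introduced here for illustration only.

```python
def f(x):
    return 3 * x ** 2 - 4 * x

def f_prime(x):
    # Analytic derivative, worked out by hand: d/dx (3x^2 - 4x) = 6x - 4
    return 6 * x - 4

def forward_diff(f, x, h):
    # Same quotient as numerical_lim in the post
    return (f(x + h) - f(x)) / h

def central_diff(f, x, h):
    # Symmetric quotient; its error shrinks like h^2 instead of h
    return (f(x + h) - f(x - h)) / (2 * h)

for h in (0.1, 0.01, 0.001):
    fwd_err = abs(forward_diff(f, 1, h) - f_prime(1))
    ctr_err = abs(central_diff(f, 1, h) - f_prime(1))
    print(f'h={h:.3f}, forward error={fwd_err:.5f}, central error={ctr_err:.5f}')
```

For a quadratic the central difference is exact up to floating-point rounding, so its error column should stay near zero while the forward error tracks $3h$.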

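We can visualize this interpretation of the derivative with matplotlib.

Note: `#@save` is a marker used by the d2l package. It saves the marked function, class, or statement into the d2l package so that it can be called later without being redefined.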

def use_svg_display():  #@save
    """Use the svg format to display plots in Jupyter."""
    backend_inline.set_matplotlib_formats('svg')

def set_figsize(figsize=(3.5, 2.5)):  #@save
    """Set the figure size for matplotlib."""
    use_svg_display()
    d2l.plt.rcParams['figure.figsize'] = figsize  # d2l.plt is available because the statement `from matplotlib import pyplot as plt` is marked to be saved in the d2l package

#@save
def set_axes(axes, xlabel, ylabel, xlim, ylim, xscale, yscale, legend):
    """Set the axes for matplotlib."""
    axes.set_xlabel(xlabel)
    axes.set_ylabel(ylabel)
    axes.set_xscale(xscale)
    axes.set_yscale(yscale)
    axes.set_xlim(xlim)
    axes.set_ylim(ylim)
    if legend:
        axes.legend(legend)
    axes.grid()

#@save
def plot(X, Y=None, xlabel=None, ylabel=None, legend=None, xlim=None,
         ylim=None, xscale='linear', yscale='linear',
         fmts=('-', 'm--', 'g-.', 'r:'), figsize=(3.5, 2.5), axes=None):
    """Plot data points."""
    if legend is None:
        legend = []

    set_figsize(figsize)
    axes = axes if axes else d2l.plt.gca()

    # Return True if X (an array or a list) has exactly one axis
    def has_one_axis(X):
        return (hasattr(X, "ndim") and X.ndim == 1 or isinstance(X, list)
                and not hasattr(X[0], "__len__"))

    if has_one_axis(X):
        X = [X]
    if Y is None:
        X, Y = [[]] * len(X), X
    elif has_one_axis(Y):
        Y = [Y]
    if len(X) != len(Y):
        X = X * len(Y)
    axes.cla()
    for x, y, fmt in zip(X, Y, fmts):
        if len(x):
            axes.plot(x, y, fmt)
        else:
            axes.plot(y, fmt)
    set_axes(axes, xlabel, ylabel, xlim, ylim, xscale, yscale, legend)

x = np.arange(0, 3, 0.1)
plot(x, [f(x), 2 * x - 3], 'x', 'f(x)', legend=['f(x)', 'Tangent line (x=1)'])

[Figure: $f(x)$ and its tangent line at $x = 1$]

2.4.2 Partial Derivatives

No further explanation needed.

2.4.3 Gradients

The gradient vector of a function of $n$ variables is simply the vector of its $n$ partial derivatives:
$\nabla_xf(x)=\left[\frac{\partial f(x)}{\partial x_1},\frac{\partial f(x)}{\partial x_2},\dots,\frac{\partial f(x)}{\partial x_n}\right]^T$
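
As a rough illustration of this definition, the gradient can be approximated one coordinate at a time with central differences. This is a minimal sketch, not code from the book: the helper `numerical_gradient`, the step size `h=1e-6`, and the test function `quad` are all assumptions made here for illustration.

```python
def numerical_gradient(f, x, h=1e-6):
    """Approximate the gradient of f at the point x (a list of floats)."""
    grad = []
    for i in range(len(x)):
        # Perturb only the i-th coordinate, holding the others fixed
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad

def quad(x):
    # Hypothetical example: x_1^2 + 3 * x_2, whose gradient is [2 * x_1, 3]
    return x[0] ** 2 + 3 * x[1]

print(numerical_gradient(quad, [2.0, 5.0]))
```

At the point $(2, 5)$ the printed values should be close to $[4, 3]$, matching the hand-computed partial derivatives.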

2.4.4 Chain Rule

No further explanation needed.

Exercises

(1) Plot the function $y=f(x)=x^3-\frac{1}{x}$ and its tangent line at $x = 1$.
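
The tangent line used in the plotting code follows from the derivative at $x = 1$. The short derivation here is worked by hand and is not part of the original post:
$f'(x)=3x^2+\frac{1}{x^2},\quad f(1)=0,\quad f'(1)=4$
$y=f(1)+f'(1)(x-1)=4x-4$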

def g(x):  # Define g(x)
    return x ** 3 - 1 / x

x = np.arange(0.1, 3, 0.1)  # Start at 0.1 because g is undefined at x = 0
plot(x, [g(x), 4 * x - 4], 'x', 'g(x)', legend=['g(x)', 'Tangent line (x=1)'])

With the grid `np.arange(0, 3, 0.1)`, NumPy emits `RuntimeWarning: divide by zero encountered in true_divide` at `return x ** 3 - 1 / x`, because the grid includes $x = 0$.

[Figure: $g(x)$ and its tangent line at $x = 1$]

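(2) Find the gradient of the function $f(x)=3x_1^2+5e^{x_2}$.

Compute the two partial derivatives and write them as a vector.

The partial derivative with respect to $x_1$ is: $\frac{\partial f}{\partial x_1}=6x_1$

The partial derivative with respect to $x_2$ is: $\frac{\partial f}{\partial x_2}=5e^{x_2}$

Hence the gradient of $f(x)$ with respect to $x$ is:
$\nabla_xf(x)=\left[6x_1\ ,\ 5e^{x_2}\right]^T$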
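
One way to sanity-check this result is to compare it against finite differences at a sample point. The sketch below is an illustration added here, not part of the original solution: the names `analytic_grad` and `finite_diff_grad` and the evaluation point $(1, 0)$ are arbitrary choices.

```python
import math

def f(x1, x2):
    return 3 * x1 ** 2 + 5 * math.exp(x2)

def analytic_grad(x1, x2):
    # Gradient derived by hand: [6 * x1, 5 * exp(x2)]
    return [6 * x1, 5 * math.exp(x2)]

def finite_diff_grad(x1, x2, h=1e-6):
    # Central differences, varying one argument at a time
    d1 = (f(x1 + h, x2) - f(x1 - h, x2)) / (2 * h)
    d2 = (f(x1, x2 + h) - f(x1, x2 - h)) / (2 * h)
    return [d1, d2]

point = (1.0, 0.0)  # hypothetical sample point
print(analytic_grad(*point))
print(finite_diff_grad(*point))
```

The two printed vectors should agree to several decimal places if the hand derivation is correct.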

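(3) What is the gradient of the function $f(x)=||x||_2$?

The $L_2$ norm is defined as:
$f(x)=||x||_2=\sqrt{\sum^n_{i=1}x^2_i}=\sqrt{x_1^2+x_2^2+\dots+x_n^2}$

By the chain rule, for $x \neq 0$ the partial derivative with respect to $x_k$ is:
$\frac{\partial f(x)}{\partial x_k}=\frac{1}{2\sqrt{\sum^n_{i=1}x^2_i}}\cdot 2x_k=\frac{x_k}{\sqrt{\sum^n_{i=1}x^2_i}}$

Hence the gradient of $f(x)$ with respect to $x$ is:
$\nabla_xf(x)=\left[\frac{x_1}{\sqrt{\sum^n_{i=1}x^2_i}},\frac{x_2}{\sqrt{\sum^n_{i=1}x^2_i}},\dots,\frac{x_n}{\sqrt{\sum^n_{i=1}x^2_i}}\right]^T=\frac{x}{||x||_2}$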
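
A consequence of this formula is that the gradient of the $L_2$ norm is the unit vector pointing in the direction of $x$. The following sketch, added here for illustration, evaluates the formula at a sample point and checks that the result has length 1. The function names and the point $(3, 4)$ are assumptions, not taken from the original post.

```python
import math

def l2_norm(x):
    return math.sqrt(sum(v * v for v in x))

def l2_norm_grad(x):
    # Each component is x_k / ||x||_2; the norm is not differentiable at x = 0
    norm = l2_norm(x)
    if norm == 0:
        raise ValueError('the L2 norm is not differentiable at x = 0')
    return [v / norm for v in x]

x = [3.0, 4.0]  # hypothetical sample point with ||x||_2 = 5
grad = l2_norm_grad(x)
print(grad)           # components x_k / 5
print(l2_norm(grad))  # length of the gradient vector
```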

(4) Write out the chain rule for the function $u = f (x, y, z)$, where $x = x (a, b)$, $y = y (a, b)$, and $z = z (a, b)$.

$\frac{\partial u}{\partial a}=\frac{\partial f}{\partial x}\frac{\partial x}{\partial a}+\frac{\partial f}{\partial y}\frac{\partial y}{\partial a}+\frac{\partial f}{\partial z}\frac{\partial z}{\partial a}$

$\frac{\partial u}{\partial b}=\frac{\partial f}{\partial x}\frac{\partial x}{\partial b}+\frac{\partial f}{\partial y}\frac{\partial y}{\partial b}+\frac{\partial f}{\partial z}\frac{\partial z}{\partial b}$
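
To see the two formulas in action, the sketch below picks concrete functions and compares the chain-rule values with a direct finite-difference estimate of the composite. The choices $f = xy + z^2$, $x = ab$, $y = a + b$, $z = a - b$ and the point $(a, b) = (2, 3)$ are hypothetical, made up here purely for illustration.

```python
def u(a, b):
    # Composite function u(a, b) = f(x(a, b), y(a, b), z(a, b))
    x, y, z = a * b, a + b, a - b
    return x * y + z ** 2

def chain_rule_partials(a, b):
    x, y, z = a * b, a + b, a - b
    # Partials of f = x * y + z^2 with respect to x, y, z
    df_dx, df_dy, df_dz = y, x, 2 * z
    # Partials of x = a * b, y = a + b, z = a - b with respect to a and b
    dx_da, dy_da, dz_da = b, 1.0, 1.0
    dx_db, dy_db, dz_db = a, 1.0, -1.0
    du_da = df_dx * dx_da + df_dy * dy_da + df_dz * dz_da
    du_db = df_dx * dx_db + df_dy * dy_db + df_dz * dz_db
    return du_da, du_db

a, b, h = 2.0, 3.0, 1e-6
du_da, du_db = chain_rule_partials(a, b)
# Central-difference estimates taken directly on the composite u(a, b)
num_da = (u(a + h, b) - u(a - h, b)) / (2 * h)
num_db = (u(a, b + h) - u(a, b - h)) / (2 * h)
print(du_da, num_da)
print(du_db, num_db)
```

Each printed pair should agree closely, since the chain rule and the direct estimate measure the same partial derivative.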