Dive into Deep Learning - 2.4 Calculus

Ryan-Lily

Last modified 2023-06-15 21:33:21

Tags: deep learning, study notes

First published 2023-06-11 22:29:47

Article link: https://blog.csdn.net/ye13213/article/details/131150230

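The task of fitting a model breaks down into two key concerns: optimization and generalization.

Optimization: the process of fitting a model to observed data.
Generalization: producing models whose validity extends beyond the dataset used to train them.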
2.4 Calculus

2.4.1 Derivatives and Differentiation

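The derivative is defined as the limit of the ratio between the increment of the dependent variable and the increment of the independent variable, as the latter tends to zero:
$f'(x) = \lim_{h \to 0}\frac{f(x + h) - f(x)}{h}$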

%matplotlib inline
import numpy as np
from matplotlib_inline import backend_inline
from d2l import torch as d2l

def f(x):
    return 3 * x ** 2 - 4 * x

def numerical_lim(f, x, h):
    # Forward difference quotient: approximates f'(x) for small h
    return (f(x + h) - f(x)) / h

h = 0.1
for i in range(5):
    print(f'h = {h:.5f}, numerical limit = {numerical_lim(f, 1, h):.5f}')
    h *= 0.1  # shrink the step so the quotient approaches f'(1) = 2
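
The behavior of the numerical limit can also be checked by hand. For $f(x) = 3x^{2} - 4x$, the function defined in the code above, expanding the difference quotient gives:
$\frac{f(x + h) - f(x)}{h} = \frac{3(x + h)^{2} - 4(x + h) - 3x^{2} + 4x}{h} = \frac{6xh + 3h^{2} - 4h}{h} = 6x - 4 + 3h$
Letting $h \to 0$ leaves $f'(x) = 6x - 4$, so $f'(1) = 2$. At $x = 1$ the forward difference quotient equals $2 + 3h$, which means the estimate overshoots the true derivative by exactly $3h$.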

Several equivalent notations for the derivative of $y = f(x)$
$f'(x) = y' = \frac{\mathrm{d} y}{\mathrm{d} x} =\frac{\mathrm{d} f}{\mathrm{d} x} =\frac{\mathrm{d}}{\mathrm{d} x}f(x)=Df(x)=D_{x}f(x)$
Basic derivative formulas
$\begin{aligned} DC & = 0 \quad (C \text{ is a constant})\\ Dx^{n} & = nx^{n-1}\\ De^{x} & = e^{x}\\ D\ln(x) & = 1/x \end{aligned}$
Basic rules of differentiation
Constant multiple rule
$\frac{\mathrm{d} [Cf(x)]}{\mathrm{d} x} =C\frac{\mathrm{d}}{\mathrm{d} x}f(x)$
Sum rule
$\frac{\mathrm{d} [f(x)+g(x)]}{\mathrm{d} x} =\frac{\mathrm{d}}{\mathrm{d} x}f(x)+\frac{\mathrm{d}}{\mathrm{d} x}g(x)$
Product rule
$\frac{\mathrm{d}[f(x)g(x)]}{\mathrm{d} x}=f(x)\frac{\mathrm{d}}{\mathrm{d} x}[g(x)]+g(x)\frac{\mathrm{d}}{\mathrm{d} x}[f(x)]$
Quotient rule
$\frac{\mathrm{d}}{\mathrm{d} x}\left[\frac{f(x)}{g(x)}\right]=\frac{g(x) \frac{\mathrm{d}}{\mathrm{d} x}[f(x)]-f(x) \frac{\mathrm{d}}{\mathrm{d} x}[g(x)]}{[g(x)]^{2}}$
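The product and quotient rules can be sanity-checked numerically. The sketch below is only an illustration: the functions `p` and `q`, the evaluation point `x0`, and the helper `central_diff` are hypothetical choices that do not come from the book. It compares each rule against a symmetric difference quotient.

```python
import math

def central_diff(func, x, step=1e-5):
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + step) - func(x - step)) / (2 * step)

# Hypothetical example functions, chosen only for illustration
def p(x):
    return x ** 3

def q(x):
    return math.exp(x)

def dp(x):
    return 3 * x ** 2      # power rule: D x^n = n x^(n-1)

def dq(x):
    return math.exp(x)     # D e^x = e^x

x0 = 1.5

# Product rule: (p q)' = p q' + q p'
product_rule = p(x0) * dq(x0) + q(x0) * dp(x0)
product_numeric = central_diff(lambda x: p(x) * q(x), x0)

# Quotient rule: (p / q)' = (q p' - p q') / q^2
quotient_rule = (q(x0) * dp(x0) - p(x0) * dq(x0)) / q(x0) ** 2
quotient_numeric = central_diff(lambda x: p(x) / q(x), x0)

print(f'product rule:  {product_rule:.6f} vs numeric {product_numeric:.6f}')
print(f'quotient rule: {quotient_rule:.6f} vs numeric {quotient_numeric:.6f}')
```

The names `p` and `q` are used instead of `f` and `g` so that running this block in the same session does not overwrite the `f` defined earlier.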
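Visualizing derivatives
Define three functions that configure the properties of the figures matplotlib produces

The use_svg_display() function displays plots in SVG format

def use_svg_display():
    backend_inline.set_matplotlib_formats('svg')

The set_figsize() function sets the figure size

def set_figsize(figsize = (3.5, 2.5)):
    use_svg_display()
    # rcParams is matplotlib's global settings dictionary
    d2l.plt.rcParams['figure.figsize'] = figsize

The set_axes() function sets the axis properties of the figure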

def set_axes(axes, xlabel, ylabel, xlim, ylim, xscale, yscale, legend):
    axes.set_xlabel(xlabel)
    axes.set_ylabel(ylabel)
    axes.set_xlim(xlim)
    axes.set_ylim(ylim)
    axes.set_xscale(xscale)
    axes.set_yscale(yscale)
    if legend:
        axes.legend(legend)
    axes.grid()

Define a plot function for drawing multiple curves

def plot(X, Y=None, xlabel=None, ylabel=None, legend=None, xlim=None,
         ylim=None, xscale='linear', yscale='linear',
         fmts=('-', 'm--', 'g-.', 'r:'), figsize=(3.5, 2.5), axes=None):
    if legend is None:
        legend = []

    set_figsize(figsize)
    axes = axes if axes else d2l.plt.gca()

    # Return True if X (an array/tensor or a list) has a single axis
    def has_one_axis(X):
        return (hasattr(X, "ndim") and X.ndim == 1 or
                isinstance(X, list) and not hasattr(X[0], "__len__"))
    
    if has_one_axis(X):
        X = [X]
    if Y is None:
        X, Y = [[]] * len(X), X
    elif has_one_axis(Y):
        Y = [Y]
    if len(X) != len(Y):
        X = X * len(Y)
    axes.cla()
    for x, y, fmt in zip(X, Y, fmts):
        if len(x):
            axes.plot(x, y, fmt)
        else:
            axes.plot(y, fmt)
    set_axes(axes, xlabel, ylabel, xlim, ylim, xscale, yscale, legend)

Plot the function f(x) together with its tangent line y = 2x - 3 at x = 1

x = np.arange(0, 3, 0.1)
plot(x, [f(x), 2 * x - 3], 'x', 'f(x)', legend = ['f(x)', 'Tangent line (x = 1)'])
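
The tangent line y = 2x - 3 follows from the slope f'(1) = 2 and the point (1, f(1)) = (1, -1). As a rough check, the sketch below recovers the slope and intercept numerically. The helper `tangent_line` and the name `quadratic` are illustrative assumptions, not part of d2l; `quadratic` simply repeats f(x) = 3x^2 - 4x so that the block runs on its own.

```python
def quadratic(x):
    # Same function as f(x) = 3x^2 - 4x from the text, redefined
    # so this block is self-contained
    return 3 * x ** 2 - 4 * x

def tangent_line(func, x0, step=1e-4):
    """Return (slope, intercept) of the tangent to func at x0.

    The slope is estimated with a symmetric difference quotient.
    """
    slope = (func(x0 + step) - func(x0 - step)) / (2 * step)
    intercept = func(x0) - slope * x0
    return slope, intercept

slope, intercept = tangent_line(quadratic, 1.0)
print(f'slope = {slope:.4f}, intercept = {intercept:.4f}')
```

The printed slope and intercept should be close to 2 and -3, matching the line passed to plot.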

2.4.2 Partial Derivatives

Let $y = f(x_{1} , x_{2} , ..., x_{n})$ be a function of n variables. The partial derivative of y with respect to its i-th parameter $x_{i}$ is:
$\frac{\partial y}{\partial x_{i}} =\lim_{h \to 0} \frac{f(x_{1},...,x_{i-1},x_{i}+h,x_{i+1},...,x_{n})-f(x_{1},...,x_{i},...,x_{n})}{h}$
The following notations for the partial derivative are equivalent:
$\frac{\partial y}{\partial x_{i}}= \frac{\partial f}{\partial x_{i}}=f_{x_{i}}=f_{i}=D_{i}f=D_{x_{i}}f$
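
The definition translates almost directly into code: perturb only the i-th coordinate and hold the others fixed. The sketch below is a minimal illustration. The helper `partial_derivative` and the example function `s` are hypothetical, and it uses a symmetric difference quotient (perturbing in both directions) rather than the one-sided quotient in the definition, since that is usually more accurate for a fixed step.

```python
def partial_derivative(func, point, i, step=1e-5):
    """Approximate the partial derivative of func with respect to point[i]."""
    forward = list(point)
    backward = list(point)
    forward[i] += step     # only coordinate i changes
    backward[i] -= step
    return (func(forward) - func(backward)) / (2 * step)

# Hypothetical example: s(x1, x2) = x1^2 + 3 * x1 * x2
def s(v):
    x1, x2 = v
    return x1 ** 2 + 3 * x1 * x2

point = [2.0, 1.0]
d_x1 = partial_derivative(s, point, 0)  # analytic value: 2*x1 + 3*x2 = 7
d_x2 = partial_derivative(s, point, 1)  # analytic value: 3*x1 = 6
print(f'ds/dx1 = {d_x1:.5f}, ds/dx2 = {d_x2:.5f}')
```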

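2.4.3 Gradients

Gradient vector: the vector formed by collecting the partial derivatives of a multivariate function with respect to all of its variables.
Let the input of the function $f:\mathbb{R}^{n}\to \mathbb{R}$ be an n-dimensional vector $\mathbf{x} = [x_{1},x_{2},...,x_{n}]^{T}$, and let its output be a scalar. The gradient of $f(\mathbf{x})$ with respect to $\mathbf{x}$ is a vector of n partial derivatives: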
$\nabla _{\mathbf{x} }f(\mathbf{x} )=[\frac{\partial f(\mathbf{x} )}{\partial x_{1}},\frac{\partial f(\mathbf{x} )}{\partial x_{2}},...,\frac{\partial f(\mathbf{x} )}{\partial x_{n}}]^{T}$
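
To make the definition concrete, the sketch below assembles a gradient one partial derivative at a time. It is an illustration under stated assumptions: `numerical_gradient` and `squared_norm` are hypothetical names, the step size is an arbitrary choice, and in practice a framework's automatic differentiation would be used instead. The example function is the squared Euclidean norm, whose gradient is known to be 2x.

```python
def numerical_gradient(func, point, step=1e-5):
    """Approximate the gradient of func at point, coordinate by coordinate."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += step
        backward[i] -= step
        # i-th entry: symmetric estimate of the i-th partial derivative
        grad.append((func(forward) - func(backward)) / (2 * step))
    return grad

# Hypothetical example: squared Euclidean norm, with gradient 2x
def squared_norm(v):
    return sum(c * c for c in v)

point = [1.0, -2.0, 3.0]
grad = numerical_gradient(squared_norm, point)
print([round(g, 4) for g in grad])
```

The result should be close to [2, -4, 6], that is, twice the input vector.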

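2.4.4 Chain Rule

The chain rule can be used to differentiate composite functions.
Suppose the differentiable function $y$ has variables $u_{1}, u_{2}, ..., u_{m}$, where each differentiable function $u_{i}$ has variables $x_{1}, x_{2}, ..., x_{n}$. Then for any $i = 1, 2, ..., n$, the chain rule gives: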
$\frac{\partial y}{\partial x_{i}}=\frac{\partial y}{\partial u_{1}}\frac{\partial u_{1}}{\partial x_{i}}+\frac{\partial y}{\partial u_{2}}\frac{\partial u_{2}}{\partial x_{i}}+...+\frac{\partial y}{\partial u_{m}}\frac{\partial u_{m}}{\partial x_{i}}$
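
A small worked case may help show how the sum over intermediate variables plays out. The sketch below assumes a hypothetical composite with m = 2 and n = 2, namely y = u1 * u2 with u1 = x1 + x2 and u2 = x1 * x2. None of these functions come from the book. It evaluates the chain-rule sum with hand-derived partials and compares it against a symmetric difference quotient of the composed function.

```python
def u_of_x(x1, x2):
    # Intermediate variables: u1 = x1 + x2, u2 = x1 * x2
    return x1 + x2, x1 * x2

def y_of_u(u1, u2):
    return u1 * u2

def y_of_x(x1, x2):
    return y_of_u(*u_of_x(x1, x2))

x1, x2 = 2.0, 3.0
u1, u2 = u_of_x(x1, x2)

# Hand-derived partial derivatives of each piece
dy_du = (u2, u1)       # (dy/du1, dy/du2)
du_dx1 = (1.0, x2)     # (du1/dx1, du2/dx1)
du_dx2 = (1.0, x1)     # (du1/dx2, du2/dx2)

# Chain rule: sum over the intermediate variables u_j
chain_dx1 = sum(a * b for a, b in zip(dy_du, du_dx1))
chain_dx2 = sum(a * b for a, b in zip(dy_du, du_dx2))

# Numerical check on the composed function
eps = 1e-5
numeric_dx1 = (y_of_x(x1 + eps, x2) - y_of_x(x1 - eps, x2)) / (2 * eps)
numeric_dx2 = (y_of_x(x1, x2 + eps) - y_of_x(x1, x2 - eps)) / (2 * eps)

print(f'dy/dx1: chain = {chain_dx1}, numeric = {numeric_dx1:.5f}')
print(f'dy/dx2: chain = {chain_dx2}, numeric = {numeric_dx2:.5f}')
```

Expanding the composite directly gives y = x1^2 * x2 + x1 * x2^2, so dy/dx1 = 2 * x1 * x2 + x2^2 = 21 and dy/dx2 = x1^2 + 2 * x1 * x2 = 16 at (2, 3), which agrees with the chain-rule sums.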