2.4.1 Derivatives and Differentiation
The derivative itself needs no introduction. The code below numerically approximates the derivative of $f(x)=3x^2-4x$ at $x=1$.
%matplotlib inline
import numpy as np
from matplotlib_inline import backend_inline
from d2l import torch as d2l

def f(x):  # define f(x)
    return 3 * x ** 2 - 4 * x

def numerical_lim(f, x, h):  # forward difference quotient approximating f'(x)
    return (f(x + h) - f(x)) / h

h = 0.1
for i in range(5):
    print(f'h={h:.5f}, numerical limit={numerical_lim(f, 1, h):.5f}')
    h *= 0.1
h=0.10000, numerical limit=2.30000
h=0.01000, numerical limit=2.03000
h=0.00100, numerical limit=2.00300
h=0.00010, numerical limit=2.00030
h=0.00001, numerical limit=2.00003
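The pattern in the printed values can be checked by hand. The following short derivation (added here as an illustration, not part of the original notebook) expands the difference quotient for this particular $f$ at $x=1$:

```latex
\frac{f(1+h)-f(1)}{h}
  =\frac{3(1+h)^2-4(1+h)-(3-4)}{h}
  =\frac{2h+3h^2}{h}
  =2+3h
```

The approximation error is therefore exactly $3h$, which agrees with the outputs 2.3, 2.03, 2.003, and so on, and the limit as $h \to 0$ is $f'(1)=2$.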
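Next, we use matplotlib to visualize this interpretation of the derivative.
Note: `#@save` is a marker used by the d2l package. Functions, classes, or statements tagged with it are saved into the d2l package, so they can be called later without being redefined.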
def use_svg_display():  #@save
    """Display plots in Jupyter using the SVG format."""
    backend_inline.set_matplotlib_formats('svg')

def set_figsize(figsize=(3.5, 2.5)):  #@save
    """Set the matplotlib figure size."""
    use_svg_display()
    # d2l.plt can be used directly here because the import statement
    # `from matplotlib import pyplot as plt` is marked as saved in the d2l package
    d2l.plt.rcParams['figure.figsize'] = figsize

#@save
def set_axes(axes, xlabel, ylabel, xlim, ylim, xscale, yscale, legend):
    """Set the matplotlib axes."""
    axes.set_xlabel(xlabel)
    axes.set_ylabel(ylabel)
    axes.set_xscale(xscale)
    axes.set_yscale(yscale)
    axes.set_xlim(xlim)
    axes.set_ylim(ylim)
    if legend:
        axes.legend(legend)
    axes.grid()

#@save
def plot(X, Y=None, xlabel=None, ylabel=None, legend=None, xlim=None,
         ylim=None, xscale='linear', yscale='linear',
         fmts=('-', 'm--', 'g-.', 'r:'), figsize=(3.5, 2.5), axes=None):
    """Plot data points."""
    if legend is None:
        legend = []

    set_figsize(figsize)
    axes = axes if axes else d2l.plt.gca()

    # Return True if X has exactly one axis
    def has_one_axis(X):
        return (hasattr(X, "ndim") and X.ndim == 1 or isinstance(X, list)
                and not hasattr(X[0], "__len__"))

    if has_one_axis(X):
        X = [X]
    if Y is None:
        X, Y = [[]] * len(X), X
    elif has_one_axis(Y):
        Y = [Y]
    if len(X) != len(Y):
        X = X * len(Y)
    axes.cla()
    for x, y, fmt in zip(X, Y, fmts):
        if len(x):
            axes.plot(x, y, fmt)
        else:
            axes.plot(y, fmt)
    set_axes(axes, xlabel, ylabel, xlim, ylim, xscale, yscale, legend)

x = np.arange(0, 3, 0.1)
plot(x, [f(x), 2 * x - 3], 'x', 'f(x)', legend=['f(x)', 'Tangent line (x=1)'])
2.4.2 Partial Derivatives
No further explanation needed.
2.4.3 Gradients
The gradient vector is simply a vector containing the $n$ partial derivatives:
$$\nabla_x f(x)=\left[\frac{\partial f(x)}{\partial x_1},\frac{\partial f(x)}{\partial x_2},\dots,\frac{\partial f(x)}{\partial x_n}\right]^T$$
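To make the definition concrete, here is a minimal sketch that approximates each partial derivative with a central difference and collects the results into a list. The helper `numerical_gradient` and the test function `quadratic` are hypothetical names introduced only for this illustration, and the step size `h=1e-6` is an assumed default rather than a recommended value.

```python
def numerical_gradient(f, x, h=1e-6):
    """Approximate the gradient of f at x (a list of floats)
    using central differences, one coordinate at a time."""
    grad = []
    for k in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[k] += h   # nudge only the k-th coordinate up
        x_minus[k] -= h  # and down
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


def quadratic(x):
    """Hypothetical test function: f(x) = x_1^2 + 3 * x_2."""
    return x[0] ** 2 + 3 * x[1]


# The analytic gradient is [2 * x_1, 3], i.e. [4, 3] at the point (2, 5)
grad = numerical_gradient(quadratic, [2.0, 5.0])
print([round(g, 4) for g in grad])
```

Each entry of the result should be close to the corresponding analytic partial derivative, up to floating-point error.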
2.4.4 The Chain Rule
No further explanation needed.
Exercises
(1) Plot the function $y=f(x)=x^3-\frac{1}{x}$ and its tangent line at $x=1$.
def g(x):  # define g(x)
    return x ** 3 - 1 / x

# g(x) is undefined at x = 0, so start the grid at 0.1 to avoid a
# divide-by-zero RuntimeWarning from NumPy
x = np.arange(0.1, 3, 0.1)
plot(x, [g(x), 4 * x - 4], 'x', 'g(x)', legend=['g(x)', 'Tangent line (x=1)'])
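The tangent line $4x-4$ passed to `plot` is not derived in the text. A short worked derivation, using the name $g$ from the code, is:

```latex
g(x)=x^3-\frac{1}{x},\qquad
g'(x)=3x^2+\frac{1}{x^2},\qquad
g(1)=0,\qquad
g'(1)=4
\quad\Longrightarrow\quad
y=g(1)+g'(1)(x-1)=4x-4
```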
(2) Find the gradient of the function $f(x)=3x_1^2+5e^{x_2}$.
Compute the two partial derivatives and write them as a vector.
Partial derivative with respect to $x_1$: $\frac{\partial f}{\partial x_1}=6x_1$
Partial derivative with respect to $x_2$: $\frac{\partial f}{\partial x_2}=5e^{x_2}$
So the gradient of $f(x)$ with respect to $x$ is:
$$\nabla_x f(x)=\left[6x_1,\ 5e^{x_2}\right]^T$$
(3) What is the gradient of the function $f(x)=\|x\|_2$?
The $L_2$ norm is defined as:
$$f(x)=\|x\|_2=\sqrt{\sum_{i=1}^n x_i^2}=\sqrt{x_1^2+x_2^2+\dots+x_n^2}$$
Its partial derivative with respect to $x_k$ is:
$$\frac{\partial f(x)}{\partial x_k}=\frac{x_k}{\sqrt{\sum_{i=1}^n x_i^2}}$$
So the gradient of $f(x)$ with respect to $x$, for $x \neq 0$, is:
$$\nabla_x f(x)=\left[\frac{x_1}{\sqrt{\sum_{i=1}^n x_i^2}},\frac{x_2}{\sqrt{\sum_{i=1}^n x_i^2}},\dots,\frac{x_n}{\sqrt{\sum_{i=1}^n x_i^2}}\right]^T=\frac{x}{\|x\|_2}$$
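As a sanity check on this result, the sketch below compares the analytic gradient with a central-difference estimate at one sample point. The function names, the sample point `[3.0, 4.0]`, and the step size `h` are all assumptions chosen for illustration. Only the Python standard library is used.

```python
import math


def l2_norm(x):
    """Euclidean (L2) norm of a list of floats."""
    return math.sqrt(sum(v * v for v in x))


def l2_norm_gradient(x):
    """Analytic gradient x / ||x||_2, which is undefined at the zero vector."""
    norm = l2_norm(x)
    if norm == 0:
        raise ValueError("the L2 norm is not differentiable at x = 0")
    return [v / norm for v in x]


x = [3.0, 4.0]  # hypothetical sample point with ||x||_2 = 5
analytic = l2_norm_gradient(x)

# Central-difference estimate of each partial derivative
h = 1e-6
numeric = []
for k in range(len(x)):
    x_plus, x_minus = list(x), list(x)
    x_plus[k] += h
    x_minus[k] -= h
    numeric.append((l2_norm(x_plus) - l2_norm(x_minus)) / (2 * h))

print(analytic)
print([round(v, 6) for v in numeric])
```

Both lists should agree to several decimal places. The analytic gradient also has norm 1 wherever it is defined, since it is $x$ divided by its own length.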
(4) Write out the chain rule for the function $u=f(x,y,z)$, where $x=x(a,b)$, $y=y(a,b)$, and $z=z(a,b)$.
$$\frac{\partial u}{\partial a}=\frac{\partial f}{\partial x}\frac{\partial x}{\partial a}+\frac{\partial f}{\partial y}\frac{\partial y}{\partial a}+\frac{\partial f}{\partial z}\frac{\partial z}{\partial a}$$
$$\frac{\partial u}{\partial b}=\frac{\partial f}{\partial x}\frac{\partial x}{\partial b}+\frac{\partial f}{\partial y}\frac{\partial y}{\partial b}+\frac{\partial f}{\partial z}\frac{\partial z}{\partial b}$$
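The two formulas can be checked on a concrete case. The sketch below assumes hypothetical inner functions $x=ab$, $y=a+b$, $z=a-b$ and a hypothetical outer function $f(x,y,z)=xy+z^2$. None of these come from the exercise; they are chosen only so the partial derivatives are easy to write by hand. It evaluates the chain-rule sums and compares them with central differences on the composed function.

```python
def inner(a, b):
    """Hypothetical inner functions x(a, b), y(a, b), z(a, b)."""
    return a * b, a + b, a - b


def outer(x, y, z):
    """Hypothetical outer function f(x, y, z) = x * y + z^2."""
    return x * y + z ** 2


def u(a, b):
    """Composed function u(a, b) = f(x(a, b), y(a, b), z(a, b))."""
    return outer(*inner(a, b))


def chain_rule(a, b):
    """Evaluate du/da and du/db using the chain-rule sums."""
    x, y, z = inner(a, b)
    # Partials of f with respect to x, y, z
    f_x, f_y, f_z = y, x, 2 * z
    # Partials of x, y, z with respect to a and b
    x_a, y_a, z_a = b, 1.0, 1.0
    x_b, y_b, z_b = a, 1.0, -1.0
    du_da = f_x * x_a + f_y * y_a + f_z * z_a
    du_db = f_x * x_b + f_y * y_b + f_z * z_b
    return du_da, du_db


a, b, h = 2.0, 3.0, 1e-6
du_da, du_db = chain_rule(a, b)

# Central differences on the composed function, for comparison
num_da = (u(a + h, b) - u(a - h, b)) / (2 * h)
num_db = (u(a, b + h) - u(a, b - h)) / (2 * h)

print(du_da, du_db)  # 19.0 18.0
print(round(num_da, 4), round(num_db, 4))
```

Expanding $u=a^2b+ab^2+(a-b)^2$ and differentiating directly gives the same values, 19 and 18, at $(a,b)=(2,3)$.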