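807 Supplement (4): Gaussian Integrals
I. The Jacobian Matrix
The Jacobian matrix can be viewed as a way of organizing gradient vectors. The previous three parts gave the definition of the gradient and the formulas for computing it; with the gradient in hand, the Jacobian matrix is easy to define: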
$$
\mathrm{D}_{\mathbf x}\mathbf f\overset{\text{def}}{=}(\nabla_{\mathbf x}\mathbf f)^T
$$
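It is easy to see that when the Jacobian matrix is square, $\mathbf f(\mathbf x)$ and $\mathbf x$ are vectors of the same dimension. If the Jacobian matrix is not square, the map from $\mathbf x$ to $\mathbf f(\mathbf x)$ either lowers or raises the dimension.
In differential geometry, the Jacobian matrix is used to assess whether a transformation between two spaces is smooth.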
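As a rough numerical illustration of the definition above, the following Python sketch approximates a Jacobian matrix with central differences. The names `numerical_jacobian` and `example_map` and the step size `h` are assumptions made for this example only; they do not come from the original text.

```python
def numerical_jacobian(f, x, h=1e-6):
    """Approximate the Jacobian of f at x with central differences.

    f maps a list of n floats to a list of m floats. The result is an
    m-by-n nested list whose (i, j) entry approximates d f_i / d x_j.
    """
    n = len(x)
    m = len(f(x))
    jac = [[0.0] * n for _ in range(m)]
    for j in range(n):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += h
        x_minus[j] -= h
        f_plus = f(x_plus)
        f_minus = f(x_minus)
        for i in range(m):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2 * h)
    return jac


# Hypothetical map from R^3 to R^2, so the Jacobian is 2-by-3 (not square).
def example_map(x):
    return [x[0] * x[1], x[1] + x[2] ** 2]


J = numerical_jacobian(example_map, [1.0, 2.0, 3.0])
# Analytic Jacobian at (1, 2, 3): [[2, 1, 0], [0, 1, 6]]
```

Because this map sends three inputs to two outputs, the Jacobian has two rows and three columns, which is the dimension-lowering case described above.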
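II. The Jacobian Determinant
When the Jacobian matrix is square, its determinant can be used to change variables in multiple integrals. In two dimensions, the following change of variables holds for double integrals: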
$$
\begin{aligned}
\iint_{D_1} f(x,y)\,\mathrm{d}x\,\mathrm{d}y &= \iint_{D_2} f(g(u,v),h(u,v))\,|J|\,\mathrm{d}u\,\mathrm{d}v\\
x &= g(u,v)\\
y &= h(u,v)
\end{aligned}
$$
where
$$
J=\frac{\partial(x,y)}{\partial(u,v)}
=\det\begin{bmatrix}\dfrac{\partial x}{\partial u}&\dfrac{\partial x}{\partial v}\\[2mm] \dfrac{\partial y}{\partial u}&\dfrac{\partial y}{\partial v}\end{bmatrix}
$$
is the determinant of the Jacobian matrix and $|J|$ is its absolute value. $D_1$ is the region of integration in the $(x,y)$ plane and $D_2$ is the corresponding region in the $(u,v)$ plane.
The same formula generalizes to $n$-dimensional space.
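To make the change-of-variables formula concrete, here is a small Python sketch for polar coordinates, $x=r\cos\theta$, $y=r\sin\theta$, where the Jacobian determinant equals $r$. The helper names, the unit-disk test region, and the grid sizes are assumptions chosen for illustration.

```python
import math


def polar_jacobian_det(r, theta):
    """Determinant of d(x, y)/d(r, theta) for x = r cos(theta), y = r sin(theta)."""
    dx_dr, dx_dtheta = math.cos(theta), -r * math.sin(theta)
    dy_dr, dy_dtheta = math.sin(theta), r * math.cos(theta)
    return dx_dr * dy_dtheta - dx_dtheta * dy_dr


def integrate_over_unit_disk(f, n_r=200, n_theta=200):
    """Midpoint rule in (r, theta); each cell is weighted by |J|."""
    dr = 1.0 / n_r
    dtheta = 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for k in range(n_theta):
            theta = (k + 0.5) * dtheta
            x, y = r * math.cos(theta), r * math.sin(theta)
            total += f(x, y) * abs(polar_jacobian_det(r, theta)) * dr * dtheta
    return total


# Area of the unit disk (exact value: pi).
area = integrate_over_unit_disk(lambda x, y: 1.0)
# Integral of x^2 + y^2 over the unit disk (exact value: pi / 2).
second_moment = integrate_over_unit_disk(lambda x, y: x * x + y * y)
```

Dropping the `abs(polar_jacobian_det(...))` factor would make both results wrong, which is a quick way to see why $|J|$ appears in the formula.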
III. Gaussian Integrals
"Physicists have only learned how to evaluate Gaussian integrals."
1. $I=\int_{-\infty}^{\infty}e^{-\frac{x^2}{2}}\,\mathrm{d}x$
$$
\begin{aligned}
I&=\sqrt{I^2}\\
&=\sqrt{\int_{-\infty}^{\infty}e^{-\frac{x^2}{2}}\,\mathrm{d}x\int_{-\infty}^{\infty}e^{-\frac{y^2}{2}}\,\mathrm{d}y}\\
&=\sqrt{\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}e^{-\frac{1}{2}(x^2+y^2)}\,\mathrm{d}x\,\mathrm{d}y}\\
&=\sqrt{\int_{0}^{2\pi}\mathrm{d}\theta\int_{0}^{\infty}e^{-\frac{1}{2}r^2}\,r\,\mathrm{d}r}\\
&=\sqrt{2\pi}
\end{aligned}
$$
The fourth line switches to polar coordinates, whose Jacobian determinant is $r$.
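The value $\sqrt{2\pi}$ can be sanity-checked numerically. The sketch below uses a plain trapezoid rule; the helper `trapezoid`, the truncation to $[-12, 12]$, and the number of subintervals are assumptions made for this check.

```python
import math


def trapezoid(f, lo, hi, n=20000):
    """Composite trapezoid rule for f on [lo, hi] with n subintervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


# The integrand is negligible outside [-12, 12], so truncating there is harmless.
numeric = trapezoid(lambda x: math.exp(-x * x / 2), -12.0, 12.0)
exact = math.sqrt(2 * math.pi)
gap = abs(numeric - exact)
```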
2. $I=\int_{-\infty}^{\infty}e^{-\frac{ax^2}{2}+jx}\,\mathrm{d}x$, with $a>0$
$$
\begin{aligned}
I&=\int_{-\infty}^{\infty}e^{-\frac{ax^2}{2}+jx}\,\mathrm{d}x\\
&=\int_{-\infty}^{\infty}e^{-\frac{1}{2}a\left(x^2-\frac{2j}{a}x+\frac{j^2}{a^2}\right)+\frac{j^2}{2a}}\,\mathrm{d}x\\
&=e^{\frac{j^2}{2a}}\int_{-\infty}^{\infty}e^{-\frac{1}{2}a\left(x-\frac{j}{a}\right)^2}\,\mathrm{d}x\\
&=e^{\frac{j^2}{2a}}\sqrt{\frac{2\pi}{a}}
\end{aligned}
$$
The constant added when completing the square is $\frac{a}{2}\cdot\frac{j^2}{a^2}=\frac{j^2}{2a}$. The last step substitutes $t=\sqrt{a}\left(x-\frac{j}{a}\right)$ and uses result 1.
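As a hedged numerical check of item 2, the sketch below compares a trapezoid-rule estimate against $e^{j^2/(2a)}\sqrt{2\pi/a}$, the value obtained by completing the square. The sample values of `a` and `j`, the helper `trapezoid`, and the truncation interval are assumptions for illustration.

```python
import math


def trapezoid(f, lo, hi, n=40000):
    """Composite trapezoid rule for f on [lo, hi] with n subintervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


a, j = 2.0, 1.5  # illustrative values; a must be positive for convergence
numeric = trapezoid(lambda x: math.exp(-a * x * x / 2 + j * x), -20.0, 20.0)
closed_form = math.exp(j * j / (2 * a)) * math.sqrt(2 * math.pi / a)
gap = abs(numeric - closed_form)
```

Choosing $a \neq 1$ matters here: with $a=1$ the exponents $j^2/(2a)$ and $j^2/(2a^2)$ coincide, so a slip in the power of $a$ would go unnoticed.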
3. $n$-fold Gaussian integral: $I=\int_{-\infty}^{\infty}e^{-\frac{1}{2}\sum_{i=1}^{n}x_i^2}\,\mathrm{d}x_1\,\mathrm{d}x_2\cdots\mathrm{d}x_n=\int_{-\infty}^{\infty}e^{-\frac{1}{2}\mathbf x^T\mathbf x}\,\mathrm{d}\mathbf x$
The integrand factorizes into $n$ one-dimensional integrals, so from result 1 it follows that $I=(2\pi)^{\frac{n}{2}}$.
4. General quadratic form: $I=\int_{-\infty}^{\infty}e^{-\frac{1}{2}\mathbf x^T\mathbf K\mathbf x}\,\mathrm{d}\mathbf x$, where $\mathbf K$ is a positive definite matrix.
Factor $\mathbf K=\mathbf S^T\mathbf S$ (for example, by Cholesky factorization), so that $-\frac{1}{2}\mathbf x^T\mathbf K\mathbf x=-\frac{1}{2}(\mathbf S\mathbf x)^T(\mathbf S\mathbf x)$, and let $\mathbf y=\mathbf S\mathbf x$. Then
$$
\begin{aligned}
I&=\int_{-\infty}^{\infty}e^{-\frac{1}{2}\mathbf x^T\mathbf K\mathbf x}\,\mathrm{d}\mathbf x\\
&=\int_{-\infty}^{\infty}e^{-\frac{1}{2}(\mathbf S\mathbf x)^T(\mathbf S\mathbf x)}\,\mathrm{d}\mathbf x\\
&=\int_{-\infty}^{\infty}e^{-\frac{1}{2}\mathbf y^T\mathbf y}\left|\frac{\partial \mathbf x}{\partial \mathbf y}\right|\mathrm{d}\mathbf y\\
&=\left|\frac{\partial \mathbf x}{\partial \mathbf y}\right|(2\pi)^{\frac{n}{2}}\\
&=|\det\mathbf S|^{-1}(2\pi)^{\frac{n}{2}}\\
&=(\det\mathbf K)^{-\frac{1}{2}}(2\pi)^{\frac{n}{2}}\\
&=\sqrt{\frac{(2\pi)^n}{\det\mathbf K}}
\end{aligned}
$$
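The factorization argument can be traced numerically for a small case. The following sketch builds an upper-triangular $\mathbf S$ with $\mathbf K=\mathbf S^T\mathbf S$ by a textbook Cholesky routine and compares $(2\pi)^{n/2}/\det\mathbf S$ with a brute-force grid estimate for $n=2$. The matrix `K`, the function names, and the grid parameters are illustrative assumptions, and `gaussian_integral_2d` assumes `K` is symmetric.

```python
import math


def cholesky_upper(K):
    """Return upper-triangular S with K = S^T S (K symmetric positive definite)."""
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(K[i][i] - s)
            else:
                L[i][j] = (K[i][j] - s) / L[j][j]
    # S is the transpose of the lower-triangular factor L.
    return [[L[j][i] for j in range(n)] for i in range(n)]


def gaussian_integral_2d(K, half_width=10.0, n=400):
    """Midpoint-rule estimate of the integral of exp(-x^T K x / 2) over the plane."""
    h = 2 * half_width / n
    total = 0.0
    for i in range(n):
        x0 = -half_width + (i + 0.5) * h
        for k in range(n):
            x1 = -half_width + (k + 0.5) * h
            q = K[0][0] * x0 * x0 + 2 * K[0][1] * x0 * x1 + K[1][1] * x1 * x1
            total += math.exp(-0.5 * q)
    return total * h * h


K = [[2.0, 0.5], [0.5, 1.0]]        # illustrative positive definite matrix
S = cholesky_upper(K)
det_S = S[0][0] * S[1][1]           # determinant of a triangular matrix
closed_form = (2 * math.pi) / det_S  # (2*pi)^(n/2) / det S with n = 2
numeric = gaussian_integral_2d(K)
```

Here `det_S ** 2` equals $\det\mathbf K = 1.75$ up to rounding, which is the step $|\det\mathbf S|^{-1}=(\det\mathbf K)^{-1/2}$ in the derivation.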
5. $I=\int_{-\infty}^{\infty}e^{-\frac{1}{2}\mathbf x^T\mathbf K\mathbf x+\mathbf b^T\mathbf x}\,\mathrm{d}\mathbf x$
Completing the square gives
$$
-\frac{1}{2}\mathbf{x}^T\mathbf{K}\mathbf{x}+\mathbf{b}^T\mathbf{x}
=-\frac{1}{2}\left(\mathbf{x}-\mathbf{K}^{-1}\mathbf{b}\right)^T\mathbf{K}\left(\mathbf{x}-\mathbf{K}^{-1}\mathbf{b}\right)+\frac{1}{2}\mathbf{b}^T\mathbf{K}^{-1}\mathbf{b}
$$
Shifting $\mathbf x$ by $\mathbf K^{-1}\mathbf b$ and applying result 4 yields $I=\sqrt{\frac{(2\pi)^n}{\det\mathbf K}}\,e^{\frac{1}{2}\mathbf b^T\mathbf K^{-1}\mathbf b}$.
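A quick way to gain confidence in item 5 is to evaluate both sides for a concrete $2\times 2$ case. In the sketch below, the matrix `K`, the vector `b`, the helper names, and the grid settings are all assumptions chosen for illustration; `K` is assumed symmetric.

```python
import math


def inverse_2x2(K):
    """Return (inverse, determinant) of a 2-by-2 matrix."""
    det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    inv = [[K[1][1] / det, -K[0][1] / det],
           [-K[1][0] / det, K[0][0] / det]]
    return inv, det


def integral_with_linear_term(K, b, half_width=12.0, n=480):
    """Midpoint-rule estimate of the integral of exp(-x^T K x / 2 + b^T x)."""
    h = 2 * half_width / n
    total = 0.0
    for i in range(n):
        x0 = -half_width + (i + 0.5) * h
        for k in range(n):
            x1 = -half_width + (k + 0.5) * h
            q = K[0][0] * x0 * x0 + 2 * K[0][1] * x0 * x1 + K[1][1] * x1 * x1
            total += math.exp(-0.5 * q + b[0] * x0 + b[1] * x1)
    return total * h * h


K = [[2.0, 0.5], [0.5, 1.0]]  # illustrative positive definite matrix
b = [0.3, -0.4]               # illustrative linear coefficient
K_inv, det_K = inverse_2x2(K)
b_Kinv_b = sum(b[i] * K_inv[i][j] * b[j] for i in range(2) for j in range(2))
closed_form = math.sqrt((2 * math.pi) ** 2 / det_K) * math.exp(0.5 * b_Kinv_b)
numeric = integral_with_linear_term(K, b)
```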
IV. Examples
(1) The density of the $n$-dimensional Gaussian distribution is
$$
p(\mathbf x)=\frac{1}{(2\pi)^{\frac{n}{2}}}\frac{1}{\sqrt{\det\Sigma}}\,e^{-\frac{1}{2}(\mathbf x-\mu)^T\Sigma^{-1}(\mathbf x-\mu)}
$$
$$
\begin{aligned}
\int_{-\infty}^{\infty}p(\mathbf x)\,\mathrm{d}\mathbf x
&=\frac{1}{(2\pi)^{\frac{n}{2}}}\frac{1}{\sqrt{\det\Sigma}}\int_{-\infty}^{\infty}e^{-\frac{1}{2}(\mathbf x-\mu)^T\Sigma^{-1}(\mathbf x-\mu)}\,\mathrm{d}\mathbf x\\
&=\frac{1}{(2\pi)^{\frac{n}{2}}}\frac{1}{\sqrt{\det\Sigma}}\int_{-\infty}^{\infty}e^{-\frac{1}{2}\mathbf x^T\Sigma^{-1}\mathbf x}\,\mathrm{d}\mathbf x\\
&=\frac{1}{(2\pi)^{\frac{n}{2}}}\frac{1}{\sqrt{\det\Sigma}}\sqrt{\frac{(2\pi)^n}{\det\Sigma^{-1}}}\\
&=1
\end{aligned}
$$
The second line shifts the integration variable by $\mu$. The third line is the general quadratic-form Gaussian integral $\int e^{-\frac{1}{2}\mathbf x^T\mathbf K\mathbf x}\,\mathrm{d}\mathbf x=\sqrt{(2\pi)^n/\det\mathbf K}$ with $\mathbf K=\Sigma^{-1}$, for which $\det\mathbf K=1/\det\Sigma$.
(2) The Bhattacharyya distance is defined as
$$
BD(p(\mathbf x),q(\mathbf x))=-\log\int\sqrt{p(\mathbf x)\,q(\mathbf x)}\,\mathrm{d}\mathbf x
$$
For two normal distributions, the Bhattacharyya distance is the negative logarithm of the following integral:
$$
\int\sqrt{p(\boldsymbol{x})q(\boldsymbol{x})}\,\mathrm{d}\boldsymbol{x}
=\frac{1}{\sqrt[4]{(2\pi)^{2n}\det\left(\boldsymbol{\Sigma}_p\boldsymbol{\Sigma}_q\right)}}
\times\int\exp\left\{-\frac{1}{4}\left(\boldsymbol{x}-\boldsymbol{\mu}_p\right)^{\top}\boldsymbol{\Sigma}_p^{-1}\left(\boldsymbol{x}-\boldsymbol{\mu}_p\right)-\frac{1}{4}\left(\boldsymbol{x}-\boldsymbol{\mu}_q\right)^{\top}\boldsymbol{\Sigma}_q^{-1}\left(\boldsymbol{x}-\boldsymbol{\mu}_q\right)\right\}\mathrm{d}\boldsymbol{x}
$$
Let $\boldsymbol{y}=\boldsymbol{x}-\boldsymbol{\mu}_p$ and $\boldsymbol{\Delta}=\boldsymbol{\mu}_p-\boldsymbol{\mu}_q$, so that $\boldsymbol{x}-\boldsymbol{\mu}_q=\boldsymbol{y}+\boldsymbol{\Delta}$. Changing variables gives
$$
\begin{aligned}
&\int\exp\left\{-\frac{1}{4}\boldsymbol{y}^{\top}\boldsymbol{\Sigma}_p^{-1}\boldsymbol{y}-\frac{1}{4}(\boldsymbol{y}+\boldsymbol{\Delta})^{\top}\boldsymbol{\Sigma}_q^{-1}(\boldsymbol{y}+\boldsymbol{\Delta})\right\}\mathrm{d}\boldsymbol{y}\\
={}&\int\exp\left\{-\frac{1}{4}\boldsymbol{y}^{\top}\left(\boldsymbol{\Sigma}_p^{-1}+\boldsymbol{\Sigma}_q^{-1}\right)\boldsymbol{y}-\frac{1}{2}\boldsymbol{\Delta}^{\top}\boldsymbol{\Sigma}_q^{-1}\boldsymbol{y}-\frac{1}{4}\boldsymbol{\Delta}^{\top}\boldsymbol{\Sigma}_q^{-1}\boldsymbol{\Delta}\right\}\mathrm{d}\boldsymbol{y}\\
={}&\int\exp\left\{-\frac{1}{2}\boldsymbol{y}^{\top}\left(\boldsymbol{\Sigma}_p^{-1}\boldsymbol{\Sigma}\boldsymbol{\Sigma}_q^{-1}\right)\boldsymbol{y}-\frac{1}{2}\boldsymbol{\Delta}^{\top}\boldsymbol{\Sigma}_q^{-1}\boldsymbol{y}-\frac{1}{4}\boldsymbol{\Delta}^{\top}\boldsymbol{\Sigma}_q^{-1}\boldsymbol{\Delta}\right\}\mathrm{d}\boldsymbol{y}
\end{aligned}
$$
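where $\boldsymbol{\Sigma}=\frac{1}{2}\left(\boldsymbol{\Sigma}_p+\boldsymbol{\Sigma}_q\right)$, so that $\boldsymbol{\Sigma}_p^{-1}\boldsymbol{\Sigma}\boldsymbol{\Sigma}_q^{-1}=\frac{1}{2}\left(\boldsymbol{\Sigma}_p^{-1}+\boldsymbol{\Sigma}_q^{-1}\right)$.
Applying the Gaussian integral with a linear term, with $\mathbf K=\boldsymbol{\Sigma}_p^{-1}\boldsymbol{\Sigma}\boldsymbol{\Sigma}_q^{-1}$ and $\mathbf b=-\frac{1}{2}\boldsymbol{\Sigma}_q^{-1}\boldsymbol{\Delta}$, the integral evaluates to
$$
\sqrt{(2\pi)^n\det\left(\boldsymbol{\Sigma}_q\boldsymbol{\Sigma}^{-1}\boldsymbol{\Sigma}_p\right)}\,\exp\left\{-\frac{1}{8}\boldsymbol{\Delta}^{\top}\boldsymbol{\Sigma}^{-1}\boldsymbol{\Delta}\right\}
$$
Alternatively, the following identity can be applied directly: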
$$
\begin{aligned}
&-\frac{1}{2}\left(\mathbf{x}-\mathbf{m}_1\right)^T\boldsymbol{\Sigma}_1^{-1}\left(\mathbf{x}-\mathbf{m}_1\right)-\frac{1}{2}\left(\mathbf{x}-\mathbf{m}_2\right)^T\boldsymbol{\Sigma}_2^{-1}\left(\mathbf{x}-\mathbf{m}_2\right)\\
&\quad=-\frac{1}{2}\left(\mathbf{x}-\mathbf{m}_c\right)^T\boldsymbol{\Sigma}_c^{-1}\left(\mathbf{x}-\mathbf{m}_c\right)+C\\
&\boldsymbol{\Sigma}_c^{-1}=\boldsymbol{\Sigma}_1^{-1}+\boldsymbol{\Sigma}_2^{-1}\\
&\mathbf{m}_c=\left(\boldsymbol{\Sigma}_1^{-1}+\boldsymbol{\Sigma}_2^{-1}\right)^{-1}\left(\boldsymbol{\Sigma}_1^{-1}\mathbf{m}_1+\boldsymbol{\Sigma}_2^{-1}\mathbf{m}_2\right)\\
&C=\frac{1}{2}\left(\mathbf{m}_1^T\boldsymbol{\Sigma}_1^{-1}+\mathbf{m}_2^T\boldsymbol{\Sigma}_2^{-1}\right)\left(\boldsymbol{\Sigma}_1^{-1}+\boldsymbol{\Sigma}_2^{-1}\right)^{-1}\left(\boldsymbol{\Sigma}_1^{-1}\mathbf{m}_1+\boldsymbol{\Sigma}_2^{-1}\mathbf{m}_2\right)\\
&\qquad-\frac{1}{2}\left(\mathbf{m}_1^T\boldsymbol{\Sigma}_1^{-1}\mathbf{m}_1+\mathbf{m}_2^T\boldsymbol{\Sigma}_2^{-1}\mathbf{m}_2\right)
\end{aligned}
$$
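The identity for combining two quadratic forms is easy to mistype, so a scalar spot check can be useful. The sketch below specializes it to one dimension, where each $\boldsymbol{\Sigma}_i$ becomes a variance `v1` or `v2`; the function names and the sample parameters are assumptions for illustration.

```python
def combine(m1, v1, m2, v2):
    """Scalar version of the identity: returns (m_c, v_c, C)."""
    precision = 1.0 / v1 + 1.0 / v2   # Sigma_c^{-1}
    v_c = 1.0 / precision             # Sigma_c
    linear = m1 / v1 + m2 / v2        # Sigma_1^{-1} m_1 + Sigma_2^{-1} m_2
    m_c = v_c * linear
    C = 0.5 * linear * v_c * linear - 0.5 * (m1 * m1 / v1 + m2 * m2 / v2)
    return m_c, v_c, C


def lhs(x, m1, v1, m2, v2):
    """Sum of the two original quadratic forms."""
    return -0.5 * (x - m1) ** 2 / v1 - 0.5 * (x - m2) ** 2 / v2


def rhs(x, m1, v1, m2, v2):
    """Single combined quadratic form plus the constant C."""
    m_c, v_c, C = combine(m1, v1, m2, v2)
    return -0.5 * (x - m_c) ** 2 / v_c + C


params = (0.0, 1.0, 1.0, 4.0)  # illustrative (m1, v1, m2, v2)
max_gap = max(abs(lhs(x, *params) - rhs(x, *params))
              for x in [-3.0, -0.5, 0.0, 1.2, 5.0])
```

If the two sides agree at more than three distinct points, the two quadratics in $x$ must be identical, so a handful of sample points is enough for the scalar case.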
So finally
$$
\begin{aligned}
BD(p(\boldsymbol{x}),q(\boldsymbol{x}))
&=-\log\left[\frac{\sqrt{(2\pi)^n\det\left(\boldsymbol{\Sigma}_q\boldsymbol{\Sigma}^{-1}\boldsymbol{\Sigma}_p\right)}}{\sqrt[4]{(2\pi)^{2n}\det\left(\boldsymbol{\Sigma}_p\boldsymbol{\Sigma}_q\right)}}\exp\left\{-\frac{1}{8}\boldsymbol{\Delta}^{\top}\boldsymbol{\Sigma}^{-1}\boldsymbol{\Delta}\right\}\right]\\
&=-\log\left[\frac{\sqrt[4]{\det\left(\boldsymbol{\Sigma}_p\boldsymbol{\Sigma}_q\right)}}{\sqrt{\det(\boldsymbol{\Sigma})}}\exp\left\{-\frac{1}{8}\boldsymbol{\Delta}^{\top}\boldsymbol{\Sigma}^{-1}\boldsymbol{\Delta}\right\}\right]\\
&=\frac{1}{2}\log\frac{\det(\boldsymbol{\Sigma})}{\sqrt{\det\left(\boldsymbol{\Sigma}_p\boldsymbol{\Sigma}_q\right)}}+\frac{1}{8}\left(\boldsymbol{\mu}_p-\boldsymbol{\mu}_q\right)^{\top}\boldsymbol{\Sigma}^{-1}\left(\boldsymbol{\mu}_p-\boldsymbol{\mu}_q\right)
\end{aligned}
$$
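For one-dimensional Gaussians the closed form reduces to scalars, which makes it easy to compare against the defining integral $-\log\int\sqrt{p(x)q(x)}\,\mathrm{d}x$. The following sketch does that comparison; the function names, the sample means and variances, and the integration range are assumptions for illustration.

```python
import math


def normal_pdf(x, mu, var):
    """Density of a one-dimensional normal distribution with variance var."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)


def bd_closed_form(mu_p, var_p, mu_q, var_q):
    """Scalar case of the closed form, with Sigma = (var_p + var_q) / 2."""
    var = 0.5 * (var_p + var_q)
    log_term = 0.5 * math.log(var / math.sqrt(var_p * var_q))
    mean_term = (mu_p - mu_q) ** 2 / (8 * var)
    return log_term + mean_term


def bd_numeric(mu_p, var_p, mu_q, var_q, lo=-40.0, hi=40.0, n=80000):
    """Midpoint-rule estimate of -log of the integral of sqrt(p * q)."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += math.sqrt(normal_pdf(x, mu_p, var_p) * normal_pdf(x, mu_q, var_q))
    return -math.log(total * h)


closed = bd_closed_form(0.0, 1.0, 1.0, 4.0)
numeric = bd_numeric(0.0, 1.0, 1.0, 4.0)
gap = abs(closed - numeric)
```

When the two distributions are identical, both terms of the closed form vanish and the distance is zero, as expected of a distance-like quantity.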