Zhang Xuegong, *Pattern Recognition* (4th ed.) — 02 Statistical Decision Methods

Chapter 2 Statistical Decision Methods

2.1 Introduction: A Simple Example

For a two-class problem, the probability of error on a sample $x$ is:

$$p(e \mid x)=\begin{cases}P(w_2 \mid x), & x \in w_1\\P(w_1 \mid x), & x \in w_2\end{cases}$$

The average error rate is:

$$P(e)=\int P(e \mid x)\,p(x)\,dx$$

2.2 Minimum-Error-Rate Bayes Decision

Bayes' formula:

$$P(w_i \mid x)=\frac{p(x, w_i)}{p(x)}=\frac{p(x \mid w_i)\,P(w_i)}{\sum_{j=1}^{2} p(x \mid w_j)\,P(w_j)},\quad i=1,2$$

The two-class minimum-error-rate Bayes decision, in four equivalent forms:

$$if\ P(w_1 \mid x) \gtrless P(w_2 \mid x),\ then\ x \in \begin{cases}w_1\\w_2\end{cases}$$

$$if\ p(x \mid w_1)P(w_1) \gtrless p(x \mid w_2)P(w_2),\ then\ x \in \begin{cases}w_1\\w_2\end{cases}$$

$$if\ l(x)=\frac{p(x \mid w_1)}{p(x \mid w_2)} \gtrless \lambda=\frac{P(w_2)}{P(w_1)},\ then\ x \in \begin{cases}w_1\\w_2\end{cases}$$

$$if\ h(x)=\ln \frac{p(x \mid w_2)}{p(x \mid w_1)} \lessgtr \ln \frac{P(w_1)}{P(w_2)},\ then\ x \in \begin{cases}w_1\\w_2\end{cases}$$

where $h(x)=-\ln l(x)$.
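A minimal sketch of these equivalent decision forms, assuming two hypothetical one-dimensional Gaussian class-conditional densities (all means, variances, and priors below are made up for illustration):

```python
import math

def gauss_pdf(x, mu, sigma):
    """1-D normal density N(mu, sigma^2)."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Hypothetical class-conditional densities and priors (illustrative values).
P_w1, P_w2 = 0.6, 0.4
p1 = lambda x: gauss_pdf(x, mu=0.0, sigma=1.0)   # p(x | w1)
p2 = lambda x: gauss_pdf(x, mu=2.0, sigma=1.0)   # p(x | w2)

def decide(x):
    # Posterior form: P(wi | x) is proportional to p(x | wi) P(wi);
    # the common factor p(x) cancels in the comparison.
    post1 = p1(x) * P_w1
    post2 = p2(x) * P_w2
    # Equivalent likelihood-ratio form: l(x) = p(x|w1)/p(x|w2) vs lambda = P(w2)/P(w1).
    l_x = p1(x) / p2(x)
    lam = P_w2 / P_w1
    assert (post1 > post2) == (l_x > lam)  # the two forms always agree
    return "w1" if post1 > post2 else "w2"

print(decide(0.5))  # near mu1 -> "w1"
print(decide(1.8))  # near mu2 -> "w2"
```

The internal `assert` makes the equivalence of the posterior and likelihood-ratio forms explicit: dividing $p(x\mid w_1)P(w_1) \gtrless p(x\mid w_2)P(w_2)$ through by $p(x\mid w_2)P(w_1)$ yields $l(x) \gtrless \lambda$.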

The error rate written out over the decision regions:

$$P(e)=P(w_2)\int_{R_1}p(x \mid w_2)\,dx+P(w_1)\int_{R_2}p(x \mid w_1)\,dx=P(w_2)P_2(e)+P(w_1)P_1(e)$$

The minimum-error-rate Bayes decision rule in the multiclass case:

$$if\ p(x \mid w_i)\,P(w_i)=\max_{j=1,2,\cdots,c} p(x \mid w_j)\,P(w_j),\ then\ x \in w_i$$

Error-rate computation in the multiclass case:

$$P(e)=\sum_{i=1}^{c}\sum_{\substack{j=1\\ j\neq i}}^{c}P(w_j)\int_{R_i}p(x \mid w_j)\,dx$$

In practice it is easier to go through the probability of correct classification:

$$P(e)=1-P(c)=1-\sum_{j=1}^{c}P(w_j)\int_{R_j}p(x \mid w_j)\,dx$$

(The full derivation of the multiclass average error rate is omitted in these notes.)

2.3 Minimum-Risk Bayes Decision

Setup: $d$-dimensional features, $c$ classes, $k$ possible decisions.

The expected loss incurred when a decision rule $\alpha(x)$ is applied to all possible samples $x$ in feature space is:

$$R(\alpha)=\int R(\alpha \mid x)\,p(x)\,dx$$

The posterior probabilities are computed with Bayes' formula:

$$P(w_j \mid x)=\frac{p(x \mid w_j)P(w_j)}{\sum_{i=1}^{c}p(x \mid w_i)P(w_i)},\quad j=1,2,\cdots,c$$

For a sample $x$, the expected loss (conditional risk) of taking decision $\alpha_i$, $i=1,2,\cdots,k$, is:

$$R(\alpha_i \mid x)=E\left(\lambda(\alpha_i,w_j) \mid x\right)=\sum_{j=1}^{c}\lambda(\alpha_i,w_j)\,P(w_j \mid x),\quad i=1,2,\cdots,k$$

The multiclass minimum-risk Bayes decision:

$$if\ R(\alpha_i \mid x)=\min_{j=1,2,\cdots,k} R(\alpha_j \mid x),\ then\ \alpha=\alpha_i$$

For two classes and two decisions, the minimum-risk Bayes decision takes the equivalent forms:

$$if\ \lambda_{11}P(\omega_1 \mid x)+\lambda_{12}P(\omega_2 \mid x) \lessgtr \lambda_{21}P(\omega_1 \mid x)+\lambda_{22}P(\omega_2 \mid x),\ then\ x \in \begin{cases}\omega_1\\\omega_2\end{cases}$$

$$if\ (\lambda_{11}-\lambda_{21})P(\omega_1 \mid x) \lessgtr (\lambda_{22}-\lambda_{12})P(\omega_2 \mid x),\ then\ x \in \begin{cases}\omega_1\\\omega_2\end{cases}$$

$$if\ \frac{P(\omega_1 \mid x)}{P(\omega_2 \mid x)}=\frac{p(x \mid \omega_1)P(\omega_1)}{p(x \mid \omega_2)P(\omega_2)} \gtrless \frac{\lambda_{12}-\lambda_{22}}{\lambda_{21}-\lambda_{11}},\ then\ x \in \begin{cases}\omega_1\\\omega_2\end{cases}$$

$$if\ l(x)=\frac{p(x \mid \omega_1)}{p(x \mid \omega_2)} \gtrless \frac{P(\omega_2)}{P(\omega_1)}\cdot\frac{\lambda_{12}-\lambda_{22}}{\lambda_{21}-\lambda_{11}},\ then\ x \in \begin{cases}\omega_1\\\omega_2\end{cases}$$
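A minimal sketch of the conditional-risk computation $R(\alpha_i \mid x)=\sum_j \lambda(\alpha_i,w_j)P(w_j \mid x)$, using a hypothetical 2×2 loss table and made-up posteriors (the asymmetric penalty is chosen to show the rule disagreeing with minimum-error-rate):

```python
# Hypothetical loss table: lam[i][j] = loss of taking decision alpha_i
# when the true class is w_j (asymmetric penalties, illustrative values).
lam = [[0.0, 6.0],   # decide w1: no loss if truly w1, heavy loss if truly w2
       [1.0, 0.0]]   # decide w2: mild loss if truly w1

def min_risk_decision(posteriors):
    """Return (index of the best decision, its conditional risk).

    R(alpha_i | x) = sum_j lam[i][j] * P(w_j | x)
    """
    risks = [sum(l_ij * p for l_ij, p in zip(row, posteriors)) for row in lam]
    best = min(range(len(risks)), key=risks.__getitem__)
    return best, risks[best]

# With P(w1|x)=0.8 the minimum-error-rate rule would pick w1, but the heavy
# penalty lam[0][1]=6 tips the minimum-risk rule toward w2 (index 1).
best, risk = min_risk_decision([0.8, 0.2])
print(best, risk)  # 1 0.8
```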

2.4 The Two Types of Error Rate, Neyman-Pearson Decision, and the ROC Curve

Possible relations between the true state and the decision:

| Decision \ State | Positive | Negative |
| --- | --- | --- |
| Positive | True positive (TP) | False positive (FP) |
| Negative | False negative (FN) | True negative (TN) |

Sensitivity: $Sn=\frac{TP}{TP+FN}$

Specificity: $Sp=\frac{TN}{TN+FP}$

Accuracy: $ACC=\frac{TP+TN}{TP+TN+FP+FN}$

Recall: $Rec=\frac{TP}{TP+FN}$

Precision: $Pre=\frac{TP}{TP+FP}$

F-measure: $F=\frac{2\,Rec \cdot Pre}{Rec+Pre}$

Type I error rate (false positive rate): $\alpha=1-Sp=\frac{FP}{TN+FP}$

Type II error rate (false negative rate): $\beta=1-Sn=\frac{FN}{TP+FN}$
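The definitions above can be sketched directly from raw confusion-matrix counts (the counts below are made up for illustration):

```python
def confusion_metrics(TP, FP, FN, TN):
    """Compute the rates defined above from confusion-matrix counts."""
    Sn = TP / (TP + FN)             # sensitivity == recall
    Sp = TN / (TN + FP)             # specificity
    ACC = (TP + TN) / (TP + TN + FP + FN)
    Pre = TP / (TP + FP)            # precision
    F = 2 * Sn * Pre / (Sn + Pre)   # F-measure (harmonic mean of Rec and Pre)
    alpha = 1 - Sp                  # type I error (false positive rate)
    beta = 1 - Sn                   # type II error (false negative rate)
    return {"Sn": Sn, "Sp": Sp, "ACC": ACC, "Pre": Pre, "F": F,
            "alpha": alpha, "beta": beta}

# Made-up counts for illustration.
m = confusion_metrics(TP=40, FP=10, FN=20, TN=30)
print(m)
```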

The Neyman-Pearson decision rule:

$$if\ l(x)=\frac{p(x \mid w_1)}{p(x \mid w_2)} \gtrless \lambda,\ then\ x \in \begin{cases}w_1\\w_2\end{cases}$$

For Gaussian or other simple distributions, $\lambda$ can be found analytically: it is the threshold that makes the decision regions satisfy the constraint below (fixing the rate at which $w_2$ samples are misclassified into $w_1$):

$$\int_{R_1}p(x \mid w_2)\,dx=\epsilon_0$$

In most cases $\lambda$ is found numerically from:

$$P_2(e)=1-\int_{0}^{\lambda} p(l \mid \omega_2)\,dl=\varepsilon_0$$

With $\varepsilon_0$ fixed and $P_2(e)$ monotone in $\lambda$, a simple trial-and-error search (e.g. bisection) suffices.
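A minimal sketch of that numerical search, assuming a hypothetical 1-D setting with $w_1 \sim N(0,1)$, $w_2 \sim N(2,1)$, and $R_1=\{x<x^*\}$; bisection on the region boundary $x^*$ exploits the monotonicity, and the corresponding likelihood-ratio threshold is recovered as $\lambda=l(x^*)$:

```python
import math

def norm_cdf(x, mu, sigma):
    """Gaussian CDF via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def gauss_pdf(x, mu, s):
    return math.exp(-(x - mu) ** 2 / (2 * s * s)) / (s * math.sqrt(2 * math.pi))

# Hypothetical 1-D setting: w1 ~ N(0,1), w2 ~ N(2,1); R1 = {x < x*}.
mu1, mu2, sigma = 0.0, 2.0, 1.0
eps0 = 0.05  # required rate of w2 samples falsely put in R1

# P2(e) = integral over R1 of p(x|w2) dx = Phi((x* - mu2)/sigma) is
# monotone increasing in x*, so bisection on x* finds the boundary.
lo, hi = -10.0, 10.0
for _ in range(100):
    mid = (lo + hi) / 2
    if norm_cdf(mid, mu2, sigma) < eps0:
        lo = mid
    else:
        hi = mid
x_star = (lo + hi) / 2

# The corresponding likelihood-ratio threshold lambda = l(x*).
lam = gauss_pdf(x_star, mu1, sigma) / gauss_pdf(x_star, mu2, sigma)
print(round(x_star, 4), round(lam, 4))
```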

2.5 Statistical Decisions under Normal Distributions

2.5.1 Review of the Normal Distribution and Its Properties

The multivariate normal density:

$$p(x)=\frac{1}{(2\pi)^{d/2}|\Sigma|^{1/2}}\exp\left\{-\frac{1}{2}(x-\mu)^{\top}\Sigma^{-1}(x-\mu)\right\}$$

For multivariate normal class-conditional densities $p(x \mid w_i)\sim N(\mu_i,\Sigma_i)$, $i=1,2,\cdots,c$:

$$p(x \mid w_i)=\frac{1}{(2\pi)^{d/2}|\Sigma_i|^{1/2}}\exp\left\{-\frac{1}{2}(x-\mu_i)^{\top}\Sigma_i^{-1}(x-\mu_i)\right\}$$

For a finite sample set the parameters are estimated by:

$$\mu_i=E(x)=\frac{1}{N_i}\sum_{x_j\in H_i}x_j,\quad i=1,2,\cdots,c$$

$$\Sigma_i=E\left[(x-\mu_i)(x-\mu_i)^{T}\right]=\frac{1}{N_i}\sum_{x_j\in H_i}(x_j-\mu_i)(x_j-\mu_i)^{T},\quad i=1,2,\cdots,c$$

The equal-density points of a multivariate normal lie on a hyperellipsoid, satisfying:

$$(x-\mu)^{T}\Sigma^{-1}(x-\mu)=\text{const}$$

This motivates the squared Mahalanobis distance:

$$\gamma^{2}=(x-\mu)^{T}\Sigma^{-1}(x-\mu)$$

The volume of the hyperellipsoid at Mahalanobis distance $\gamma$ is:

$$V=V_d|\Sigma|^{1/2}\gamma^{d}$$

where $V_d$ is the volume of the $d$-dimensional unit hypersphere:

$$V_d=\begin{cases}\dfrac{\pi^{d/2}}{(d/2)!}, & d\ \text{even}\\[2ex]\dfrac{2^{d}\pi^{(d-1)/2}\left(\frac{d-1}{2}\right)!}{d!}, & d\ \text{odd}\end{cases}$$
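The parity formula for $V_d$ can be sanity-checked in a few lines against the familiar low-dimensional volumes:

```python
import math

def unit_ball_volume(d):
    """Volume V_d of the d-dimensional unit hypersphere, by parity of d."""
    if d % 2 == 0:
        return math.pi ** (d // 2) / math.factorial(d // 2)
    k = (d - 1) // 2
    return 2 ** d * math.pi ** k * math.factorial(k) / math.factorial(d)

# Sanity checks against familiar low-dimensional cases.
print(unit_ball_volume(1))  # 2       (interval length)
print(unit_ball_volume(2))  # pi      (disc area)
print(unit_ball_volume(3))  # 4*pi/3  (ball volume)
```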

A linear transform of a multivariate normal random vector is again multivariate normal: if $x \sim N(\mu,\Sigma)$ and $y=Ax$, then:

$$y \sim N(A\mu,\, A\Sigma A^{T})$$

2.5.2 Minimum-Error-Rate Bayes Decision under Normal Probability Models

The discriminant function is:

$$g_i(x)=\ln\left[p(x \mid w_i)P(w_i)\right]=-\frac{1}{2}(x-\mu_i)^{\mathrm{T}}\Sigma_i^{-1}(x-\mu_i)-\frac{d}{2}\ln 2\pi-\frac{1}{2}\ln|\Sigma_i|+\ln P(\omega_i)$$

The decision rule:

$$if\ g_k(x)=\max_{i=1,2,\cdots,c} g_i(x),\ then\ x \in w_k$$

The decision surface between classes $i$ and $j$:

$$g_i(x)=g_j(x)$$

Below are the 3×2 special cases of the multivariate normal discriminant functions, decision rules (always: assign to the class with the largest discriminant), and decision surfaces (always: set two discriminant functions equal):

i): $\Sigma_i=\sigma^{2}I$

a): $P(w_i)=P(w_j)$

Discriminant function:

$$g_i(x)=-\left\|x-\mu_i\right\|^{2}$$

Decision rule:

$$if\ g_k(x)=\max_{i=1,2,\cdots,c} g_i(x),\ then\ x \in w_k$$

(i.e. $\min_{i=1,2,\cdots,c}\|x-\mu_i\|^{2}$: the minimum-distance classifier)

Decision surface:

$$(\mu_i-\mu_j)^{\top}\left(x-\frac{1}{2}(\mu_i+\mu_j)\right)=0$$

(i.e. $w^{T}(x-x_0)=0$)
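The minimum-distance classifier of this case fits in a few lines; the 2-D class means below are made up for illustration:

```python
# Hypothetical 2-D class means (illustrative values).
means = {"w1": (0.0, 0.0), "w2": (4.0, 0.0), "w3": (0.0, 4.0)}

def classify(x):
    """Assign x to the nearest mean, i.e. maximize g_i(x) = -||x - mu_i||^2."""
    def sq_dist(mu):
        return sum((xk - mk) ** 2 for xk, mk in zip(x, mu))
    return min(means, key=lambda w: sq_dist(means[w]))

print(classify((1.0, 1.0)))  # closest to (0,0) -> "w1"
print(classify((3.0, 0.5)))  # closest to (4,0) -> "w2"
```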

b): $P(w_i)\neq P(w_j)$

Discriminant function:

$$g_i(x)=\left(\frac{\mu_i}{\sigma^{2}}\right)^{\top}x-\frac{1}{2\sigma^{2}}\mu_i^{\top}\mu_i+\ln P(w_i)=w_i^{\top}x+w_{i0}$$

Decision rule:

$$if\ g_k(x)=\max_{i=1,2,\cdots,c} g_i(x),\ then\ x \in w_k$$

Decision surface:

$$(\mu_i-\mu_j)^{T}\left(x-\left(\frac{1}{2}(\mu_i+\mu_j)-\frac{\sigma^{2}}{\left\|\mu_i-\mu_j\right\|^{2}}\ln\frac{P(\omega_i)}{P(w_j)}\,(\mu_i-\mu_j)\right)\right)=0$$

(i.e. $w^{T}(x-x_0)=0$)

ii): $\Sigma_i=\Sigma$

a): $P(w_i)=P(w_j)$

Discriminant function: classify by minimum Mahalanobis distance $\gamma_i^{2}=(x-\mu_i)^{\top}\Sigma^{-1}(x-\mu_i)$; dropping the class-independent quadratic term, this is equivalent to maximizing the linear function

$$g_i(x)=\left(\Sigma^{-1}\mu_i\right)^{\top}x-\frac{1}{2}\mu_i^{\top}\Sigma^{-1}\mu_i$$

Decision rule:

$$if\ g_k(x)=\max_{i=1,2,\cdots,c} g_i(x),\ then\ x \in w_k$$

Decision surface:

$$\left(\Sigma^{-1}(\mu_i-\mu_j)\right)^{T}\left(x-\frac{1}{2}(\mu_i+\mu_j)\right)=0$$

(i.e. $w^{T}(x-x_0)=0$)

b): $P(w_i)\neq P(w_j)$

Discriminant function:

$$g_i(x)=\left(\Sigma^{-1}\mu_i\right)^{\top}x-\frac{1}{2}\mu_i^{\top}\Sigma^{-1}\mu_i+\ln P(w_i)=w_i^{\top}x+w_{i0}$$

Decision rule:

$$if\ g_k(x)=\max_{i=1,2,\cdots,c} g_i(x),\ then\ x \in w_k$$

Decision surface:

$$\left[\Sigma^{-1}(\mu_i-\mu_j)\right]^{\top}\left(x-\left(\frac{1}{2}(\mu_i+\mu_j)-\frac{\ln\frac{P(w_i)}{P(w_j)}}{(\mu_i-\mu_j)^{\top}\Sigma^{-1}(\mu_i-\mu_j)}\,(\mu_i-\mu_j)\right)\right)=0$$

(i.e. $w^{T}(x-x_0)=0$)

iii): The class covariance matrices are unequal

Discriminant function:

$$g_i(x)=x^{\top}\left(-\frac{1}{2}\Sigma_i^{-1}\right)x+\left(\Sigma_i^{-1}\mu_i\right)^{\top}x-\frac{1}{2}\mu_i^{\top}\Sigma_i^{-1}\mu_i-\frac{1}{2}\ln|\Sigma_i|+\ln P(w_i)$$

(i.e. $g_i(x)=x^{T}W_i x+w_i^{T}x+w_{i0}$, a quadratic discriminant)

Decision rule:

$$if\ g_k(x)=\max_{i=1,2,\cdots,c} g_i(x),\ then\ x \in w_k$$

Decision surface:

$$x^{\top}(W_i-W_j)x+(w_i-w_j)^{T}x+w_{i0}-w_{j0}=0$$
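A minimal 2-D sketch of this quadratic discriminant (the class means, diagonal covariance matrices, and priors below are hypothetical, and the class-independent $-\frac{d}{2}\ln 2\pi$ term is dropped since it never affects the comparison):

```python
import math

def inv2(S):
    """Inverse and determinant of a 2x2 matrix."""
    (a, b), (c, d) = S
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]], det

def quad_discriminant(x, mu, Sigma, prior):
    """g_i(x) = -1/2 (x-mu)^T Sigma^{-1} (x-mu) - 1/2 ln|Sigma| + ln P(w_i)."""
    Sinv, det = inv2(Sigma)
    dx = [x[0] - mu[0], x[1] - mu[1]]
    quad = sum(dx[r] * Sinv[r][c] * dx[c] for r in range(2) for c in range(2))
    return -0.5 * quad - 0.5 * math.log(det) + math.log(prior)

# Hypothetical classes with unequal covariance matrices.
classes = {
    "w1": ((-1.0, 0.0), [[1.0, 0.0], [0.0, 1.0]], 0.5),
    "w2": ((1.0, 0.0), [[4.0, 0.0], [0.0, 0.25]], 0.5),
}

def classify(x):
    return max(classes, key=lambda w: quad_discriminant(x, *classes[w]))

print(classify((-1.0, 0.2)))  # near mu1 -> "w1"
print(classify((2.0, 0.1)))   # near mu2 -> "w2"
```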

2.6 Computing the Error Rate

2.6.1 Error Rate when Classes Are Normal with Equal Covariance Matrices

Earlier we wrote the minimum-error-rate Bayes rule in negative-log-likelihood-ratio form:

$$h(x)=-\ln l(x)=\ln\frac{p(x \mid w_2)}{p(x \mid w_1)}$$

It can be shown that under each class $h(x)$ follows a one-dimensional normal distribution, so the densities $p(h \mid w_1)$ and $p(h \mid w_2)$ are available in closed form. Define:

$$\eta=\frac{1}{2}\left[(\mu_1-\mu_2)^{\top}\Sigma^{-1}(\mu_1-\mu_2)\right]$$

Then for $p(h \mid w_1)$: $\eta_1=-\eta$, $\sigma_1^{2}=2\eta$;

and for $p(h \mid w_2)$: $\eta_2=\eta$, $\sigma_2^{2}=2\eta$.

Therefore:

$$P_1(e)=\int_{t}^{+\infty}p(h \mid w_1)\,dh=\int_{\frac{t+\eta}{\sigma}}^{+\infty}\frac{1}{\sqrt{2\pi}}e^{-\zeta^{2}/2}\,d\zeta$$

$$P_2(e)=\int_{-\infty}^{t}p(h \mid w_2)\,dh=\int_{-\infty}^{\frac{t-\eta}{\sigma}}\frac{1}{\sqrt{2\pi}}e^{-\zeta^{2}/2}\,d\zeta$$

where $t=\ln\frac{P(w_1)}{P(w_2)}$ and $\sigma=\sqrt{2\eta}$.

Finally:

$$P(e)=P(w_1)\cdot P_1(e)+P(w_2)\cdot P_2(e)$$
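The closed form above can be sketched directly, with the Gaussian tail computed via the error function; the illustrative setting below is two 1-D unit-variance classes with means 0 and 2, so $\eta=\frac{1}{2}(\mu_1-\mu_2)^2=2$:

```python
import math

def std_norm_tail(z):
    """P(Z > z) for standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def bayes_error_equal_cov(eta, P_w1, P_w2):
    """P(e) from the closed form: sigma = sqrt(2*eta), t = ln(P(w1)/P(w2)),
    P1(e) = P(Z > (t+eta)/sigma), P2(e) = P(Z < (t-eta)/sigma)."""
    sigma = math.sqrt(2 * eta)
    t = math.log(P_w1 / P_w2)
    P1 = std_norm_tail((t + eta) / sigma)
    P2 = 1 - std_norm_tail((t - eta) / sigma)  # lower tail
    return P_w1 * P1 + P_w2 * P2

# Illustration: 1-D classes N(0,1) and N(2,1) give eta = 2; with equal
# priors t = 0 and P(e) = P(Z > eta/sigma) = P(Z > 1).
pe = bayes_error_equal_cov(eta=2.0, P_w1=0.5, P_w2=0.5)
print(round(pe, 4))  # 0.1587
```

The result matches the direct geometric argument: with equal priors the decision boundary sits halfway between the means, and each class contributes the tail probability beyond one standard deviation.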

2.6.2 Estimating the Error Rate for High-Dimensional Independent Variables

When the feature dimensions are mutually independent, $h(x)$ is a sum of per-dimension terms, so under each class it is (approximately) normal:

$$(h(x) \mid \omega_i) \sim N(\eta_i, \sigma_i^{2})$$

where:

$$\eta_i=\sum_{l=1}^{d}\eta_{il}$$

$$\sigma_i^{2}=\sum_{l=1}^{d}\sigma_{il}^{2}$$

2.7 Statistical Decisions for Discrete-Time Sequence Samples

2.7.2 Markov Models and Bayes Decisions under Markov Models

(This concerns estimating probability models for discrete variables.)

First-order Markov chain:

$$P(x_i \mid x_{i-1},x_{i-2},\cdots,x_1)=P(x_i \mid x_{i-1})$$

Transition probability:

$$a_{st}=P(x_i=t \mid x_{i-1}=s)$$

The probability of observing a given sequence (with initial state $x_0$) is:

$$P(x)=P(x_0,x_1,\cdots,x_L)=P(x_0)\prod_{i=1}^{L}a_{x_{i-1}x_i}$$

Log-likelihood-ratio discrimination between two first-order Markov chain models ('+' and '−'):

$$S(x)=\log\frac{P(x \mid +)}{P(x \mid -)}=\sum_{i=1}^{L}\log\frac{a^{+}_{x_{i-1}x_i}}{a^{-}_{x_{i-1}x_i}}=\sum_{i=1}^{L}\beta_{x_{i-1}x_i}$$

The transition matrices are estimated from observed transition counts $c_{st}$:

$$a_{st}^{+}=\frac{c_{st}^{+}}{\sum_{t'}c_{st'}^{+}},\qquad a_{st}^{-}=\frac{c_{st}^{-}}{\sum_{t'}c_{st'}^{-}}$$
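A minimal sketch of both steps over a hypothetical two-letter alphabet (the training sequences are made up, and a pseudocount of 1 is added to the counts, an assumption beyond the formulas above, so that no estimated probability is exactly zero):

```python
import math

def estimate_transitions(seqs, states="AB"):
    """Estimate a_st = c_st / sum_t' c_st' from transition counts,
    with a pseudocount of 1 so no probability is exactly zero."""
    c = {s: {t: 1 for t in states} for s in states}
    for seq in seqs:
        for s, t in zip(seq, seq[1:]):
            c[s][t] += 1
    return {s: {t: c[s][t] / sum(c[s].values()) for t in states} for s in states}

def log_odds_score(seq, a_pos, a_neg):
    """S(x) = sum_i log( a+_{x_{i-1} x_i} / a-_{x_{i-1} x_i} )."""
    return sum(math.log(a_pos[s][t] / a_neg[s][t]) for s, t in zip(seq, seq[1:]))

# Made-up training data: the '+' class tends to stay in the same state,
# the '-' class tends to alternate.
a_pos = estimate_transitions(["AAAABBBB", "AABBBAAA"])
a_neg = estimate_transitions(["ABABABAB", "BABABABA"])

print(log_odds_score("AAABBB", a_pos, a_neg) > 0)  # repetitive -> '+' class
print(log_odds_score("ABABAB", a_pos, a_neg) < 0)  # alternating -> '-' class
```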
