Chapter 4: The Naive Bayes Method

Reference: Li Hang, 《统计学习方法》 (Statistical Learning Methods)

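The naive Bayes method is a classification method based on Bayes' theorem and the assumption that the features are conditionally independent.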

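Given a training data set, the method first learns the joint probability distribution $p(x,y)$ of input and output under the conditional independence assumption. Then, for a given input $x$, it uses this model and Bayes' theorem to find the output $y$ with the largest posterior probability $p(y|x)$.
The training data are used to estimate $p(x|y)$ and $p(y)$, which together give the joint distribution: $p(x,y)=p(y)p(x|y)$
The probabilities can be estimated by maximum likelihood estimation or by Bayesian estimation.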

Basic assumption

The basic assumption of the naive Bayes method is conditional independence:
$$\begin{aligned} P(X=x|Y=c_{k})&=P(X^{(1)}=x^{(1)},X^{(2)}=x^{(2)},...,X^{(n)}=x^{(n)}|Y=c_{k})\\ &=\prod \limits_{j=1}^{n}P(X^{(j)}=x^{(j)}|Y=c_{k}) \end{aligned}$$
This is a strong assumption. It greatly reduces the number of conditional probabilities in the model, so learning and prediction become much simpler and the method is efficient and easy to implement. The classification performance, however, is not necessarily high.
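To make the reduction concrete, the sketch below counts the conditional-probability entries with and without the independence assumption. Without it, the full table $P(X=x|Y=c_k)$ has $K\prod_{j}S_j$ entries, where $S_j$ is the number of values feature $j$ can take; with it, only $K\sum_{j}S_j$ entries are needed. The function names and the example sizes are assumptions chosen for illustration.

```python
from math import prod

def full_conditional_param_count(num_classes, value_counts):
    """Entries of P(X = x | Y = c_k) with no independence assumption: K * prod_j S_j."""
    return num_classes * prod(value_counts)

def naive_bayes_param_count(num_classes, value_counts):
    """Entries of P(X^(j) = a_jl | Y = c_k) under conditional independence: K * sum_j S_j."""
    return num_classes * sum(value_counts)

# Hypothetical problem size: K = 2 classes, n = 10 features, S_j = 3 values each.
K = 2
S = [3] * 10
print(full_conditional_param_count(K, S))  # 2 * 3**10
print(naive_bayes_param_count(K, S))       # 2 * 30
```

The first count grows exponentially in the number of features, the second only linearly.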

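By Bayes' theorem, the posterior probability is
$$P(Y|X)=\frac {P(X,Y)}{P(X)}=\frac {P(Y)P(X|Y)}{\sum \limits_{Y}P(Y)P(X|Y)}$$
The input $x$ is assigned to the class $y$ with the largest posterior probability. Since the denominator is the same for every class, this gives
$$y=\arg\max \limits_{c_{k}}P(Y=c_{k})\prod \limits_{j=1}^{n}P(X^{(j)}=x^{(j)}|Y=c_{k})$$
Maximizing the posterior probability is equivalent to minimizing the expected risk under the 0-1 loss function.
The naive Bayes method actually learns the mechanism that generates the data, so it is a generative model.
The conditional independence assumption says that the features used for classification are independent of one another once the class is fixed. This makes naive Bayes simple, but sometimes at the cost of some classification accuracy.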
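The equivalence between posterior maximization and expected-risk minimization can be spelled out as follows. This is a sketch of the standard argument; the symbols $L$ (loss), $f$ (decision function) and $R_{\exp}$ (expected risk) are introduced here for illustration and are not defined elsewhere in these notes.

```latex
\begin{aligned}
L(Y,f(X)) &= \begin{cases} 1, & Y \neq f(X) \\ 0, & Y = f(X) \end{cases} \\
R_{\exp}(f) &= E_X \sum_{k=1}^{K} L(c_k, f(X))\, P(Y=c_k|X) \\
f(x) &= \arg\min_{y} \sum_{k=1}^{K} L(c_k, y)\, P(Y=c_k|X=x) \\
     &= \arg\min_{y} \sum_{c_k \neq y} P(Y=c_k|X=x) \\
     &= \arg\min_{y} \bigl(1 - P(Y=y|X=x)\bigr) \\
     &= \arg\max_{y} P(Y=y|X=x)
\end{aligned}
```

The expected risk is minimized by minimizing the inner sum separately for each $x$, which is why the expectation over $X$ drops out in the third line.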

Maximum likelihood estimation

The maximum likelihood estimate of the prior probability $P(Y=c_{k})$ is
$$P(Y=c_{k})=\frac{\sum \limits_{i=1}^{N}I(y_{i}=c_{k})}{N},\quad k=1,2,...,K$$
Let the set of possible values of the $j$-th feature $x^{(j)}$ be $\{a_{j1},a_{j2},...,a_{jS_j}\}$.
The maximum likelihood estimate of the conditional probability $P(X^{(j)}=a_{jl}|Y=c_{k})$ is
$$P(X^{(j)}=a_{jl}|Y=c_{k})=\frac {\sum \limits_{i=1}^{N}I(x_{i}^{(j)}=a_{jl},y_{i}=c_{k})}{\sum \limits_{i=1}^{N}I(y_{i}=c_{k})}$$
$$j=1,2,...,n;\quad l=1,2,...,S_{j};\quad k=1,2,...,K$$
Here $x_{i}^{(j)}$ is the $j$-th feature of the $i$-th sample, $a_{jl}$ is the $l$-th value that the $j$-th feature can take, and $I$ is the indicator function.
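Both estimates are just ratios of counts. The following is a minimal sketch of that counting, assuming samples are given as tuples of discrete feature values; the function name `mle_estimates`, the dictionary layout and the toy data are assumptions for illustration. `Fraction` is used so that the ratios stay exact.

```python
from collections import Counter
from fractions import Fraction

def mle_estimates(X, y):
    """Estimate naive Bayes probabilities by counting.

    Returns (prior, cond) where prior[c] = P(Y = c) and
    cond[(j, a, c)] = P(X^(j) = a | Y = c), with j a 0-based feature index.
    Combinations never seen in the data are absent from cond (their MLE is 0).
    """
    class_counts = Counter(y)
    joint_counts = Counter()
    for xi, yi in zip(X, y):
        for j, a in enumerate(xi):
            joint_counts[(j, a, yi)] += 1
    prior = {c: Fraction(cnt, len(y)) for c, cnt in class_counts.items()}
    cond = {key: Fraction(cnt, class_counts[key[2]])
            for key, cnt in joint_counts.items()}
    return prior, cond

# Hypothetical toy data: two features, labels in {+1, -1}.
X = [(1, "S"), (1, "M"), (2, "S"), (2, "M"), (2, "S")]
y = [-1, -1, 1, 1, -1]
prior, cond = mle_estimates(X, y)
print(prior[1])          # P(Y = 1)
print(cond[(0, 2, -1)])  # P(X^(1) = 2 | Y = -1)
```

Note that the pair "first feature equals 1, class +1" never occurs in this data, so its estimate is 0.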

Bayesian estimation

The naive Bayes method and Bayesian estimation are different concepts.

With maximum likelihood estimation, a probability to be estimated may come out as 0. Bayesian estimation is used to avoid this.
The Bayesian estimate of the conditional probability is
$$P_{\lambda}(X^{(j)}=a_{jl}|Y=c_{k})=\frac {\sum \limits_{i=1}^{N}I(x_{i}^{(j)}=a_{jl},y_{i}=c_{k})+\lambda}{\sum \limits_{i=1}^{N}I(y_{i}=c_{k})+S_{j}\lambda}$$
where $\lambda>0$. A common choice is $\lambda=1$, which is called Laplace smoothing. Clearly,
$$P_{\lambda}(X^{(j)}=a_{jl}|Y=c_{k})>0$$
$$\sum \limits_{l=1}^{S_{j}}P_{\lambda}(X^{(j)}=a_{jl}|Y=c_{k})=1$$
$$l=1,2,...,S_{j},\quad k=1,2,...,K$$
so the Bayesian estimate is a valid probability distribution. Likewise, the Bayesian estimate of the prior probability is
$$P_{\lambda}(Y=c_{k})=\frac{\sum \limits_{i=1}^{N}I(y_{i}=c_{k})+\lambda}{N+K\lambda},\quad k=1,2,...,K$$
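The effect of smoothing is easiest to see on a value that never occurs with a given class. The sketch below implements the two smoothed estimates directly from the formulas; the function names, argument layout and toy data are assumptions for illustration, and `lam` is assumed to be an integer or a `Fraction` so that the arithmetic stays exact.

```python
from fractions import Fraction

def smoothed_conditional(X, y, j, value, label, feature_values, lam=1):
    """Bayesian estimate of P(X^(j) = value | Y = label).

    feature_values is the full set of values feature j can take (size S_j).
    lam is the smoothing parameter lambda (int or Fraction); lam = 1 is Laplace smoothing.
    """
    class_count = sum(1 for yi in y if yi == label)
    joint_count = sum(1 for xi, yi in zip(X, y) if yi == label and xi[j] == value)
    return Fraction(joint_count + lam, class_count + len(feature_values) * lam)

def smoothed_prior(y, label, classes, lam=1):
    """Bayesian estimate of P(Y = label); len(classes) plays the role of K."""
    class_count = sum(1 for yi in y if yi == label)
    return Fraction(class_count + lam, len(y) + len(classes) * lam)

# Hypothetical toy data: two features, labels in {+1, -1}.
X = [(1, "S"), (1, "M"), (2, "S"), (2, "M"), (2, "S")]
y = [-1, -1, 1, 1, -1]
values_0 = {1, 2}  # possible values of the first feature (index 0)

# The first feature never equals 1 when Y = +1, so the MLE would be 0.
p = smoothed_conditional(X, y, 0, 1, 1, values_0)
total = sum(smoothed_conditional(X, y, 0, v, 1, values_0) for v in values_0)
print(p, total)
```

The smoothed estimate is strictly positive, and the estimates over all values of the feature still sum to 1.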

The naive Bayes algorithm

Input: training data $T=\{(x_1,y_1),(x_2,y_2),...,(x_N,y_N)\}$, where $x_{i}=(x_{i}^{(1)},x_{i}^{(2)},...,x_{i}^{(n)})$, $x_{i}^{(j)}$ is the $j$-th feature of the $i$-th sample, $x_{i}^{(j)}\in \{a_{j1},a_{j2},...,a_{jS_{j}}\}$, $a_{jl}$ is the $l$-th value that the $j$-th feature can take, $j=1,2,...,n$, $l=1,2,...,S_{j}$, and $y_{i}\in\{c_{1},c_2,...,c_K\}$; an instance $x$.

Output: the class of the instance $x$.

(1) Compute the prior probabilities and the conditional probabilities:
$$P(Y=c_{k})=\frac{\sum \limits_{i=1}^{N}I(y_{i}=c_{k})}{N},\quad k=1,2,...,K$$
$$P(X^{(j)}=a_{jl}|Y=c_{k})=\frac {\sum \limits_{i=1}^{N}I(x_{i}^{(j)}=a_{jl},y_{i}=c_{k})}{\sum \limits_{i=1}^{N}I(y_{i}=c_{k})}$$
$$j=1,2,...,n;\quad l=1,2,...,S_{j};\quad k=1,2,...,K$$
(2) For the given instance $x=(x^{(1)},x^{(2)},...,x^{(n)})$, compute
$$P(Y=c_{k})\prod \limits_{j=1}^{n}P(X^{(j)}=x^{(j)}|Y=c_{k}),\quad k=1,2,...,K$$
(3) Determine the class of the instance $x$:
$$y=\arg\max \limits_{c_{k}}P(Y=c_{k})\prod \limits_{j=1}^{n}P(X^{(j)}=x^{(j)}|Y=c_{k})$$
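The three steps can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not a reference one: the function names `fit`, `scores` and `predict`, the dictionary layout and the 15-sample toy data set are all introduced here. It uses the maximum likelihood estimates from step (1) and exact fractions, so the scores can be compared without rounding.

```python
from collections import Counter
from fractions import Fraction

def fit(X, y):
    """Step (1): estimate priors and conditional probabilities by counting."""
    class_counts = Counter(y)
    joint_counts = Counter((j, a, yi) for xi, yi in zip(X, y)
                           for j, a in enumerate(xi))
    prior = {c: Fraction(cnt, len(y)) for c, cnt in class_counts.items()}
    cond = {k: Fraction(cnt, class_counts[k[2]]) for k, cnt in joint_counts.items()}
    return prior, cond

def scores(x, prior, cond):
    """Step (2): P(Y = c_k) * prod_j P(X^(j) = x^(j) | Y = c_k) for every class."""
    result = {}
    for c, p in prior.items():
        for j, a in enumerate(x):
            p *= cond.get((j, a, c), Fraction(0))  # unseen combination: MLE is 0
        result[c] = p
    return result

def predict(x, prior, cond):
    """Step (3): return the class with the largest score."""
    s = scores(x, prior, cond)
    return max(s, key=s.get)

# Hypothetical toy data: feature 1 in {1, 2, 3}, feature 2 in {S, M, L}, labels in {+1, -1}.
X1 = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3]
X2 = list("SMMSSSMMLLLMMLL")
y = [-1, -1, 1, 1, -1, -1, -1, 1, 1, 1, 1, 1, 1, 1, -1]
X = list(zip(X1, X2))

prior, cond = fit(X, y)
s = scores((2, "S"), prior, cond)
print(s[1], s[-1])                     # 1/45 1/15
print(predict((2, "S"), prior, cond))  # -1
```

For long feature vectors a practical implementation would usually sum log-probabilities instead of multiplying, and would use the smoothed estimates so that no factor is exactly 0.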

Exercise 4.1

Exercise: Use maximum likelihood estimation to derive the probability estimation formulas of naive Bayes:
$$P(Y=c_{k})=\frac{\sum \limits_{i=1}^{N}I(y_{i}=c_{k})}{N},\quad k=1,2,...,K$$
$$P(X^{(j)}=a_{jl}|Y=c_{k})=\frac {\sum \limits_{i=1}^{N}I(x_{i}^{(j)}=a_{jl},y_{i}=c_{k})}{\sum \limits_{i=1}^{N}I(y_{i}=c_{k})}$$

Solution: Treat $P(Y=c_{k})$ and $P(X^{(j)}=a_{jl}|Y=c_{k})$ as parameters, and solve for them subject to the constraint $\sum \limits_{k=1}^{K}P(y=c_k)=1$.

From the assumptions:
$$P(y)=\prod \limits_{k=1}^{K}P(y=c_{k})^{I(y=c_{k})}$$
$$P(x|y=c_k)=\prod \limits_{j=1}^{n}P(x^{(j)}|y=c_{k})=\prod \limits_{j=1}^{n} \prod \limits_{l=1}^{S_j}P(x^{(j)}=a_{jl}|y=c_{k})^{I(x^{(j)}=a_{jl},y=c_k)}$$
Let $\varphi = \{P(Y=c_{k}),P(X^{(j)}=a_{jl}|Y=c_{k})\}$. The log-likelihood function is:
$$\begin{aligned} L(\varphi)&=\log\prod \limits_{i=1}^{N}P(x_i,y_i;\varphi)=\log\prod \limits_{i=1}^{N}P(x_i|y_i;\varphi)P(y_{i};\varphi)\\ &=\log\prod \limits_{i=1}^{N} \Big[P(y_{i};\varphi)\prod \limits_{j=1}^{n}P(x_i^{(j)}|y_i;\varphi)\Big]\\ &=\sum \limits_{i=1}^{N} \Big[\log P(y_{i};\varphi) + \sum \limits_{j=1}^{n}\log P(x_i^{(j)}|y_i;\varphi)\Big]\\ &=\sum \limits_{i=1}^{N} \Big[\sum \limits_{k=1}^{K}\log P(y=c_k)^{I(y_i=c_k)} + \sum \limits_{j=1}^{n} \sum \limits_{l=1}^{S_j}\sum \limits_{k=1}^{K}\log P(x^{(j)}=a_{jl}|y=c_k)^{I(x_i^{(j)}=a_{jl},y_i=c_k)}\Big]\\ &=\sum \limits_{i=1}^{N} \Big[\sum \limits_{k=1}^{K}I(y_i=c_k)\log P(y=c_k) + \sum \limits_{j=1}^{n} \sum \limits_{l=1}^{S_j}\sum \limits_{k=1}^{K}I(x_i^{(j)}=a_{jl},y_i=c_k)\log P(x^{(j)}=a_{jl}|y=c_k)\Big] \end{aligned}$$
Differentiate with respect to the first parameter $P(Y=c_{k})$. Only the first sum depends on it:
$$\frac {\partial {L(\varphi)}}{\partial P(y=c_k)}=\frac {\partial}{\partial P(y=c_k)}\sum \limits_{i=1}^{N}\sum \limits_{k=1}^{K}I(y_i=c_k)\log P(y=c_k)$$
From the constraint: $P(y=c_K)=1-\sum \limits_{k=1}^{K-1}P(y=c_k)$
$$\begin{aligned} \Rightarrow\frac {\partial {L(\varphi)}}{\partial P(y=c_k)}&=\frac {\partial}{\partial P(y=c_k)}\sum \limits_{i=1}^{N}\Big[\sum \limits_{k=1}^{K-1}I(y_i=c_k)\log P(y=c_k)+I(y_i=c_K)\log P(y=c_K)\Big]\\ &=\frac {\partial}{\partial P(y=c_k)}\sum \limits_{i=1}^{N}\Big[\sum \limits_{k=1}^{K-1}I(y_i=c_k)\log P(y=c_k)+I(y_i=c_K)\log\Big(1-\sum \limits_{k=1}^{K-1}P(y=c_k)\Big)\Big] \end{aligned}$$
First find the estimate of $P(y=c_1)$ by setting the derivative to zero:
$$\begin{aligned} 0&=\frac {\partial}{\partial P(y=c_1)}\sum \limits_{i=1}^{N}\Big[\sum \limits_{k=1}^{K-1}I(y_i=c_k)\log P(y=c_k)+I(y_i=c_K)\log\Big(1-\sum \limits_{k=1}^{K-1}P(y=c_k)\Big)\Big]\\ &=\sum \limits_{i=1}^{N}\Big[\frac{I(y_i=c_1)}{P(y=c_1)}-\frac{I(y_i=c_K)}{1-\sum\limits_{a=1}^{K-1}P(y=c_a)}\Big]\\ &=\sum \limits_{i=1}^{N}\Big[\frac{I(y_i=c_1)}{P(y=c_1)}-\frac{I(y_i=c_K)}{P(y=c_K)}\Big] \end{aligned}$$

Here $P(y=c_K)$ is a value determined by $P(y=c_1),P(y=c_2),...,P(y=c_{K-1})$.
$$\sum \limits_{i=1}^{N}\Big[\frac{I(y_i=c_1)}{P(y=c_1)}-\frac{I(y_i=c_K)}{P(y=c_K)}\Big]=0$$
$$\Rightarrow P(y=c_K)\sum \limits_{i=1}^{N}I(y_i=c_1)-P(y=c_1)\sum \limits_{i=1}^{N}I(y_i=c_K)=0$$
Applying the same step to $c_2,...,c_{K-1}$ (the last line below is an identity) gives:
$$\begin{aligned} P(y=c_1) &= \frac {\sum \limits_{i=1}^{N}I(y_i=c_1)}{\sum \limits_{i=1}^{N}I(y_i=c_K)} P(y=c_K)\\ P(y=c_2) &= \frac {\sum \limits_{i=1}^{N}I(y_i=c_2)}{\sum \limits_{i=1}^{N}I(y_i=c_K)} P(y=c_K)\\ &...... \\ P(y=c_K) &= \frac {\sum \limits_{i=1}^{N}I(y_i=c_K)}{\sum \limits_{i=1}^{N}I(y_i=c_K)} P(y=c_K) \end{aligned}$$

Summing the equations above for $P(y=c_1),P(y=c_2),...,P(y=c_K)$ gives:
$$P(y=c_1)+P(y=c_2)+...+P(y=c_K)=\frac{N}{\sum \limits_{i=1}^{N}I(y_i=c_K)} P(y=c_K)$$
$$\Rightarrow 1=\frac{N}{\sum \limits_{i=1}^{N}I(y_i=c_K)} P(y=c_K)$$
$$\Rightarrow P(y=c_K)=\frac{\sum \limits_{i=1}^{N}I(y_i=c_K)} {N}$$
Substituting this back into the equations above gives, for every class:
$$P(y=c_k)=\frac{\sum \limits_{i=1}^{N}I(y_i=c_k)} {N},\quad k=1,2,...,K$$
In the same way, differentiating with respect to $P(X^{(j)}=a_{jl}|Y=c_{k})$ gives
$$P(X^{(j)}=a_{jl}|Y=c_{k})=\frac {\sum \limits_{i=1}^{N}I(x_{i}^{(j)}=a_{jl},y_{i}=c_{k})}{\sum \limits_{i=1}^{N}I(y_{i}=c_{k})}$$
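The step for the conditional probabilities is only stated, so here is one way to fill it in. The sketch below uses a Lagrange multiplier rather than the substitution used for the prior above; both approaches lead to the same result. The shorthand $\theta_{jlk}$, $n_{jlk}$ and the multiplier $\mu$ are introduced here for illustration. For fixed $j$ and $k$, the constraint is that the conditional probabilities sum to 1 over $l$.

```latex
\begin{aligned}
\theta_{jlk} &:= P(X^{(j)}=a_{jl}|Y=c_k), \qquad
n_{jlk} := \sum_{i=1}^{N} I(x_i^{(j)}=a_{jl},\, y_i=c_k) \\
\mathcal{L}(\theta,\mu) &= \sum_{l=1}^{S_j} n_{jlk}\log\theta_{jlk}
  + \mu\Bigl(1-\sum_{l=1}^{S_j}\theta_{jlk}\Bigr) \\
\frac{\partial \mathcal{L}}{\partial \theta_{jlk}} &= \frac{n_{jlk}}{\theta_{jlk}} - \mu = 0
  \;\Rightarrow\; \theta_{jlk} = \frac{n_{jlk}}{\mu} \\
\sum_{l=1}^{S_j}\theta_{jlk} = 1 &\;\Rightarrow\;
  \mu = \sum_{l=1}^{S_j} n_{jlk} = \sum_{i=1}^{N} I(y_i=c_k) \\
\theta_{jlk} &= \frac{\sum_{i=1}^{N} I(x_i^{(j)}=a_{jl},\, y_i=c_k)}{\sum_{i=1}^{N} I(y_i=c_k)}
\end{aligned}
```

The fourth line uses the fact that every sample with $y_i=c_k$ takes exactly one of the values $a_{j1},...,a_{jS_j}$ in feature $j$, so summing $n_{jlk}$ over $l$ counts each such sample once.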
