Generative model
I. Principle
Model
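The naive Bayes method makes classification predictions by combining Bayes' theorem with a joint probability model learned from the training data.
By Bayes' theorem, the posterior probability is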
$$P(Y \mid X)=\frac{P(X,Y)}{P(X)}=\frac{P(Y)P(X \mid Y)}{\sum\limits_{Y}P(Y)P(X \mid Y)}\tag{*}$$
that is,
$$P(Y=c_k \mid X=x)=\frac{P(Y=c_k)P(X=x \mid Y=c_k)}{\sum\limits_{k}P(Y=c_k)P(X=x \mid Y=c_k)}\tag{A}$$
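To make formula (A) concrete, here is a minimal Python sketch that normalizes prior times likelihood over all classes. The function name `posterior` and the numbers are hypothetical and chosen only for illustration; they do not come from the text.

```python
def posterior(prior, likelihood):
    """Return P(Y=c_k | X=x) for every class c_k, following formula (A).

    prior[c]      : P(Y=c)
    likelihood[c] : P(X=x | Y=c) for one fixed observation x
    """
    # Numerator of (A): the joint probability P(X=x, Y=c) for each class.
    joint = {c: prior[c] * likelihood[c] for c in prior}
    # Denominator of (A): P(X=x), the sum of the joint over all classes.
    evidence = sum(joint.values())
    return {c: joint[c] / evidence for c in joint}


# Illustrative numbers only (assumed, not taken from the text).
prior = {"c1": 0.6, "c2": 0.4}
likelihood = {"c1": 0.1, "c2": 0.3}
post = posterior(prior, likelihood)
```

The denominator is the same for every class, so it only rescales the numerators so that the posteriors sum to 1.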
Joint probability distribution: $P(X,Y)=P(Y)P(X \mid Y)$, obtained as the product of the prior probability and the conditional probability.
Prior probability distribution: $P(Y=c_k)$
Conditional probability distribution: $P(X=x \mid Y=c_k)=P(X^{(1)}=x^{(1)},\cdots,X^{(n)}=x^{(n)} \mid Y=c_k)$
Under the conditional independence assumption of naive Bayes (the features are independent of one another given the class), the conditional probability simplifies to $P(X=x \mid Y=c_k)=\prod\limits_{j=1}^{n}P(X^{(j)}=x^{(j)} \mid Y=c_k)$
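The factorization can be sketched in a few lines of Python. The per-feature tables, the feature values, and the function name `class_conditional` are assumptions made up for this example.

```python
import math


def class_conditional(x, feature_tables):
    """P(X=x | Y=c) under the conditional independence assumption.

    x                    : tuple of feature values (x^(1), ..., x^(n))
    feature_tables[j][v] : P(X^(j)=v | Y=c) for one fixed class c
    """
    # Product over features j of P(X^(j)=x^(j) | Y=c).
    return math.prod(feature_tables[j][v] for j, v in enumerate(x))


# Hypothetical per-feature conditional tables for a single class.
tables_c1 = [
    {"a": 0.5, "b": 0.5},  # P(X^(1)=. | Y=c1)
    {"s": 0.2, "m": 0.8},  # P(X^(2)=. | Y=c1)
]
p = class_conditional(("a", "m"), tables_c1)  # 0.5 * 0.8
```

Each feature needs only its own small table per class, instead of one table over every combination of feature values.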
Substituting this into formula (A) gives
$$P(Y=c_k \mid X=x)=\frac{P(Y=c_k)\prod\limits_{j=1}^{n}P(X^{(j)}=x^{(j)} \mid Y=c_k)}{\sum\limits_{k}P(Y=c_k)\prod\limits_{j=1}^{n}P(X^{(j)}=x^{(j)} \mid Y=c_k)}\tag{B}$$
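As a rough sketch of how formula (B) might be evaluated, the following Python computes the posterior for every class from a prior table and per-feature conditional tables. All names and probabilities here are hypothetical and serve only as an illustration; for many features, a practical implementation would likely sum log-probabilities instead of multiplying, to avoid numerical underflow.

```python
import math


def naive_bayes_posterior(x, priors, cond):
    """Evaluate formula (B) for every class.

    priors[c]     : P(Y=c)
    cond[c][j][v] : P(X^(j)=v | Y=c)
    """
    # Numerator of (B): prior times the product of per-feature conditionals.
    scores = {
        c: priors[c] * math.prod(cond[c][j][v] for j, v in enumerate(x))
        for c in priors
    }
    # Denominator of (B): sum of the numerators over all classes.
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


# Hypothetical model with two classes and two discrete features.
priors = {"c1": 0.5, "c2": 0.5}
cond = {
    "c1": [{"a": 0.8, "b": 0.2}, {"s": 0.5, "m": 0.5}],
    "c2": [{"a": 0.4, "b": 0.6}, {"s": 0.25, "m": 0.75}],
}
post_b = naive_bayes_posterior(("a", "s"), priors, cond)
```

For the observation `("a", "s")` the numerators are 0.5 × 0.8 × 0.5 = 0.2 for `c1` and 0.5 × 0.4 × 0.25 = 0.05 for `c2`, so the posteriors are 0.8 and 0.2.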