# Naive Bayes: The Best Example and Derivation

http://zh.wikipedia.org/wiki/%E6%9C%B4%E7%B4%A0%E8%B4%9D%E5%8F%B6%E6%96%AF%E5%88%86%E7%B1%BB%E5%99%A8

## The naive Bayes probabilistic model

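The classifier is a conditional probability model over a class variable $C$ given the feature variables $F_1,\dots,F_n$:

$p(C \vert F_1,\dots,F_n)\,$

By Bayes' theorem, this can be written as

$p(C \vert F_1,\dots,F_n) = \frac{p(C) \ p(F_1,\dots,F_n\vert C)}{p(F_1,\dots,F_n)}. \,$

or, in words,

$\mbox{posterior} = \frac{\mbox{prior} \times \mbox{likelihood}}{\mbox{evidence}}. \,$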
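As a minimal numeric sketch of this formula, the snippet below applies Bayes' rule to two classes and a single observed feature value. The function name `posterior`, the class labels, and the prior and likelihood numbers are all made up for illustration; none of them come from the source.

```python
def posterior(priors, likelihoods):
    """Return p(C | F) for every class C using Bayes' rule.

    priors[c]      -- p(C = c)
    likelihoods[c] -- p(F | C = c) for the observed feature value F
    """
    # Numerator of Bayes' rule for each class: prior * likelihood.
    numerators = {c: priors[c] * likelihoods[c] for c in priors}
    # The evidence p(F) is the sum of the numerators over all classes.
    evidence = sum(numerators.values())
    return {c: numerators[c] / evidence for c in numerators}


# Hypothetical numbers, chosen only to illustrate the arithmetic.
example = posterior(priors={"a": 0.3, "b": 0.7},
                    likelihoods={"a": 0.8, "b": 0.1})
print(example)
```

Although class `a` has the smaller prior here, its much larger likelihood gives it the larger posterior.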

The denominator does not depend on $C$, so only the numerator matters. Applying the chain rule of conditional probability repeatedly gives

\begin{align}p(C \vert F_1, \dots, F_n) & \varpropto p(C) \ p(F_1,\dots,F_n\vert C) \\
& \varpropto p(C) \ p(F_1\vert C) \ p(F_2,\dots,F_n\vert C, F_1) \\
& \varpropto p(C) \ p(F_1\vert C) \ p(F_2\vert C, F_1) \ p(F_3,\dots,F_n\vert C, F_1, F_2) \\
& \varpropto p(C) \ p(F_1\vert C) \ p(F_2\vert C, F_1) \ p(F_3\vert C, F_1, F_2) \ p(F_4,\dots,F_n\vert C, F_1, F_2, F_3) \\
& \varpropto p(C) \ p(F_1\vert C) \ p(F_2\vert C, F_1) \ p(F_3\vert C, F_1, F_2) \ \cdots \ p(F_n\vert C, F_1, F_2, F_3,\dots,F_{n-1}).\end{align}

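The "naive" conditional-independence assumption states that, given the class $C$, each feature $F_i$ is independent of every other feature $F_j$ with $j \neq i$:

$p(F_i \vert C, F_j) = p(F_i \vert C)\,$

Under this assumption the model simplifies to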

\begin{align}p(C \vert F_1, \dots, F_n) & \varpropto p(C) \ p(F_1\vert C) \ p(F_2\vert C) \ p(F_3\vert C) \ \cdots\, \\& \varpropto p(C) \prod_{i=1}^n p(F_i \vert C).\,\end{align}

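Normalizing gives

$p(C \vert F_1,\dots,F_n) = \frac{1}{Z} p(C) \prod_{i=1}^n p(F_i \vert C)$

where $Z = p(F_1,\dots,F_n)$ is the evidence, a scaling factor that depends only on the feature values and is therefore constant once they are known.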
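The normalized product form can be sketched in a few lines of Python. This version works in log space, a common precaution against underflow when many small factors are multiplied; that choice, the function name `naive_bayes_posterior`, and the example numbers are assumptions made for illustration, not something the source prescribes.

```python
import math


def naive_bayes_posterior(log_priors, log_likelihoods):
    """Return p(C | F_1..F_n) for every class under the naive assumption.

    log_priors[c]      -- log p(C = c)
    log_likelihoods[c] -- list of log p(F_i | C = c), one entry per feature
    """
    # Unnormalized log posterior: log p(C) + sum_i log p(F_i | C).
    scores = {c: log_priors[c] + sum(log_likelihoods[c]) for c in log_priors}
    # log Z via the log-sum-exp trick (subtract the maximum before exp).
    top = max(scores.values())
    log_z = top + math.log(sum(math.exp(s - top) for s in scores.values()))
    return {c: math.exp(s - log_z) for c, s in scores.items()}


# Hypothetical two-class, two-feature example.
result = naive_bayes_posterior(
    log_priors={"a": math.log(0.5), "b": math.log(0.5)},
    log_likelihoods={
        "a": [math.log(0.2), math.log(0.5)],
        "b": [math.log(0.1), math.log(0.25)],
    },
)
print(result)
```

Dividing by $Z$ only rescales the scores so that they sum to 1; it never changes which class has the largest posterior.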

## Example

### Sex classification

#### Training

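| Sex | Height (feet) | Weight (lbs) | Foot size (inches) |
|---|---|---|---|
| male | 6 | 180 | 12 |
| male | 5.92 (5'11") | 190 | 11 |
| male | 5.58 (5'7") | 170 | 12 |
| male | 5.92 (5'11") | 165 | 10 |
| female | 5 | 100 | 6 |
| female | 5.5 (5'6") | 150 | 8 |
| female | 5.42 (5'5") | 130 | 7 |
| female | 5.75 (5'9") | 150 | 9 |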
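The Gaussian parameters used in the test step can be estimated from these eight rows. The sketch below assumes that the first four rows are the male samples and the last four the female samples (this grouping reproduces the male height mean of 5.855 quoted later) and that the variance is the unbiased sample variance, which is what `statistics.variance` computes. The names `TRAINING`, `FEATURES`, and `fit_gaussians` are illustrative.

```python
from statistics import mean, variance

FEATURES = ("height", "weight", "foot_size")

# (height in feet, weight, foot size); class grouping is an assumption.
TRAINING = {
    "male": [(6.0, 180, 12), (5.92, 190, 11), (5.58, 170, 12), (5.92, 165, 10)],
    "female": [(5.0, 100, 6), (5.5, 150, 8), (5.42, 130, 7), (5.75, 150, 9)],
}


def fit_gaussians(training):
    """Return {class: {feature: (mean, sample variance)}}."""
    params = {}
    for label, rows in training.items():
        columns = zip(*rows)  # transpose rows into one tuple per feature
        params[label] = {
            name: (mean(col), variance(col))
            for name, col in zip(FEATURES, columns)
        }
    return params


params = fit_gaussians(TRAINING)
for label, stats in params.items():
    for name, (mu, var) in stats.items():
        print(f"{label:6s} {name:9s} mean={mu:.4f} variance={var:.4e}")
```

Using the population variance (dividing by $n$ instead of $n-1$) would give slightly different densities, so the choice of estimator matters when checking the numbers by hand.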

#### Testing

| Sex | Height (feet) | Weight (lbs) | Foot size (inches) |
|---|---|---|---|
| sample | 6 | 130 | 8 |

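$\mbox{posterior}(\mbox{male}) = \frac{P(\mbox{male}) \, p(\mbox{height} | \mbox{male}) \, p(\mbox{weight} | \mbox{male}) \, p(\mbox{foot size} | \mbox{male})}{\mbox{evidence}}$

$\mbox{posterior}(\mbox{female}) = \frac{P(\mbox{female}) \, p(\mbox{height} | \mbox{female}) \, p(\mbox{weight} | \mbox{female}) \, p(\mbox{foot size} | \mbox{female})}{\mbox{evidence}}$

$\mbox{evidence} = P(\mbox{male}) \, p(\mbox{height} | \mbox{male}) \, p(\mbox{weight} | \mbox{male}) \, p(\mbox{foot size} | \mbox{male}) + P(\mbox{female}) \, p(\mbox{height} | \mbox{female}) \, p(\mbox{weight} | \mbox{female}) \, p(\mbox{foot size} | \mbox{female})$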

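$P(\mbox{male}) = 0.5$

$p(\mbox{height} | \mbox{male}) = \frac{1}{\sqrt{2\pi \sigma^2}}\exp\left(\frac{-(6-\mu)^2}{2\sigma^2}\right) \approx 1.5789$, where $\mu = 5.855$ and $\sigma^2 = 3.5033 \times 10^{-2}$ are the parameters of the normal distribution estimated from the training set (the sample mean and sample variance of the male heights). Note that a value greater than 1 is allowed here: this is a probability density rather than a probability, because height is a continuous variable.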
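The density value can be checked with a short sketch. The helper name `gaussian_pdf` is illustrative; the parameters are the ones quoted above.

```python
import math


def gaussian_pdf(x, mu, var):
    """Density of a normal distribution with mean mu and variance var at x."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


# Height of the test sample evaluated under the male height distribution.
density = gaussian_pdf(6.0, mu=5.855, var=3.5033e-2)
print(f"p(height | male) = {density:.4f}")
```

The result should agree with the quoted 1.5789 up to rounding. A small variance concentrates the density near the mean, which is why it can exceed 1.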

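$p(\mbox{weight} | \mbox{male}) = 5.9881 \times 10^{-6}$

$p(\mbox{foot size} | \mbox{male}) = 1.3112 \times 10^{-3}$

$\mbox{posterior numerator}(\mbox{male}) = 6.1984 \times 10^{-9}$

$P(\mbox{female}) = 0.5$

$p(\mbox{height} | \mbox{female}) = 2.2346 \times 10^{-1}$

$p(\mbox{weight} | \mbox{female}) = 1.6789 \times 10^{-2}$

$p(\mbox{foot size} | \mbox{female}) = 2.8669 \times 10^{-1}$

$\mbox{posterior numerator}(\mbox{female}) = 5.3778 \times 10^{-4}$

The evidence is the same for both classes, so comparing the numerators is enough. The posterior numerator is larger for the female class, so the sample is predicted to be female.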
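The whole worked example can be reproduced end to end with the sketch below. It assumes the first four training rows are male and the last four female, uses the unbiased sample variance, and takes the class priors from the class frequencies (which gives the 0.5 stated above). All identifiers are illustrative.

```python
import math
from statistics import mean, variance

# (height in feet, weight, foot size); class grouping is an assumption.
TRAINING = {
    "male": [(6.0, 180, 12), (5.92, 190, 11), (5.58, 170, 12), (5.92, 165, 10)],
    "female": [(5.0, 100, 6), (5.5, 150, 8), (5.42, 130, 7), (5.75, 150, 9)],
}
SAMPLE = (6.0, 130, 8)


def gaussian_pdf(x, mu, var):
    """Density of a normal distribution with mean mu and variance var at x."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def posterior_numerator(sample, rows, prior):
    """prior * product of per-feature Gaussian densities for one class."""
    score = prior
    for value, column in zip(sample, zip(*rows)):
        score *= gaussian_pdf(value, mean(column), variance(column))
    return score


total = sum(len(rows) for rows in TRAINING.values())
numerators = {
    label: posterior_numerator(SAMPLE, rows, len(rows) / total)
    for label, rows in TRAINING.items()
}
evidence = sum(numerators.values())
posteriors = {label: n / evidence for label, n in numerators.items()}
prediction = max(posteriors, key=posteriors.get)

for label in numerators:
    print(f"{label:6s} numerator={numerators[label]:.4e} "
          f"posterior={posteriors[label]:.6f}")
print("prediction:", prediction)
```

The male numerator is about five orders of magnitude smaller than the female one, driven mostly by the weight and foot-size densities, so the normalized female posterior is very close to 1.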