Classification
- Credit scoring: lend or not
- Medical diagnosis: which disease
- Handwriting recognition: which character
- Face recognition: which person
How
Can we just use regression for classification? It is a poor fit: outliers drag the regression result away from the true boundary.
From a classification standpoint the green line is the correct boundary, but the regression criterion, minimizing squared error, produces the purple line instead.
Generative Model
Pokemon example:
Training set: 79 Water-type, 61 Normal-type
$P(C_1) = 79/140 = 0.56, \quad P(C_2) = 61/140 = 0.44$
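The priors are just the class counts normalized; a minimal sketch using the counts from the notes:

```python
# Class priors from counts: 79 Water-type vs. 61 Normal-type (from the notes)
n1, n2 = 79, 61
p_c1 = n1 / (n1 + n2)
p_c2 = n2 / (n1 + n2)
print(round(p_c1, 2), round(p_c2, 2))  # 0.56 0.44
```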
Gaussian distribution
$$f_{\mu, \Sigma}(x) = \frac{1}{(2\pi)^{D/2}} \frac{1}{|\Sigma|^{1/2}} \exp\left\{-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right\}$$
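The density above can be evaluated directly with NumPy; a minimal sketch (the function name `gaussian_pdf` is ours, not from the notes):

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Evaluate the D-dimensional Gaussian density f_{mu,Sigma}(x)."""
    d = len(mu)
    diff = x - mu
    norm = (2 * np.pi) ** (d / 2) * np.linalg.det(sigma) ** 0.5
    return np.exp(-0.5 * diff @ np.linalg.inv(sigma) @ diff) / norm

# At the mean of a standard 1-D Gaussian the density is 1/sqrt(2*pi) ~ 0.3989
print(gaussian_pdf(np.array([0.0]), np.array([0.0]), np.eye(1)))
```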
Estimate the mean and covariance from the data and build the Gaussian via maximum likelihood estimation.
Solve for
$$\mu^{*}, \Sigma^{*} = \arg\max_{\mu, \Sigma} L(\mu, \Sigma)$$
$$\mu^{*} = \frac{1}{79} \sum_{n=1}^{79} x^{n} \qquad \Sigma^{*} = \frac{1}{79} \sum_{n=1}^{79}\left(x^{n}-\mu^{*}\right)\left(x^{n}-\mu^{*}\right)^{T}$$
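These closed-form estimates are just the sample mean and the (biased) sample covariance; a sketch with synthetic data standing in for the 79 Water-type feature vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(79, 2))  # synthetic stand-in for the 79 training vectors

mu_star = X.mean(axis=0)             # mu* = (1/79) * sum_n x^n
diff = X - mu_star
sigma_star = diff.T @ diff / len(X)  # Sigma* = (1/79) * sum_n (x^n - mu*)(x^n - mu*)^T

# Matches NumPy's biased covariance estimator
print(np.allclose(sigma_star, np.cov(X, rowvar=False, bias=True)))  # True
```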
Then plug a new point $x$ into the fitted model.
Probability from Class: $P(x \mid C_1)$
$$P(C_{1} \mid x) = \frac{P(x \mid C_{1})\, P(C_{1})}{P(x \mid C_{1})\, P(C_{1}) + P(x \mid C_{2})\, P(C_{2})}$$
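Bayes' rule above can be applied directly once the two class-conditional Gaussians are fitted; a 1-D sketch, with illustrative means and variances that are not from the notes:

```python
import numpy as np

def gauss1d(x, mu, var):
    """1-D Gaussian density."""
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

def posterior_c1(x, p1=0.56, p2=0.44):
    # Illustrative class-conditional densities; the means/variances are made up.
    a = gauss1d(x, mu=0.0, var=1.0) * p1   # P(x|C1) P(C1)
    b = gauss1d(x, mu=2.0, var=1.0) * p2   # P(x|C2) P(C2)
    return a / (a + b)

# Far to the left of both means, class C1 (mean 0) dominates.
print(posterior_c1(-3.0) > 0.99)  # True
```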
Model refinement: share one covariance matrix across the two classes (a count-weighted combination). This reduces the number of parameters and therefore overfitting.
The likelihood to maximize is now $L(\mu^{1}, \mu^{2}, \Sigma)$. The means are estimated exactly as before; the covariance is a weighted combination:
$\mu^{1}$ and $\mu^{2}$ are the same as before, and
$$\Sigma = \frac{79}{140}\Sigma^{1} + \frac{61}{140}\Sigma^{2}$$
With the shared covariance, the decision boundary becomes linear.
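The count-weighted covariance is a one-liner; the per-class covariances below are hypothetical placeholders:

```python
import numpy as np

sigma1 = np.array([[2.0, 0.3], [0.3, 1.0]])  # hypothetical covariance of the 79 Water-type
sigma2 = np.array([[1.0, 0.0], [0.0, 1.5]])  # hypothetical covariance of the 61 Normal-type
n1, n2 = 79, 61

# Sigma = (79/140) Sigma1 + (61/140) Sigma2
sigma = (n1 * sigma1 + n2 * sigma2) / (n1 + n2)
print(sigma)
```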
Summary
The three-step recipe:
- Model (a probability distribution)
- Evaluation: find the mean and covariance that maximize the likelihood
- Find the best function
If every feature (dimension) of a sample is assumed independent, this is naive Bayes.
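Under the independence assumption the class-conditional likelihood factorizes into per-dimension 1-D Gaussians; a minimal sketch (the function name is ours):

```python
import numpy as np

def naive_likelihood(x, mu, var):
    """Naive Bayes: P(x|C) = product over dimensions of 1-D Gaussian densities."""
    per_dim = np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    return per_dim.prod()

# At the mean with unit variances, each factor is 1/sqrt(2*pi),
# so the 2-D product is 1/(2*pi).
print(naive_likelihood(np.zeros(2), np.zeros(2), np.ones(2)))
```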
The posterior can be written as $P(C_{1} \mid x) = \sigma(z)$, where
$$\begin{aligned} z = & \ln \frac{\left|\Sigma^{2}\right|^{1/2}}{\left|\Sigma^{1}\right|^{1/2}} - \frac{1}{2} x^{T}\left(\Sigma^{1}\right)^{-1} x + \left(\mu^{1}\right)^{T}\left(\Sigma^{1}\right)^{-1} x - \frac{1}{2}\left(\mu^{1}\right)^{T}\left(\Sigma^{1}\right)^{-1} \mu^{1} \\ & + \frac{1}{2} x^{T}\left(\Sigma^{2}\right)^{-1} x - \left(\mu^{2}\right)^{T}\left(\Sigma^{2}\right)^{-1} x + \frac{1}{2}\left(\mu^{2}\right)^{T}\left(\Sigma^{2}\right)^{-1} \mu^{2} + \ln \frac{N_{1}}{N_{2}} \end{aligned}$$
When the covariance matrices are shared ($\Sigma^{1} = \Sigma^{2} = \Sigma$), the quadratic terms cancel and
$$z = \underbrace{\left(\mu^{1}-\mu^{2}\right)^{T} \Sigma^{-1}}_{w^{T}} x \underbrace{- \frac{1}{2}\left(\mu^{1}\right)^{T} \Sigma^{-1} \mu^{1} + \frac{1}{2}\left(\mu^{2}\right)^{T} \Sigma^{-1} \mu^{2} + \ln \frac{N_{1}}{N_{2}}}_{b}$$
so $P(C_{1} \mid x) = \sigma(w \cdot x + b)$.
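The identification of $w$ and $b$ can be checked numerically: with a shared covariance, the sigmoid of the linear score reproduces the Bayes posterior exactly. The parameters below are illustrative, not from the notes:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Illustrative shared-covariance Gaussians for the two classes
mu1, mu2 = np.array([1.0, 0.0]), np.array([-1.0, 1.0])
sigma = np.array([[1.0, 0.2], [0.2, 2.0]])
n1, n2 = 79, 61
inv = np.linalg.inv(sigma)

w = inv @ (mu1 - mu2)
b = -0.5 * mu1 @ inv @ mu1 + 0.5 * mu2 @ inv @ mu2 + np.log(n1 / n2)

def bayes_posterior(x):
    # Shared normalizer (2*pi)^{D/2} |Sigma|^{1/2} cancels in the ratio.
    def unnorm(mu):
        d = x - mu
        return np.exp(-0.5 * d @ inv @ d)
    a, c = unnorm(mu1) * n1, unnorm(mu2) * n2
    return a / (a + c)

x = np.array([0.3, -0.7])
print(np.isclose(sigmoid(w @ x + b), bayes_posterior(x)))  # True
```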
This leads naturally to logistic regression.