【NTU_ML_Spring2020】3.Classification

National Taiwan University, Machine Learning (Hung-yi Lee), Spring 2020

Classification models: probabilistic generative model, naive Bayes, logistic regression, multiclass logistic regression, cross-entropy loss, softmax function

These are notes I compiled while studying the course; I wrote them in English to practice the language as well. If you find any errors, corrections are appreciated. Discussion is welcome.

Please do not repost without permission.

Classification

Classification as Regression?

Limitations


If a regression method is used to perform classification, it penalizes examples that are "too correct", because the loss sums the distances of all samples to their targets.
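As a toy illustration (not from the lecture slides), with class targets $+1/-1$, a squared-error regression loss charges a large penalty to a point that lies far on the correct side of the boundary:

```python
def squared_error(g_x, y):
    """Regression loss for one example with class target y in {+1, -1}."""
    return (g_x - y) ** 2

# Both points are on the correct side (g(x) > 0, y = +1), but the
# far-away, "too correct" point incurs a much larger loss.
barely_correct = squared_error(1.0, 1.0)    # loss 0.0
too_correct = squared_error(10.0, 1.0)      # loss 81.0
print(barely_correct, too_correct)
```

This is why minimizing the regression loss can shift the decision boundary toward the "too correct" examples and misclassify points near the boundary.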

Ideal Alternatives

  • Function (Model)
    $$f(x) = \begin{cases} 1, & g(x) > 0 \\ 0, & \text{otherwise} \end{cases}$$

  • Loss function

    Represents the number of times $f(x)$ gives an incorrect result on the training data.
    $$L(f) = \sum_n \delta(f(x^n) \ne \hat{y}^n)$$

  • Example

    Perceptron, SVM
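The ideal model and its 0/1 loss above can be sketched directly (a minimal sketch; the linear $g$ and the data are my own made-up example):

```python
def f(x, g):
    """Ideal classifier: predict class 1 if g(x) > 0, else class 0."""
    return 1 if g(x) > 0 else 0

def zero_one_loss(xs, ys, g):
    """L(f): count the training examples where f(x) != y."""
    return sum(f(x, g) != y for x, y in zip(xs, ys))

# Hypothetical linear g(x) = x - 2 on four 1-D examples.
g = lambda x: x - 2
xs = [1.0, 3.0, 5.0, 0.5]
ys = [0, 1, 1, 1]                 # the last label disagrees with g
print(zero_one_loss(xs, ys, g))   # -> 1
```

Because this loss is a count, it is not differentiable, which is why it cannot be minimized by gradient descent and methods like the perceptron and SVM use surrogate losses instead.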

Generative Model

It’s a probabilistic generative model.

Generative Laws

  1. Bayes Theorem

    Bayes' theorem relates the posterior probability $P(C_i|x)$ to the prior probability $P(C_i)$.

    $C_i$ : the target belongs to Class $i$

    $x$ : the feature vector of the target

$$P(C_1|x) = \frac{P(x|C_1)P(C_1)}{P(x|C_1)P(C_1) + P(x|C_2)P(C_2)}$$

  2. Total Probability Theorem

$$P(x) = P(x|C_1)P(C_1) + P(x|C_2)P(C_2)$$
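A small numeric sketch of the two-class Bayes rule above, using made-up numbers: priors $P(C_1)=0.6$, $P(C_2)=0.4$, and class-conditional likelihoods $P(x|C_1)=0.2$, $P(x|C_2)=0.05$ for some observed $x$:

```python
def posterior_c1(lik1, prior1, lik2, prior2):
    """P(C1|x) by Bayes' theorem; the denominator is P(x) by total probability."""
    evidence = lik1 * prior1 + lik2 * prior2
    return lik1 * prior1 / evidence

p = posterior_c1(0.2, 0.6, 0.05, 0.4)
print(round(p, 4))  # -> 0.8571
```

Classifying $x$ as $C_1$ whenever $P(C_1|x) > 0.5$ is exactly the decision rule the generative model uses.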

Distribution

Assume the points are sampled from a specific distribution, which should be chosen according to the real-world nature of the training data.

For instance, if the feature is binary, we may choose a Bernoulli distribution; if the feature is continuous, we can choose a Gaussian distribution instead.

Here we take Gaussian distribution as an example.

Gaussian Distribution

$$f_{\mu,\Sigma}(x) = \frac{1}{(2\pi)^{D/2} (\det\Sigma)^{1/2}} \exp\left\{ -\frac{1}{2} (x-\mu)^T \Sigma^{-1} (x-\mu) \right\}$$
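The $D$-dimensional Gaussian density above translates directly into NumPy (a minimal sketch; the function name is my own):

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Density of a D-dimensional Gaussian with mean mu and covariance sigma."""
    d = len(mu)
    diff = x - mu
    norm = 1.0 / ((2 * np.pi) ** (d / 2) * np.linalg.det(sigma) ** 0.5)
    return norm * np.exp(-0.5 * diff @ np.linalg.inv(sigma) @ diff)

# Sanity check against the 1-D standard normal: f(0) = 1/sqrt(2*pi)
x = np.array([0.0])
mu = np.array([0.0])
sigma = np.eye(1)
print(gaussian_pdf(x, mu, sigma))  # ~ 0.3989
```

In the generative model, a separate $(\mu, \Sigma)$ pair is fitted for each class, and these densities serve as the likelihoods $P(x|C_i)$ in Bayes' theorem.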
