Miscellaneous Algorithm Notes

Naive Bayes

Naive Bayes is a classification technique based on Bayes' Theorem with an assumption of independence among the predictors.

$P(c|x) = \frac{P(x|c)P(c)}{P(x)}$, where $c$ is the target class and $x$ is the attribute vector.

With the independence assumption over attributes $x_1,\cdots,x_n$:

$P(c|x_1,\cdots,x_n) = \frac{P(x_1|c)\cdots P(x_n|c)P(c)}{P(x_1)P(x_2)\cdots P(x_n)}$

Since the denominator is constant for a given input, writing the class as $y$:

$P(y|x_1,\cdots,x_n) \propto P(y)\prod_{i=1}^{n}P(x_i|y)$

$\hat{y} = \arg\max_y P(y)\prod_{i=1}^{n}P(x_i|y)$
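As a concrete illustration of the argmax rule above, here is a minimal Gaussian Naive Bayes sketch in plain NumPy (the class and variable names are my own, not from any particular library; the smoothing constant 1e-9 is an arbitrary choice for numerical stability):

```python
import numpy as np

class GaussianNaiveBayes:
    """Minimal Gaussian NB: y_hat = argmax_y P(y) * prod_i P(x_i | y)."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        # Class prior P(y) and per-feature Gaussian parameters for P(x_i | y).
        self.priors_ = np.array([np.mean(y == c) for c in self.classes_])
        self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        self.vars_ = np.array([X[y == c].var(axis=0) + 1e-9 for c in self.classes_])
        return self

    def predict(self, X):
        # Work in log space: log P(y) + sum_i log P(x_i | y), which avoids underflow.
        log_prior = np.log(self.priors_)                        # (n_classes,)
        diff = X[:, None, :] - self.means_[None, :, :]          # (n, n_classes, d)
        log_lik = -0.5 * (np.log(2 * np.pi * self.vars_) + diff**2 / self.vars_).sum(axis=2)
        return self.classes_[np.argmax(log_prior + log_lik, axis=1)]

# Tiny usage example with two well-separated synthetic clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
print(GaussianNaiveBayes().fit(X, y).predict(X[:5]))
```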

AutoEncoder

Basic architecture: an autoencoder has two main parts, an encoder that maps the input into a code and a decoder that maps the code back to a reconstruction of the input.

$\phi : X \to F$

$\psi : F \to X$

$\phi,\psi = \arg\min_{\phi,\psi} \|X-(\psi\circ\phi)X\|^2$
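A minimal sketch of this objective, assuming PyTorch (the 784/128/32 layer sizes are illustrative choices, e.g. for flattened MNIST images, not prescribed by anything above):

```python
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    """Encoder phi: X -> F and decoder psi: F -> X, trained on ||X - psi(phi(X))||^2."""

    def __init__(self, in_dim=784, code_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, code_dim))
        self.decoder = nn.Sequential(nn.Linear(code_dim, 128), nn.ReLU(), nn.Linear(128, in_dim))

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = AutoEncoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()  # reconstruction objective ||X - (psi . phi)X||^2

x = torch.rand(64, 784)  # stand-in batch; replace with real data in practice
for _ in range(10):
    opt.zero_grad()
    loss = loss_fn(model(x), x)  # the target is the input itself
    loss.backward()
    opt.step()
```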

Knowledge Distillation

Deploy a smaller (student) model in production while a bigger (teacher) model is used in training. For a multi-class task, the smaller model is trained to match the softmax output of the bigger model.
There are many tricks for knowledge distillation, such as tuning the softmax temperature; see the sketch below.
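A minimal sketch of the distillation loss, assuming PyTorch; the temperature T and the mixing weight alpha are illustrative hyperparameters, and the logits here are random stand-ins for real teacher/student outputs:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Soft-target KL loss at temperature T, blended with the usual hard-label loss."""
    # Softened teacher distribution and student log-probabilities at temperature T.
    soft_teacher = F.softmax(teacher_logits / T, dim=1)
    log_student = F.log_softmax(student_logits / T, dim=1)
    # The T^2 factor keeps soft-target gradients on the same scale as the hard term.
    soft_loss = F.kl_div(log_student, soft_teacher, reduction="batchmean") * T * T
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1 - alpha) * hard_loss

# Usage: logits from teacher (computed without grad) and student on the same batch.
teacher_logits = torch.randn(8, 10)
student_logits = torch.randn(8, 10, requires_grad=True)
labels = torch.randint(0, 10, (8,))
print(distillation_loss(student_logits, teacher_logits, labels))
```

Raising T softens both distributions, exposing the teacher's relative preferences among wrong classes ("dark knowledge") rather than just its top prediction.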

Few Shot Learning
