Naive Bayes
Naive Bayes is a classification technique based on Bayes' theorem, with the assumption that predictors are conditionally independent given the class.
P(c|x) = \frac{P(x|c)P(c)}{P(x)}

where c is the target class and x is the attribute vector.
P(y|x_1,\cdots,x_n) = \frac{P(x_1|y)\cdots P(x_n|y)P(y)}{P(x_1)P(x_2)\cdots P(x_n)}
P(y|x_1,\cdots,x_n) \propto P(y)\prod^n_{i=1}P(x_i|y)

\hat{y} = \argmax_y P(y)\prod^n_{i=1}P(x_i|y)
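As a minimal sketch of this decision rule (function names here are illustrative, not from any library), a categorical Naive Bayes with Laplace smoothing, computed in log space to avoid underflow:

```python
import numpy as np

def fit_nb(X, y, alpha=1.0):
    # X: (n_samples, n_features) integer-coded categorical features
    # y: (n_samples,) integer class labels; alpha: Laplace smoothing
    classes = np.unique(y)
    n_values = X.max(axis=0) + 1                 # categories per feature
    log_prior, log_lik = {}, {}
    for c in classes:
        Xc = X[y == c]
        log_prior[c] = np.log(len(Xc) / len(X))  # log P(y=c)
        log_lik[c] = [
            np.log((np.bincount(Xc[:, j], minlength=n_values[j]) + alpha)
                   / (len(Xc) + alpha * n_values[j]))   # log P(x_j=v | y=c)
            for j in range(X.shape[1])
        ]
    return classes, log_prior, log_lik

def predict_nb(x, classes, log_prior, log_lik):
    # y_hat = argmax_y  log P(y) + sum_j log P(x_j | y)
    scores = {c: log_prior[c] + sum(log_lik[c][j][x[j]] for j in range(len(x)))
              for c in classes}
    return max(scores, key=scores.get)
```

Working in log space turns the product over features into a sum, which is both faster and numerically stable.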
AutoEncoder
Basic architecture: an autoencoder has two main parts, an encoder \phi that maps the input into the code, and a decoder \psi that maps the code back to a reconstruction of the input.
\phi : X \to F

\psi : F \to X

\phi,\psi = \argmin_{\phi,\psi} ||X-(\psi\circ\phi)X||^2
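A minimal PyTorch sketch of this objective (layer sizes and dimensions are placeholder assumptions): the encoder plays the role of \phi, the decoder of \psi, and training minimizes the squared reconstruction error.

```python
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, input_dim=784, code_dim=32):
        super().__init__()
        # phi: X -> F (input to code)
        self.encoder = nn.Sequential(nn.Linear(input_dim, 128), nn.ReLU(),
                                     nn.Linear(128, code_dim))
        # psi: F -> X (code to reconstruction)
        self.decoder = nn.Sequential(nn.Linear(code_dim, 128), nn.ReLU(),
                                     nn.Linear(128, input_dim))

    def forward(self, x):
        return self.decoder(self.encoder(x))      # (psi ∘ phi)(x)

model = AutoEncoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(64, 784)                          # dummy batch
loss = nn.functional.mse_loss(model(x), x)        # ||x - (psi∘phi)x||^2
opt.zero_grad(); loss.backward(); opt.step()
```

Because the code dimension is smaller than the input dimension, the network is forced to learn a compressed representation rather than the identity map.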
Knowledge Distillation
Train a large model, then serve a smaller one in production. For a multi-class task, the smaller (student) model is trained to match the softmax output of the bigger (teacher) model.
There are many tricks for knowledge distillation, such as tuning the softmax temperature.
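A sketch of the standard soft-target loss with temperature (following Hinton et al.'s 2015 formulation; the T and alpha values are illustrative): the student matches the teacher's temperature-softened distribution, blended with the usual hard-label cross-entropy.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Soft term: KL divergence between the teacher's and student's
    # temperature-softened distributions. The T**2 factor keeps its
    # gradient magnitude comparable to the hard term.
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T ** 2)
    hard = F.cross_entropy(student_logits, labels)  # standard hard-label loss
    return alpha * soft + (1 - alpha) * hard
```

A higher temperature flattens the teacher's softmax, exposing the relative probabilities of the wrong classes ("dark knowledge") that a one-hot label discards.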