![](https://img-blog.csdnimg.cn/20190927151124774.png?x-oss-process=image/resize,m_fixed,h_224,w_224)
Machine Learning: Core Derivations
Average article quality score: 96
Analyses of the ideas behind mainstream machine learning algorithms, with mathematical derivations
Jay_Tang
Xiao Tang's ML & NLP Learning Path
Index of Past Articles
Logistic Regression, L1/L2 regularization, gradient/coordinate descent · MLE vs. MAP ~ L1/L2 math derivation · XGBoost math derivation · Introduction to Convex Optimization: basic concept…
Original · 2020-04-14 00:44:41 · 1640 views · 0 comments
Decoupling Representation and Classifier for Long-Tailed Recognition: methods for long-tailed classification in vision
Contents: Introduction · Recent Directions · Sampling Strategies · Methods of Learning Classifiers · Classifier Re-training (cRT) · Nearest Class Mean classifier (NCM) · τ-normalized classifier · Experiments · Datasets · Evaluation Protocol · Results · Sampling matter…
Original · 2021-01-08 11:48:24 · 1089 views · 0 comments
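The τ-normalized classifier named in this entry's contents rescales each class's classifier weight vector by a power of its norm. Below is a minimal sketch of that idea; the function name, weight values, and τ setting are illustrative, not taken from the paper's code.

```python
# Sketch of tau-normalization: each class weight vector w_j is rescaled to
# w_j / ||w_j||^tau. With tau = 1 every vector ends up unit-norm; with
# tau = 0 the weights are unchanged.
import math

def tau_normalize(weights, tau=1.0):
    """Rescale each class weight vector w_j to w_j / ||w_j||^tau."""
    out = []
    for w in weights:
        norm = math.sqrt(sum(x * x for x in w))
        out.append([x / (norm ** tau) for x in w])
    return out

# A head class with a large-norm weight vector vs. a tail class.
head, tail = [3.0, 4.0], [0.6, 0.8]
normed = tau_normalize([head, tail], tau=1.0)
```

With τ tuned between 0 and 1, head-class weight norms shrink more than tail-class ones, which is the re-balancing effect the paper studies.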
Knowledge Distillation Explained
Contents: Index of past articles · Shortcoming of normal neural networks · Generalization of Information · Knowledge Distillation · A few Definitions · General idea of knowledge distillation · Teacher and Student · Temperature & Entropy · Training the Distil Model. Currently, esp…
Original · 2020-08-05 06:59:54 · 2374 views · 0 comments
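The "Temperature & Entropy" item in this entry's contents can be illustrated in a few lines: dividing the logits by a temperature T > 1 softens the softmax output, raising its entropy. The logit values below are made up for the demo.

```python
# Softmax with temperature: higher T spreads probability mass across
# classes, so the output distribution has higher entropy.
import math

def softmax_with_temperature(logits, T=1.0):
    exps = [math.exp(z / T) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def entropy(p):
    return -sum(q * math.log(q) for q in p if q > 0)

logits = [5.0, 2.0, 0.5]
hard = softmax_with_temperature(logits, T=1.0)
soft = softmax_with_temperature(logits, T=4.0)
```

The softer distribution is what the student is trained to match: it carries more information about the non-target classes than the near-one-hot T = 1 output.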
Intro to Deep Learning & Backpropagation: model overview and a detailed derivation of backprop
Contents: Deep Neural Network · Index of past articles · Forward Propagation · Loss functions of neural network · Back-propagation · compute $\frac{\partial \ell}{\partial f(x)}$ · compute $\frac{\partial \ell}{\partial a^{(L+1)}(x)}$ · compute $\frac{\partial \ell}{\partial h^{(k)}(x)}$…
Original · 2020-05-26 04:29:34 · 584 views · 0 comments
Log-Linear Model & CRF: Conditional Random Fields Explained
Contents: Index of past articles · Log-Linear model · Conditional Random Fields (CRF) · Formal definition of CRF · From log-linear model to linear-chain CRF · Inference problem for CRF · Learning problem for CRF · Learning problem for general Log-Linear model · Compute $Z(\bar{x}, w)$…
Original · 2020-05-19 13:15:11 · 802 views · 0 comments
Hidden Markov Model (HMM): Detailed Derivation and Analysis
Before reading this post, you should be familiar with the EM Algorithm and have a decent amount of knowledge of convex optimization. If not, check out my previous posts: EM Algorithm, convex optimiz…
Original · 2020-05-03 03:32:13 · 1617 views · 1 comment
Probabilistic Graphical Model (PGM): A Framework Overview
Definition: a probabilistic graphical model is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. In general, a PGM obeys the following rules: Sum Rul…
Original · 2020-05-11 02:50:37 · 2750 views · 2 comments
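The "Sum Rule" this entry's preview breaks off at, p(x) = Σ_y p(x, y), can be checked on a toy joint distribution; the probability table below is made up for illustration.

```python
# Marginalization (the Sum Rule) on a joint distribution over two
# binary variables: p(x) = sum over y of p(x, y).
joint = {  # p(x, y); the four entries sum to 1
    (0, 0): 0.10, (0, 1): 0.30,
    (1, 0): 0.25, (1, 1): 0.35,
}
p_x = {x: sum(p for (xv, _), p in joint.items() if xv == x) for x in (0, 1)}
```

The Product Rule pairs with it: p(x, y) = p(y | x) p(x), and together they are all that exact inference in a PGM ever uses.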
GMM & K-means: Gaussian Mixture Models and K-means Clustering Explained
Contents: Gaussian mixture model (GMM) · Interpretation from geometry · Interpretation from mixture model · GMM derivation: set-up, solve by MLE, solve by EM Algorithm · K-means. A Gaussian mixture model is a probabilistic mode…
Original · 2020-05-16 08:18:16 · 1187 views · 0 comments
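The definition this entry's preview truncates, that a GMM density is a weighted sum of Gaussian densities p(x) = Σ_k π_k N(x | μ_k, σ_k²) with weights summing to 1, can be sketched directly. The component weights and parameters below are arbitrary.

```python
# A two-component 1-D Gaussian mixture density; a Riemann sum over a wide
# interval confirms it is itself a valid density (integrates to ~1).
import math

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

pis = [0.3, 0.7]                      # mixture weights, sum to 1
mus, sigmas = [-2.0, 1.0], [1.0, 0.5]

def gmm_pdf(x):
    return sum(p * normal_pdf(x, m, s) for p, m, s in zip(pis, mus, sigmas))

total = sum(gmm_pdf(-10 + 0.01 * i) * 0.01 for i in range(2000))
```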
EM (Expectation–Maximization) Algorithm: Analysis and Derivation
Jensen's inequality. Theorem: let $f$ be a convex function and let $X$ be a random variable. Then $E[f(X)] \geq f(E[X])$. Moreover, if $f$ is strictly con…
Original · 2020-04-24 05:38:18 · 958 views · 3 comments
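The Jensen's inequality statement this preview opens with, E[f(X)] ≥ f(E[X]) for convex f, is easy to check numerically; the discrete distribution below is an arbitrary example.

```python
# Numeric check of Jensen's inequality for the convex function f(x) = x^2.
xs = [-1.0, 0.0, 2.0, 3.0]    # support of X (illustrative values)
ps = [0.1, 0.2, 0.4, 0.3]     # P(X = x), sums to 1
f = lambda x: x * x           # a convex function

E_X = sum(p * x for p, x in zip(ps, xs))          # E[X]   = 1.6
E_fX = sum(p * f(x) for p, x in zip(ps, xs))      # E[f(X)] = 4.4
# E_fX >= f(E_X) holds, as the theorem promises.
```

For f(x) = x², the gap E[f(X)] - f(E[X]) is exactly Var(X), which is why the inequality is strict unless X is constant; this gap is what the EM lower bound exploits.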
SVM / Dual SVM Math Derivation, Non-linear SVM, Kernel Functions
Linear SVM. Idea: we want to find a hyperplane $w^\top x + b = 0$ that maximizes the margin. Set-up: we first show that the vector $w$ is orthogonal to this hyperplane. Let $x_1$, $x_2$…
Original · 2020-03-28 00:51:02 · 985 views · 0 comments
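The first step this preview sketches, that w is orthogonal to the hyperplane w⊤x + b = 0, can be verified numerically: for any two points x1, x2 on the plane, w⊤(x1 − x2) = 0. The plane and points below are arbitrary.

```python
# Check that w is orthogonal to the hyperplane w . x + b = 0.
w, b = [2.0, 1.0], -4.0
# Two points satisfying w . x + b = 0, i.e. 2x + y = 4.
x1, x2 = [1.0, 2.0], [0.0, 4.0]
dot = lambda u, v: sum(a * c for a, c in zip(u, v))
assert abs(dot(w, x1) + b) < 1e-9 and abs(dot(w, x2) + b) < 1e-9
# x1 - x2 is a direction lying inside the plane; w is orthogonal to it.
ortho = dot(w, [a - c for a, c in zip(x1, x2)])
```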
Convex Optimization: Primal Problem to Dual Problem, Clearly Explained
Consider an optimization problem in standard form (we call this a primal problem). We denote its optimal value by $p^\star$; we don't assume the problem is convex. The Lagrange dual fun…
Original · 2020-03-27 23:57:28 · 1291 views · 0 comments
Introduction to Convex Optimization: Basic Concepts
Optimization problem: all optimization problems can be written as… Optimization categories: convex vs. non-convex (deep neural networks are non-convex); continuous vs. discrete (most are continuous vari…
Original · 2020-03-27 23:15:27 · 510 views · 0 comments
XGBoost math Derivation 通俗易懂的详细推导
Bagging v.s. Boosting:Bagging:Leverages unstable base learners that are weak because of overfitting.Boosting:Leverages stable base learners that are weak because of underfitting.XGBoostLearning ...原创 2020-03-27 14:19:53 · 628 阅读 · 0 评论 -
MLE, MAP 对比及 MAP 转换到 L1, L2 norm 的 Math Derivation 详细
MLE v.s. MAPMLE: learn parameters from data.MAP: add a prior (experience) into the model; more reliable if data is limited. As we have more and more data, the prior becomes less useful.As data inc...原创 2020-03-27 13:14:04 · 451 阅读 · 0 评论 -
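The claim in this preview, that the prior matters less as data grows, is concrete in the Beta-Bernoulli (coin flip) case: the MAP estimate (heads + a − 1) / (n + a + b − 2) converges to the MLE heads / n. The Beta(5, 5) prior below is an arbitrary choice for the demo.

```python
# MLE vs. MAP for a Bernoulli parameter with a Beta(a, b) prior:
# the prior's pull toward 0.5 fades as n grows.
def mle(heads, n):
    return heads / n

def map_est(heads, n, a=5.0, b=5.0):   # Beta(5, 5) prior centered at 0.5
    return (heads + a - 1) / (n + a + b - 2)

small = (mle(7, 10), map_est(7, 10))          # MAP pulled toward 0.5
large = (mle(700, 1000), map_est(700, 1000))  # MAP nearly equals MLE
# |MAP - MLE| shrinks from about 0.09 at n = 10 to about 0.002 at n = 1000.
```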
Logistic Regression, L1/L2 Regularization, Gradient/Coordinate Descent
Generative model vs. discriminative model. Examples of generative models: Naive Bayes, HMM, VAE, GAN; discriminative models: Logistic Regression, CRF. Objective function: a generative model maximizes p(x, y…
Original · 2020-03-27 12:39:18 · 908 views · 0 comments
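A compact sketch combining this post's topics: logistic regression fit by batch gradient descent with L2 regularization, on a tiny 1-D dataset. All data values and hyperparameters (learning rate, λ, step count) are illustrative choices, not from the post.

```python
# Logistic regression via gradient descent with an L2 penalty lam * ||w||^2 / 2.
import math

X = [(0.0,), (1.0,), (2.0,), (3.0,)]
y = [0, 0, 1, 1]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

w, b, lr, lam = 0.0, 0.0, 0.5, 0.01
for _ in range(2000):
    gw = gb = 0.0
    for (x,), t in zip(X, y):
        err = sigmoid(w * x + b) - t   # d(log-loss)/d(logit)
        gw += err * x
        gb += err
    w -= lr * (gw / len(X) + lam * w)  # L2 penalty adds lam * w to the gradient
    b -= lr * (gb / len(X))

preds = [1 if sigmoid(w * x + b) > 0.5 else 0 for (x,) in X]
```

Swapping the `lam * w` term for `lam * (1 if w > 0 else -1)` would give the (sub)gradient of an L1 penalty instead, the contrast the post derives.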