Supervised Learning
- regression problems
- classification problems
Regression:
A set D of N points, with inputs x and outputs t:
D = {(x_n, t_n) : n = 1, ..., N}
Goal: predict the output t for a test, as of yet unobserved, input x.
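As an illustrative sketch, a training set of this form can be generated in Python (the sinusoidal target and the noise level are my own assumptions, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10                                                      # number of training points
x = rng.uniform(0.0, 1.0, size=N)                           # inputs x_n
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.1, size=N)   # noisy outputs t_n
D = list(zip(x, t))                                         # D = {(x_n, t_n) : n = 1, ..., N}
```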
Classification:
A set D of N points, with inputs x and outputs t:
D = {(x_n, t_n) : n = 1, ..., N}
Supervised Learning:
Memorizing vs. learning: memorizing means recording which t goes with which x; learning means using the (x, t) pairs to find a general relationship.
No Free Lunch Theorem:
If we make no prior assumptions about the feature space, all algorithms perform equally well on average.
Inductive Bias: in machine learning, many learning algorithms make assumptions about the problem being learned; these assumptions are called the inductive bias.
Frequentist Supervised Learning 频率监督学习:
Based on the training set D, learning aims at deriving a hard predictor t̂_D(x) or a soft predictor q_D(t|x), where the subscript is used to emphasize the dependence on D.
Inference vs Learning:
When the population distribution p(x,t) is known, we don’t need data D and we have a standard inference problem, as studied in the previous chapter.
When the distribution p(x,t) is not known or not tractable, we have a learning problem.
EXAMPLE:
Supervised Training of Deterministic Models: ERM (Empirical Risk Minimization)
Continuing the example above, we treat the output t as a sum of powers of x (a polynomial in x).
Assume a model class H of hard predictors t̂(·|θ), where θ is a parameter vector in a set Θ (these are just the coefficients!).
The hard predictors in H are written as t̂(x|θ) = θᵀu(x), where:
the feature vector is u(x) = [1, x, x², ..., x^M]ᵀ;
the model parameter vector is θ = [θ_0, θ_1, ..., θ_M]ᵀ.
The larger M is, the harder learning becomes and the easier it is to overfit.
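A minimal sketch of this polynomial parameterization (the function names are my own):

```python
import numpy as np

def feature_vector(x, M):
    """Feature vector u(x) = [1, x, x^2, ..., x^M]."""
    return np.array([x ** m for m in range(M + 1)])

def predict(x, theta):
    """Hard predictor t_hat(x | theta) = theta^T u(x)."""
    return theta @ feature_vector(x, len(theta) - 1)

# e.g. theta = [1, 0.5, 0.25] gives t_hat(2) = 1 + 0.5*2 + 0.25*4 = 3.0
```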
Loss Function
The loss function ℓ(t, t̂(·|θ)) is used to measure how good the hard predictor t̂(·|θ) is.
For regression, the quadratic loss ℓ(t, t̂) = (t − t̂)² is commonly used; for classification, the 0-1 (detection-error) loss ℓ(t, t̂) = 1(t ≠ t̂) is commonly used.
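As sketches, the quadratic loss (regression) and the 0-1 detection-error loss (classification) can be written as:

```python
def quadratic_loss(t, t_hat):
    """Quadratic loss for regression: (t - t_hat)^2."""
    return (t - t_hat) ** 2

def detection_error_loss(t, t_hat):
    """0-1 (detection-error) loss for classification: 1 if wrong, 0 if right."""
    return 0.0 if t == t_hat else 1.0
```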
Population Loss
The goal is to minimize the population loss L_p(θ) = E_{(x,t)∼p(x,t)}[ℓ(t, t̂(x|θ))], which depends on the model parameter vector θ.
In the context of learning, the population loss is also known as generalization or out-of-sample loss, since it can be interpreted as the average loss measured on an independently generated test pair (x,t) ∼ p(x,t). Unlike inference, however, we do not know p(x,t)!
Training Loss
L_D(θ) = (1/N) Σ_{n=1}^{N} ℓ(t_n, t̂(x_n|θ))
The training loss measures the empirical average of the loss accrued by the predictor t̂(·|θ) on the examples of the training set. As such, the training loss L_D(θ) is an estimate of the population loss L_p(θ) based on the training data set. Note that this estimate is just that, and hence we generally have L_D(θ) ≠ L_p(θ).
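Under the quadratic loss, the training loss of the polynomial predictor can be computed as follows (a sketch; `training_loss` is my own helper name):

```python
import numpy as np

def training_loss(theta, xs, ts):
    """L_D(theta): empirical average of the quadratic loss on the training set.

    The predictor is the polynomial t_hat(x|theta) = sum_m theta[m] * x**m.
    """
    preds = np.polyval(theta[::-1], xs)  # np.polyval expects highest degree first
    return np.mean((ts - preds) ** 2)
```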
Law of Large Numbers
By the law of large numbers, as N → ∞ the training loss L_D(θ) converges to the population loss L_p(θ).
![law of large numbers](https://i-blog.csdnimg.cn/blog_migrate/b079178ee61a6156bd016cf9dd489951.png)
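A quick numerical illustration (the distribution and predictor here are my own toy choices): for the constant predictor t̂ = 0 with t ∼ N(0, 1), the population quadratic loss is E[t²] = 1, and the empirical loss approaches it as N grows.

```python
import numpy as np

rng = np.random.default_rng(0)

def empirical_loss(n):
    """Training loss of the constant predictor t_hat = 0 on n samples t ~ N(0, 1)."""
    t = rng.normal(size=n)
    return np.mean(t ** 2)

for n in (10, 1_000, 100_000):
    print(n, empirical_loss(n))  # tends to the population loss E[t^2] = 1
```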
Empirical Risk Minimization (ERM)
Given the training loss L_D(θ), ERM selects the parameters that minimize it:
θ_ERM = argmin_θ L_D(θ)
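For the polynomial model under the quadratic loss, the ERM problem is an ordinary least-squares problem and has a closed-form solution; a sketch (function name is my own):

```python
import numpy as np

def erm_polynomial(xs, ts, M):
    """theta_ERM = argmin_theta (1/N) sum_n (t_n - theta^T u(x_n))^2."""
    U = np.vander(xs, M + 1, increasing=True)  # rows are u(x_n) = [1, x_n, ..., x_n^M]
    theta, *_ = np.linalg.lstsq(U, ts, rcond=None)
    return theta

# Recovers t = 1 + 2x from noiseless data with a degree-1 model:
xs = np.array([0.0, 1.0, 2.0])
theta = erm_polynomial(xs, 1.0 + 2.0 * xs, M=1)
```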
More familiar terms:
distribution
potentially
infinite
inference
[var] variance
[arg] argument
New terms:
Nouns:
hard/soft predictor: a soft predictor outputs probabilities; a hard predictor outputs the answer directly
quadratic loss: a kind of loss function (squared error)
joint distribution: the distribution of two variables taken together
true distribution
population distribution
empirical distribution
[EDF] Empirical Distribution Function
[CDF] Cumulative Distribution Function
i.i.d. (independent and identically distributed) random variables
prior/posterior (distribution)
empirical risk
ERM: empirical risk minimization
inductive bias (assumptions made about the learning problem)
population loss: the average loss under the true distribution p(x,t)
training loss: the empirical average loss on the training set D