Lecture 6: Theory of Generalization
Restriction of Break Point
$$
\begin{aligned}
m_{\mathcal{H}}(N) &\leq \text{maximum possible } m_{\mathcal{H}}(N) \text{ given } k \\
&\leq \mathrm{poly}(N)
\end{aligned}
$$
Fun Time
When the minimum break point is $k = 1$, what is the maximum possible $m_{\mathcal{H}}(N)$ when $N = 3$?

1. 1 ✓
2. 2
3. 3
4. 4
Explanation
Since $k = 1$, not even a single point can be shattered, so $m_{\mathcal{H}}(N) = 1$.
Bounding Function: Basic Cases
Bounding Function
bounding function $B(N, k)$: the maximum possible $m_{\mathcal{H}}(N)$ when the break point is $k$

$$B(N, k) \leq \mathrm{poly}(N)$$
In other words, $B(N, k)$ is an upper bound on $m_{\mathcal{H}}(N)$.
Table of Bounding Function
Fun Time
For 2D perceptrons, which of the following claims is true?
1 minimum break point $k = 2$
2 $m_{\mathcal{H}}(4) = 15$
3 $m_{\mathcal{H}}(N) < B(N, k)$ when $N = k =$ minimum break point ✓
4 $m_{\mathcal{H}}(N) > B(N, k)$ when $N = k =$ minimum break point
Explanation
minimum break point $k = 4$ (three points in general position can be shattered, four cannot)
$m_{\mathcal{H}}(4) = 14 < B(4, 4) = 15$
$B(N, k)$ is an upper bound on $m_{\mathcal{H}}(N)$
If you don't remember the 2D perceptron, review the Effective Number of Hypotheses section of Lecture 5: Training versus Testing.
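As an aside, the value $m_{\mathcal{H}}(4) = 14$ can be cross-checked against Cover's function counting theorem (not part of this lecture): for $N$ points in general position in $\mathbb{R}^d$, a perceptron with a bias term realizes exactly $2\sum_{i=0}^{d}\binom{N-1}{i}$ dichotomies. A minimal sketch:

```python
from math import comb

def perceptron_dichotomies(n, d=2):
    """Cover's function counting theorem: number of dichotomies of n
    points in general position in R^d realizable by a perceptron
    (affine separator, i.e. with a bias term)."""
    return 2 * sum(comb(n - 1, i) for i in range(d + 1))

print(perceptron_dichotomies(3))  # 8  -> 3 points are shattered
print(perceptron_dichotomies(4))  # 14 -> matches m_H(4) = 14 above
```

For $N = 3$ the count is $2^3 = 8$, i.e. three points in general position are shattered, which is exactly why the minimum break point is 4.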
Bounding Function: Inductive Cases
$$B(4, 3) = 11 = 2\alpha + \beta$$
$$
\begin{aligned}
B(N, k) &= 2\alpha + \beta \\
\alpha + \beta &\leq B(N-1, k) \\
\alpha &\leq B(N-1, k-1) \\
\Rightarrow B(N, k) &\leq B(N-1, k) + B(N-1, k-1)
\end{aligned}
$$

$$B(N, k) \leq \sum_{i=0}^{k-1} \binom{N}{i}$$
The $\leq$ actually holds with equality, i.e.

$$B(N, k) = B(N-1, k) + B(N-1, k-1)$$

$$B(N, k) = \sum_{i=0}^{k-1} \binom{N}{i} = C_N^0 + C_N^1 + \cdots + C_N^{k-1}$$
2D perceptrons have break point 4, so

$$m_{\mathcal{H}}(N) \leq B(N, 4) = \frac{1}{6} N^{3} + \frac{5}{6} N + 1 = O(N^3)$$
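The recursion, the closed form, and the $k = 4$ polynomial above can all be cross-checked numerically. A minimal sketch (the base cases $B(N, 1) = 1$ and $B(1, k) = 2$ for $k \geq 2$ follow directly from the definition of the bounding function):

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def B(N, k):
    """Bounding function via the recursion
    B(N, k) = B(N-1, k) + B(N-1, k-1),
    with base cases B(N, 1) = 1 and B(1, k) = 2 for k >= 2."""
    if k == 1:
        return 1
    if N == 1:
        return 2
    return B(N - 1, k) + B(N - 1, k - 1)

def B_closed(N, k):
    """Closed form: B(N, k) = sum_{i=0}^{k-1} C(N, i)."""
    return sum(comb(N, i) for i in range(k))

# The recursion agrees with the closed form everywhere tested ...
assert all(B(n, k) == B_closed(n, k)
           for n in range(1, 15) for k in range(1, 8))
# ... and for k = 4 it equals (N^3 + 5N + 6) / 6, i.e. N^3/6 + 5N/6 + 1.
assert all(B(n, 4) == (n**3 + 5 * n + 6) // 6 for n in range(1, 15))
print(B(4, 3))  # 11, matching B(4,3) = 11 above
```

Memoization via `lru_cache` keeps the recursion linear in the number of distinct $(N, k)$ pairs rather than exponential.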
Fun Time
For 1D perceptrons (positive and negative rays), we know that $m_{\mathcal{H}}(N) = 2N$. Let $k$ be the minimum break point. Which of the following is not true?
1 $k = 3$
2 for some integers $N > 0$, $m_{\mathcal{H}}(N) = \sum_{i=0}^{k-1} \binom{N}{i}$
3 for all integers $N > 0$, $m_{\mathcal{H}}(N) = \sum_{i=0}^{k-1} \binom{N}{i}$ ✓
4 for all integers $N > 2$, $m_{\mathcal{H}}(N) < \sum_{i=0}^{k-1} \binom{N}{i}$
Explanation
minimum break point $k = 3$

$$B(N, k) = \sum_{i=0}^{k-1} \binom{N}{i}$$

$B(N, k)$ is an upper bound on $m_{\mathcal{H}}(N)$: when $N \geq k$, $m_{\mathcal{H}}(N) < B(N, k)$; when $N < k$, $m_{\mathcal{H}}(N) = B(N, k)$.
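The growth function $m_{\mathcal{H}}(N) = 2N$ and its relation to the bound $\sum_{i=0}^{k-1}\binom{N}{i}$ with $k = 3$ can be verified by enumerating the dichotomies directly. A minimal sketch, assuming $N$ distinct points on the line:

```python
from math import comb

def ray_dichotomies(N):
    """Count the distinct dichotomies on N ordered points on a line
    produced by positive rays (+1 to the right of a threshold) and
    negative rays (their sign flips), with the threshold placed in
    each of the N + 1 gaps."""
    dichos = set()
    for cut in range(N + 1):  # threshold sits before point index `cut`
        pos_ray = tuple(+1 if i >= cut else -1 for i in range(N))
        dichos.add(pos_ray)                     # positive ray
        dichos.add(tuple(-s for s in pos_ray))  # negative ray
    return len(dichos)

for N in range(1, 7):
    bound = sum(comb(N, i) for i in range(3))  # sum_{i=0}^{2} C(N, i)
    print(N, ray_dichotomies(N), bound)
```

For $N = 1, 2$ the count equals the bound ($2 = 2$, $4 = 4$), and for every $N > 2$ it is strictly smaller ($6 < 7$, $8 < 11$, ...), matching options 2 and 4 and refuting option 3 above.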
Extension: recall the Effective Number of Hypotheses Fun Time in Lecture 5: Training versus Testing, which asks for the effective number of dichotomies of 5 points under the 2D perceptron ($k = 4$, $N = 5$, $m_{\mathcal{H}}(N) = \,? \leq \frac{1}{6} N^{3} + \frac{5}{6} N + 1$; since $N > k$, equality is not attained). The correct answer there is $22 < \frac{125}{6} + \frac{25}{6} + 1 = 26$, so the bound checks out; revisiting that question is quite interesting.
A Pictorial Proof
Replace $E_{\mathrm{out}}$ (infinite) with $E_{\mathrm{in}}'$ (finite, measured on a second "ghost" sample); I have not fully worked out where this inequality and its coefficient $\frac{1}{2}$ come from.

The upper bound is then expressed in terms of $m_{\mathcal{H}}(2N)$.

Using Hoeffding's inequality without replacement gives a similar result, only with $\nu = E_{\mathrm{in}}$ and $\mu = \frac{E_{\mathrm{in}} + E_{\mathrm{in}}'}{2}$.
Vapnik-Chervonenkis (VC) bound
$$
\mathbb{P}\left[\exists h \in \mathcal{H} \text{ s.t. } \left|E_{\mathrm{in}}(h) - E_{\mathrm{out}}(h)\right| > \epsilon\right] \leq 4\, m_{\mathcal{H}}(2N) \exp\left(-\frac{1}{8} \epsilon^{2} N\right)
$$
$m_{\mathcal{H}}(N)$ can replace $M$ with a few changes
Fun Time
For positive rays, $m_{\mathcal{H}}(N) = N + 1$. Plug it into the VC bound for $\epsilon = 0.1$ and $N = 10000$. What is the VC bound on BAD events?
$$
\mathbb{P}\left[\exists h \in \mathcal{H} \text{ s.t. } \left|E_{\mathrm{in}}(h) - E_{\mathrm{out}}(h)\right| > \epsilon\right] \leq 4\, m_{\mathcal{H}}(2N) \exp\left(-\frac{1}{8} \epsilon^{2} N\right)
$$
1 $2.77 \times 10^{-87}$
2 $5.54 \times 10^{-83}$
3 $2.98 \times 10^{-1}$ ✓
4 $2.29 \times 10^{-2}$
Explanation
Just substitute into the formula: $4\, m_{\mathcal{H}}(2N) \exp(-\frac{1}{8}\epsilon^2 N) = 4 \times 20001 \times e^{-12.5} \approx 0.2981471603789822$.
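The same number can be reproduced in a few lines. A minimal sketch, assuming the positive-ray growth function $m_{\mathcal{H}}(N) = N + 1$ from the question:

```python
from math import exp

def vc_bound(eps, N, growth):
    """VC bound on the probability of a BAD event:
    4 * m_H(2N) * exp(-eps^2 * N / 8)."""
    return 4 * growth(2 * N) * exp(-(eps ** 2) * N / 8)

# positive rays: m_H(N) = N + 1, so m_H(2N) = 2N + 1
p = vc_bound(0.1, 10000, lambda n: n + 1)
print(p)  # ~0.2981
```

Note that even with $N = 10000$ samples the bound is only about $0.3$; the VC bound is loose, which the next lecture's discussion of the VC dimension takes as a starting point for interpretation rather than exact calculation.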
Summary
This lecture mainly covered the bounding function $B(N, k)$ and the meaning and derivation of the VC bound.
Lecture summary
If $m_{\mathcal{H}}(N)$ has a break point and $N$ is large enough, then $E_{\mathrm{out}} \approx E_{\mathrm{in}}$.
Restriction of Break Point
break point ‘breaks’ consequent points
Bounding Function: Basic Cases
$B(N, k)$ bounds $m_{\mathcal{H}}(N)$ with break point $k$
Bounding Function: Inductive Cases
$B(N, k)$ is $\mathrm{poly}(N)$
A Pictorial Proof
$m_{\mathcal{H}}(N)$ can replace $M$ with a few changes
References
《Machine Learning Foundations》(机器学习基石)—— Hsuan-Tien Lin (林轩田)