Theory of Generalization
Restriction of Break Point
- Restricting the break point forces the growth function to grow more slowly
- For $N \ge k$, a break point $k$ strongly limits the growth of $m_{\mathcal H}(N)$
Bounding Function: Basic Cases
- Bounding function $B(N,k)$: the maximum possible value of $m_{\mathcal H}(N)$ when the break point is $k$
- A purely combinatorial quantity: the maximum number of length-$N$ binary vectors such that no $k$ of the $N$ dimensions are shattered
- $B(N,k)$ is independent of the details of the hypothesis set $\mathcal H$
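As a sanity check on this combinatorial definition, small cases of $B(N,k)$ can be computed by brute force. A sketch (the names `shatters` and `brute_force_B` are illustrative, not from the lecture):

```python
from itertools import combinations, product

def shatters(vectors, dims):
    """True if projecting `vectors` onto `dims` realizes all 2^|dims| patterns."""
    patterns = {tuple(v[d] for d in dims) for v in vectors}
    return len(patterns) == 2 ** len(dims)

def brute_force_B(N, k):
    """Largest set of length-N binary vectors in which no k dimensions are shattered."""
    all_vectors = list(product((0, 1), repeat=N))
    # Search subset sizes from largest to smallest; the first admissible size is B(N, k).
    for size in range(2 ** N, 0, -1):
        for subset in combinations(all_vectors, size):
            if not any(shatters(subset, dims)
                       for dims in combinations(range(N), k)):
                return size
    return 0

# brute_force_B(3, 2) -> 4, matching C(3,0) + C(3,1)
```

Only tiny $N$ are tractable this way (the search is exponential), but it confirms the definition behaves as expected.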
Bounding Function: Inductive Cases
- $B(N,k) = 2\alpha + \beta$ (grouping the dichotomies by their first $N-1$ points: $\alpha$ counts those that appear in pairs differing only on the last point, $\beta$ those that appear once)
- $\alpha + \beta \le B(N-1, k)$
- $\alpha \le B(N-1, k-1)$
- $B(N,k) \le B(N-1,k) + B(N-1,k-1)$
- Conclusion: $B(N,k) \le \sum_{i=0}^{k-1} \binom{N}{i}$; in fact equality holds
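The recursion above, together with the base cases $B(N,1)=1$ and $B(1,k)=2$ for $k \ge 2$, can be checked against the binomial sum numerically. A minimal sketch (the helper name `B_upper` is hypothetical):

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def B_upper(N, k):
    """Upper bound on B(N, k) via B(N,k) <= B(N-1,k) + B(N-1,k-1)."""
    if k == 1:
        return 1   # no single point may be shattered: only one dichotomy survives
    if N == 1:
        return 2   # one point, break point k >= 2: both labels are allowed
    return B_upper(N - 1, k) + B_upper(N - 1, k - 1)

# The recursion closes into the binomial sum (with equality), by Pascal's rule:
for N in range(1, 10):
    for k in range(1, N + 2):
        assert B_upper(N, k) == sum(comb(N, i) for i in range(k))
```

The equality case matches the note above that the bound $\sum_{i=0}^{k-1}\binom{N}{i}$ is actually attained.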
A Pictorial Proof
- Ideal result of the derivation:
  $$\mathbb P\left[\exists h \in \mathcal H \ \text{s.t. } |E_{in}(h)-E_{out}(h)| > \epsilon\right] \le 2\, m_{\mathcal H}(N) \cdot \exp(-2 \epsilon^2 N)$$
- Actual result for large $N$ (the Vapnik-Chervonenkis (VC) bound):
  $$\mathbb P\left[\exists h \in \mathcal H \ \text{s.t. } |E_{in}(h)-E_{out}(h)| > \epsilon\right] \le 2 \cdot 2\, m_{\mathcal H}(2N) \cdot \exp\left(-2 \cdot \tfrac{1}{16} \epsilon^2 N\right)$$
- By the VC bound, learning with the 2D perceptron is feasible
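A rough numerical illustration of why the VC bound becomes meaningful for large $N$, assuming the 2D perceptron's break point $k=4$, so $m_{\mathcal H}(N) \le \sum_{i=0}^{3}\binom{N}{i}$ is polynomial while the exponential factor decays (the function names here are illustrative):

```python
from math import comb, exp

def growth_bound(N, k=4):
    """Polynomial bound on m_H(N) given break point k (2D perceptron: k = 4)."""
    return sum(comb(N, i) for i in range(k))

def vc_bound(N, eps, k=4):
    """Evaluate 4 * m_H(2N) * exp(-eps^2 * N / 8), the VC bound above."""
    return 4 * growth_bound(2 * N, k) * exp(-(eps ** 2) * N / 8)

# Polynomial growth eventually loses to exponential decay as N grows;
# the bound is vacuous (> 1) for moderate N but shrinks toward 0 for large N.
for N in (1000, 10000, 100000):
    print(N, vc_bound(N, eps=0.1))
```

This is why a polynomial growth function (guaranteed by any finite break point) is enough: the bound tends to $0$ as $N \to \infty$, so $E_{in} \approx E_{out}$ with high probability.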