AdaBoost: a worked example

Use the AdaBoost algorithm to learn a strong classifier from the training data below.

Training data set

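| Index | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| $x$ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| $y$ | 1 | 1 | 1 | -1 | -1 | -1 | 1 | 1 | 1 | -1 |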

Solution:
Initialize the weight distribution over the training data:
$$D_1=(w_{1,1},w_{1,2},\dots,w_{1,10}),\qquad w_{1,i}=0.1,\quad i=1,2,\dots,10$$
For $m=1$:
  (a) On the training data weighted by $D_1$, compute the classification error rate for each threshold $\nu\in\{0.5,1.5,2.5,3.5,4.5,5.5,6.5,7.5,8.5\}$. For each threshold, the value listed is the lower of the two error rates obtained by predicting $+1$ or $-1$ on the side $x<\nu$.

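| Index | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| $\nu$ | 0.5 | 1.5 | 2.5 | 3.5 | 4.5 | 5.5 | 6.5 | 7.5 | 8.5 |
| Error rate | 0.5 | 0.4 | 0.3 | 0.4 | 0.5 | 0.4 | 0.5 | 0.4 | 0.3 |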

The error rate is lowest at $\nu=8.5$ (it ties with $\nu=2.5$ at 0.3; $\nu=8.5$ is used here), so the base classifier is
$$G_1(x)=\begin{cases} 1,&x\lt 8.5 \\ -1,&x\ge 8.5 \end{cases}$$
  (b) The error rate of $G_1(x)$ on the training data set is $e_1=P(G_1(x_i)\neq y_i)=0.3$.
  (c) Compute the coefficient of $G_1(x)$ (natural logarithm): $\alpha_1=\dfrac{1}{2}\log\dfrac{1-e_1}{e_1}=0.4236$
  (d) Update the weight distribution of the training data:
$$D_2=(w_{2,1},w_{2,2},\dots,w_{2,10})$$
$$w_{2,i}=\dfrac{w_{1,i}}{Z_1}\exp(-\alpha_1y_iG_1(x_i)),\quad i=1,2,\dots,10$$
$$D_2=(0.07142857,0.07142857,0.07142857,0.16666667,0.16666667,0.16666667,0.07142857,0.07142857,0.07142857,0.07142857)$$
$$f_1(x)=\alpha_1G_1(x)=0.4236\,G_1(x)$$
  (e) The classifier $\operatorname{sign}[f_1(x)]$ misclassifies 3 points of the training data set.

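| Index | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| $G_1(x)$ | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 |
| $f_1(x)$ | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | -0.4236 |
| $\operatorname{sign}[f_1(x)]$ | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 |
| $y$ | 1 | 1 | 1 | -1 | -1 | -1 | 1 | 1 | 1 | -1 |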
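Round 1 can be checked numerically. The following is a minimal sketch, not the author's code: the helper names (`stump`, `weighted_error`, `best_error_per_threshold`, `reweight`) are made up for illustration, and taking the better of the two stump orientations per threshold is an assumption inferred from the error-rate table. The choice of threshold 8.5 is hard-coded to follow the text.

```python
import math

# Training data from the table above.
X = list(range(10))
Y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
THRESHOLDS = [v + 0.5 for v in range(9)]  # 0.5, 1.5, ..., 8.5


def stump(threshold, left_label):
    """Classifier that outputs left_label for x < threshold, else -left_label."""
    return lambda x: left_label if x < threshold else -left_label


def weighted_error(classifier, weights):
    """Sum of the weights of the samples the classifier gets wrong."""
    return sum(w for x, y, w in zip(X, Y, weights) if classifier(x) != y)


def best_error_per_threshold(weights):
    """For each threshold, keep the lower error of the two orientations (assumption)."""
    return [
        min(weighted_error(stump(v, +1), weights),
            weighted_error(stump(v, -1), weights))
        for v in THRESHOLDS
    ]


def reweight(weights, classifier, alpha):
    """Apply w_i * exp(-alpha * y_i * G(x_i)), then divide by the normalizer Z."""
    raw = [w * math.exp(-alpha * y * classifier(x))
           for x, y, w in zip(X, Y, weights)]
    z = sum(raw)
    return [r / z for r in raw]


D1 = [0.1] * 10
errors_round1 = best_error_per_threshold(D1)

G1 = stump(8.5, +1)  # the text's choice; threshold 2.5 has the same error
e1 = weighted_error(G1, D1)
alpha1 = 0.5 * math.log((1 - e1) / e1)
D2 = reweight(D1, G1, alpha1)

print([round(e, 3) for e in errors_round1])
print(round(alpha1, 4))
print([round(w, 4) for w in D2])
```

The three misclassified points (indices 4, 5, 6) end up with weight 1/6 each, while the seven correctly classified points drop to 1/14 each, so the next round is pushed toward fixing those three.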

For $m=2$:
  (a) On the training data weighted by $D_2$, compute the classification error rate for each threshold $\nu\in\{0.5,1.5,2.5,3.5,4.5,5.5,6.5,7.5,8.5\}$, where $e_m=\sum_{G_m(x_i)\neq y_i} w_{m,i}$:

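| Index | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| $\nu$ | 0.5 | 1.5 | 2.5 | 3.5 | 4.5 | 5.5 | 6.5 | 7.5 | 8.5 |
| Error rate | 0.357 | 0.286 | 0.214 | 0.381 | 0.452 | 0.286 | 0.357 | 0.429 | 0.5 |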

The error rate is lowest at $\nu=2.5$, so the base classifier is
$$G_2(x)=\begin{cases} 1,&x\lt 2.5 \\ -1,&x\ge 2.5 \end{cases}$$
  (b) The error rate of $G_2(x)$ on the training data set is $e_2=P(G_2(x_i)\neq y_i)=0.214$.
  (c) Compute the coefficient of $G_2(x)$: $\alpha_2=\dfrac{1}{2}\log\dfrac{1-e_2}{e_2}=0.6496$
  (d) Update the weight distribution of the training data:
$$D_3=(w_{3,1},w_{3,2},\dots,w_{3,10})$$
$$w_{3,i}=\dfrac{w_{2,i}}{Z_2}\exp(-\alpha_2y_iG_2(x_i)),\quad i=1,2,\dots,10$$
$$D_3=(0.04545452,0.04545452,0.04545452,0.10606056,0.10606056,0.10606056,0.16666675,0.16666675,0.16666675,0.04545452)$$
$$f_2(x)=0.4236\,G_1(x)+0.6496\,G_2(x)$$
  (e) The classifier $\operatorname{sign}[f_2(x)]$ misclassifies 3 points of the training data set.

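| Index | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| $G_1(x)$ | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 |
| $G_2(x)$ | 1 | 1 | 1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 |
| $\alpha_1G_1(x)$ | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | -0.4236 |
| $\alpha_2G_2(x)$ | 0.6496 | 0.6496 | 0.6496 | -0.6496 | -0.6496 | -0.6496 | -0.6496 | -0.6496 | -0.6496 | -0.6496 |
| $\operatorname{sign}[f_2(x)]$ | 1 | 1 | 1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 |
| $y$ | 1 | 1 | 1 | -1 | -1 | -1 | 1 | 1 | 1 | -1 |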
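The weight update in step (d) can also be done by hand, without evaluating exponentials. As a sketch, write $Z_m$ for the normalizer of round $m$ and use the fact that $D_m$ sums to 1, so the misclassified weights sum to $e_m$ and the rest to $1-e_m$. Substituting $e^{\alpha_m}=\sqrt{(1-e_m)/e_m}$ gives

$$Z_m=(1-e_m)\,e^{-\alpha_m}+e_m\,e^{\alpha_m}=2\sqrt{e_m(1-e_m)}$$

$$w_{m+1,i}=\begin{cases}\dfrac{w_{m,i}}{2(1-e_m)},&G_m(x_i)=y_i\\[2ex]\dfrac{w_{m,i}}{2e_m},&G_m(x_i)\neq y_i\end{cases}$$

For round 2, taking $e_2=3/14$ (the exact value behind 0.214):

$$Z_2=2\sqrt{\tfrac{3}{14}\cdot\tfrac{11}{14}}=\tfrac{\sqrt{33}}{7}\approx 0.8207$$

$$w_{3,i}=\begin{cases}\tfrac{1}{14}\cdot\tfrac{7}{11}=\tfrac{1}{22}\approx 0.04545,&i=1,2,3,10\\[1ex]\tfrac{1}{6}\cdot\tfrac{7}{11}=\tfrac{7}{66}\approx 0.10606,&i=4,5,6\\[1ex]\tfrac{1}{14}\cdot\tfrac{7}{3}=\tfrac{1}{6}\approx 0.16667,&i=7,8,9\end{cases}$$

These agree with the listed $D_3$ up to rounding in the last digits.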

For $m=3$:
  (a) On the training data weighted by $D_3$, compute the classification error rate for each threshold $\nu\in\{0.5,1.5,2.5,3.5,4.5,5.5,6.5,7.5,8.5\}$, where $e_m=\sum_{G_m(x_i)\neq y_i} w_{m,i}$:

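| Index | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| $\nu$ | 0.5 | 1.5 | 2.5 | 3.5 | 4.5 | 5.5 | 6.5 | 7.5 | 8.5 |
| Error rate | 0.409 | 0.455 | 0.5 | 0.394 | 0.288 | 0.182 | 0.348 | 0.485 | 0.318 |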

The error rate is lowest at $\nu=5.5$, so the base classifier is
$$G_3(x)=\begin{cases} -1,&x\lt 5.5 \\ 1,&x\ge 5.5 \end{cases}$$
  (b) The error rate of $G_3(x)$ on the training data set is $e_3=P(G_3(x_i)\neq y_i)=0.182$.
  (c) Compute the coefficient of $G_3(x)$: $\alpha_3=\dfrac{1}{2}\log\dfrac{1-e_3}{e_3}=0.7520$
  (d) Update the weight distribution of the training data:
$$D_4=(w_{4,1},w_{4,2},\dots,w_{4,10})$$
$$w_{4,i}=\dfrac{w_{3,i}}{Z_3}\exp(-\alpha_3y_iG_3(x_i)),\quad i=1,2,\dots,10$$
$$D_4=(0.125,0.125,0.125,0.06481478,0.06481478,0.06481478,0.10185189,0.10185189,0.10185189,0.125)$$
$$f_3(x)=0.4236\,G_1(x)+0.6496\,G_2(x)+0.7520\,G_3(x)$$
  (e) The classifier $\operatorname{sign}[f_3(x)]$ misclassifies 0 points of the training data set.

| Index | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| $G_1(x)$ | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 |
| $G_2(x)$ | 1 | 1 | 1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 |
| $G_3(x)$ | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 |
| $\alpha_1G_1(x)$ | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | -0.4236 |
| $\alpha_2G_2(x)$ | 0.6496 | 0.6496 | 0.6496 | -0.6496 | -0.6496 | -0.6496 | -0.6496 | -0.6496 | -0.6496 | -0.6496 |
| $\alpha_3G_3(x)$ | -0.7520 | -0.7520 | -0.7520 | -0.7520 | -0.7520 | -0.7520 | 0.7520 | 0.7520 | 0.7520 | 0.7520 |
| $\operatorname{sign}[f_3(x)]$ | 1 | 1 | 1 | -1 | -1 | -1 | 1 | 1 | 1 | -1 |
| $y$ | 1 | 1 | 1 | -1 | -1 | -1 | 1 | 1 | 1 | -1 |
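The table can be cross-checked by summing the three weighted votes at each point. This is a small sketch that plugs in the rounded coefficients and thresholds from the text; the names `STUMPS`, `f`, and `sign` are illustrative only.

```python
# (threshold, label for x < threshold, coefficient alpha) for G1, G2, G3.
STUMPS = [(8.5, +1, 0.4236), (2.5, +1, 0.6496), (5.5, -1, 0.7520)]

X = list(range(10))
Y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]


def f(x, stumps):
    """Weighted vote: sum of alpha * G(x) over the given stumps."""
    return sum(alpha * (left if x < v else -left) for v, left, alpha in stumps)


def sign(value):
    """Map a score to a class label (a zero score does not occur here)."""
    return 1 if value > 0 else -1


# Count training mistakes of sign[f_1], sign[f_2], sign[f_3].
for m in (1, 2, 3):
    preds = [sign(f(x, STUMPS[:m])) for x in X]
    mistakes = sum(p != y for p, y in zip(preds, Y))
    print(f"sign[f_{m}]: {mistakes} misclassified")

# Combined score f_3(x) at each training point.
scores = [round(f(x, STUMPS), 4) for x in X]
print(scores)
```

The scores come out as 0.3212 for the first three points, -0.978 for the next three, 0.526 for the next three, and -0.3212 for the last, so every sign matches $y$. The first and last points have the smallest margin.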

The final classifier is therefore
$$G(x)=\operatorname{sign}[f_3(x)]=\operatorname{sign}[0.4236\,G_1(x)+0.6496\,G_2(x)+0.7520\,G_3(x)]$$
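The whole procedure can be written as one loop that stops once the combined classifier makes no training mistakes. The sketch below is one possible implementation under stated assumptions, not the author's code: the function names are hypothetical, base classifiers are restricted to threshold stumps in both orientations, and ties in the weighted error are broken toward the larger threshold so that round 1 selects 8.5 as in the text.

```python
import math

X = list(range(10))
Y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
THRESHOLDS = [v + 0.5 for v in range(9)]  # 0.5, 1.5, ..., 8.5


def predict(stump, x):
    """A stump is (threshold, left_label): left_label if x < threshold, else -left_label."""
    threshold, left_label = stump
    return left_label if x < threshold else -left_label


def fit_stump(weights):
    """Return (error, stump) with the lowest weighted error.

    Assumption: ties go to the candidate scanned last (larger threshold),
    which reproduces the text's round-1 choice of 8.5 over 2.5.
    """
    best = None
    for threshold in THRESHOLDS:
        for left_label in (+1, -1):
            stump = (threshold, left_label)
            error = sum(w for x, y, w in zip(X, Y, weights)
                        if predict(stump, x) != y)
            if best is None or error <= best[0] + 1e-12:
                best = (error, stump)
    return best


def classify(ensemble, x):
    """Sign of the weighted vote over all (alpha, stump) pairs."""
    score = sum(alpha * predict(stump, x) for alpha, stump in ensemble)
    return 1 if score > 0 else -1


def adaboost(max_rounds=10):
    weights = [1 / len(X)] * len(X)
    ensemble = []  # list of (alpha, stump)
    for _ in range(max_rounds):
        # No single stump is perfect on this data, so error stays above 0.
        error, stump = fit_stump(weights)
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, stump))
        raw = [w * math.exp(-alpha * y * predict(stump, x))
               for x, y, w in zip(X, Y, weights)]
        z = sum(raw)
        weights = [r / z for r in raw]
        if all(classify(ensemble, x) == y for x, y in zip(X, Y)):
            break
    return ensemble


ensemble = adaboost()
for alpha, (threshold, left_label) in ensemble:
    print(round(alpha, 4), threshold, left_label)
```

On this data set the loop stops after three rounds. With the opposite tie-breaking rule, round 1 would pick threshold 2.5 instead, and the later rounds would differ from the ones shown above.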
