统计学习方法 第七章习题答案

习题7.1

题目:
比较感知机的对偶形式与线性可分支持向量机的对偶形式.
解答:
感知机:
原始形式
min ⁡ w , b L ( w , b ) = ∑ i = 1 N [ − y i ( w ⋅ x i + b ) ] + \min _{w, b} L(w, b)=\sum_{i=1}^{N}\left[-y_{i}\left(w \cdot x_{i}+b\right)\right]_{+} minw,bL(w,b)=i=1N[yi(wxi+b)]+
对偶形式
min ⁡ w , b L ( w , b ) = min ⁡ α i L ( α i ) = ∑ i = 1 N ( − y i ( ∑ j = 1 N α j y j x j ⋅ x i + ∑ j = 1 N α j y j ) ) \min _{w, b} L(w, b)=\min _{\alpha_{i}} L\left(\alpha_{i}\right)=\sum_{i=1}^{N}\left(-y_{i}\left(\sum_{j=1}^{N} \alpha_{j} y_{j} x_{j} \cdot x_{i}+\sum_{j=1}^{N} \alpha_{j} y_{j}\right)\right) minw,bL(w,b)=minαiL(αi)=i=1N(yi(j=1Nαjyjxjxi+j=1Nαjyj))
由对偶形式可以求到 w , b w,b w,b
w = ∑ i = 1 N α i y i x i , b = ∑ i = 1 N α i y i w=\sum_{i=1}^{N} \alpha_{i} y_{i} x_{i,} \\b=\sum_{i=1}^{N} \alpha_{i} y_{i} w=i=1Nαiyixi,b=i=1Nαiyi
线性可分支持向量机:
原始形式:
min ⁡ w , b 1 2 ∥ w ∥ 2  s.t.  y i ( w ⋅ x i + b ) − 1 ⩾ 0 , i = 1 , 2 , ⋯   , N \begin{array}{ll}\min _{w, b} & \frac{1}{2}\|w\|^{2} \\ \text { s.t. } & y_{i}\left(w \cdot x_{i}+b\right)-1 \geqslant 0, \quad i=1,2, \cdots, N\end{array} minw,b s.t. 21w2yi(wxi+b)10,i=1,2,,N
对偶形式:
min ⁡ α 1 2 ∑ i = 1 N ∑ j = 1 N α i α j y i y j ( x i ⋅ x j ) − ∑ i = 1 N α i  s.t.  ∑ i = 1 N α i y i = 0 α i ⩾ 0 , i = 1 , 2 , ⋯   , N \begin{array}{cl}\min _{\alpha} & \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_{i} \alpha_{j} y_{i} y_{j}\left(x_{i} \cdot x_{j}\right)-\sum_{i=1}^{N} \alpha_{i} \\ \text { s.t. } & \sum_{i=1}^{N} \alpha_{i} y_{i}=0 \\ & \alpha_{i} \geqslant 0, \quad i=1,2, \cdots, N\end{array} minα s.t. 21i=1Nj=1Nαiαjyiyj(xixj)i=1Nαii=1Nαiyi=0αi0,i=1,2,,N
由对偶形式可以求到 w , b w,b w,b
w ∗ = ∑ i = 1 N α i ∗ y i x i w^{*}=\sum_{i=1}^{N} \alpha_{i}^{*} y_{i} x_{i} w=i=1Nαiyixi
b ∗ = y j − ∑ i = 1 N α i ∗ y i ( x i ⋅ x j ) b^{*}=y_{j}-\sum_{i=1}^{N} \alpha_{i}^{*} y_{i}\left(x_{i} \cdot x_{j}\right) b=yji=1Nαiyi(xixj)

习题7.2

题目:
已知正例点 x 1 = ( 1 , 2 ) T , x 2 = ( 2 , 3 ) T , x 3 = ( 3 , 3 ) T x_1=(1,2)^{T}, x_2=(2,3)^{T}, x_3=(3,3)^{T} x1=(1,2)T,x2=(2,3)T,x3=(3,3)T,负例点 x 4 = ( 2 , 1 ) T , x 5 = ( 3 , 2 ) T x_4=(2,1)^{T}, x_5=(3,2)^{T} x4=(2,1)T,x5=(3,2)T,试求最大间隔分离超平面和分类决策函数,并在图上画出分离超平面、间隔边界及支持向量.
解答:
原始形式
min ⁡ 1 2 ∥ w 1 2 + w 2 2 ∥ \min \frac{1}{2}\left\|w_{1}^{2}+w_{2}^{2}\right\| min21w12+w22
s.t. w 1 + 2 w 2 + b ≥ 1... ( 1 ) \quad w_{1}+2 w_{2}+b \geq 1...(1) w1+2w2+b1...(1)
2 w 1 + 3 w 2 + b ≥ 1... ( 2 ) 2 w_{1}+3 w_{2}+b \geq 1...(2) 2w1+3w2+b1...(2)
3 w 1 + 3 w 2 + b ≥ 1... ( 3 ) 3 w_{1}+3 w_{2}+b \geq 1...(3) 3w1+3w2+b1...(3)
− 2 w 1 − w 2 − b ≥ 1... ( 4 ) -2 w_{1}-w_{2}-b \geq 1...(4) 2w1w2b1...(4)
− 3 w 1 − 2 w 2 − b ≥ 1... ( 5 ) -3 w_{1}-2 w_{2}-b \geq 1...(5) 3w12w2b1...(5)
化简一下有
( 1 ) + ( 4 ) : − w 1 + w 2 ≥ 2 (1)+(4): -w_{1}+w_{2} \geq 2 (1)+(4):w1+w22
( 1 ) + ( 5 ) : − 2 w 1 ≥ 2 (1)+(5): -2w_{1} \geq 2 (1)+(5):2w12
( 2 ) + ( 4 ) : 2 w 2 ≥ 2 (2)+(4): 2w_{2}\geq 2 (2)+(4):2w22
( 3 ) + ( 4 ) : w 1 + 2 w 2 ≥ 2 (3)+(4): w_{1}+2w_{2} \geq 2 (3)+(4):w1+2w22
( 3 ) + ( 5 ) : w 2 ≥ 2 (3)+(5): w_{2} \geq 2 (3)+(5):w22
得到的5个方程,使用高中知识数学规划,画一下关于 w 1 , w 2 w_{1},w_{2} w1,w2坐标系,发现 w 1 2 + w 2 2 w_{1}^{2}+w_{2}^{2} w12+w22最小是在 w 1 = − 1 , w 2 = 2 w_{1}=-1, w_{2} = 2 w1=1,w2=2
将这个值带入原来的(1)-(5)方程,可以得到 b = − 2 b = -2 b=2
(图就不画了(✺ω✺))

习题7.3

题目:
线性支持向量机还可以定义为以下形式:
min ⁡ w , b , ξ 1 2 ∥ w ∥ 2 + C ∑ i = 1 N ξ i 2  s.t.  y i ( w ⋅ x i + b ) ⩾ 1 − ξ i , i = 1 , 2 , ⋯   , N ξ i ⩾ 0 , i = 1 , 2 , ⋯   , N \begin{array}{ll}\min _{w, b, \xi} & \frac{1}{2}\|w\|^{2}+C \sum_{i=1}^{N} \xi_{i}^{2} \\ \text { s.t. } & y_{i}\left(w \cdot x_{i}+b\right) \geqslant 1-\xi_{i}, \quad i=1,2, \cdots, N \\ & \xi_{i} \geqslant 0, \quad i=1,2, \cdots, N\end{array} minw,b,ξ s.t. 21w2+Ci=1Nξi2yi(wxi+b)1ξi,i=1,2,,Nξi0,i=1,2,,N
试求其对偶形式.
解答:
与课本110页的推导类似
这个形式的拉格朗日函数为
L ( w , b , ξ , α , μ ) ≡ 1 2 ∥ w ∥ 2 + C ∑ i = 1 N ξ i 2 − ∑ i = 1 N α i ( y i ( w ⋅ x i + b ) − 1 + ξ i ) − ∑ i = 1 N μ i ξ i L(w, b, \xi, \alpha, \mu) \equiv \frac{1}{2}\|w\|^{2}+C \sum_{i=1}^{N} \xi_{i}^2-\sum_{i=1}^{N} \alpha_{i}\left(y_{i}\left(w \cdot x_{i}+b\right)-1+\xi_{i}\right)-\sum_{i=1}^{N} \mu_{i} \xi_{i} L(w,b,ξ,α,μ)21w2+Ci=1Nξi2i=1Nαi(yi(wxi+b)1+ξi)i=1Nμiξi
其中
α i ⩾ 0 , μ i ⩾ 0 \alpha_{i} \geqslant 0, \mu_{i} \geqslant 0 αi0,μi0
首先求 L ( w , b , ξ , α , μ ) L(w, b, \xi, \alpha, \mu) L(w,b,ξ,α,μ) w , b , ξ w, b, \xi w,b,ξ的极小
∇ w L ( w , b , ξ , α , μ ) = w − ∑ i = 1 N α i y i x i = 0 \nabla_{w} L(w, b, \xi, \alpha, \mu)=w-\sum_{i=1}^{N} \alpha_{i} y_{i} x_{i}=0 wL(w,b,ξ,α,μ)=wi=1Nαiyixi=0
∇ b L ( w , b , ξ , α , μ ) = − ∑ i = 1 N α i y i = 0 \nabla_{b} L(w, b, \xi, \alpha, \mu)=-\sum_{i=1}^{N} \alpha_{i} y_{i}=0 bL(w,b,ξ,α,μ)=i=1Nαiyi=0
∇ ξ i L ( w , b , ξ , α , μ ) = 2 C ξ i − α i − μ i = 0 \nabla_{\xi_{i}} L(w, b, \xi, \alpha, \mu)=2C\xi_{i}-\alpha_{i}-\mu_{i}=0 ξiL(w,b,ξ,α,μ)=2Cξiαiμi=0
得到:
w = ∑ i = 1 N α i y i x i w=\sum_{i=1}^{N} \alpha_{i} y_{i} x_{i} w=i=1Nαiyixi
∑ i = 1 N α i y i = 0 \sum_{i=1}^{N} \alpha_{i} y_{i}=0 i=1Nαiyi=0
2 C ξ i − α i − μ i = 0 2C\xi_{i}-\alpha_{i}-\mu_{i}=0 2Cξiαiμi=0
带入原式
min ⁡ w , b , ξ L ( w , b , ξ , α , μ ) = − 1 2 ∑ i = 1 N ∑ j = 1 N α i α j y i y j ( x i ⋅ x j ) + ∑ i = 1 N α i − C ∑ i = 1 N ξ i 2 = − 1 2 ∑ i = 1 N ∑ j = 1 N α i α j y i y j ( x i ⋅ x j ) + ∑ i = 1 N α i − C ∑ i = 1 N ( α i + μ i 2 C ) 2 = − 1 2 ∑ i = 1 N ∑ j = 1 N α i α j y i y j ( x i ⋅ x j ) + ∑ i = 1 N α i − 1 4 C ∑ i = 1 N ( α i + μ i ) 2 \min _{w, b, \xi} L(w, b, \xi, \alpha, \mu)\\=-\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_{i} \alpha_{j} y_{i} y_{j}\left(x_{i} \cdot x_{j}\right)+\sum_{i=1}^{N} \alpha_{i}-C \sum_{i=1}^{N} \xi_{i}^2\\=-\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_{i} \alpha_{j} y_{i} y_{j}\left(x_{i} \cdot x_{j}\right)+\sum_{i=1}^{N} \alpha_{i}-C \sum_{i=1}^{N} (\frac{\alpha_i+\mu_i}{2C})^2\\=-\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_{i} \alpha_{j} y_{i} y_{j}\left(x_{i} \cdot x_{j}\right)+\sum_{i=1}^{N} \alpha_{i}-\frac{1}{4C}\sum_{i=1}^{N} (\alpha_i+\mu_i)^2 minw,b,ξL(w,b,ξ,α,μ)=21i=1Nj=1Nαiαjyiyj(xixj)+i=1NαiCi=1Nξi2=21i=1Nj=1Nαiαjyiyj(xixj)+i=1NαiCi=1N(2Cαi+μi)2=21i=1Nj=1Nαiαjyiyj(xixj)+i=1Nαi4C1i=1N(αi+μi)2
再对 min ⁡ w , b , ξ L ( w , b , ξ , α , μ ) \min _{w, b, \xi} L(w, b, \xi, \alpha, \mu) minw,b,ξL(w,b,ξ,α,μ) α \alpha α的极大,得对偶问题
m a x α − 1 2 ∑ i = 1 N ∑ j = 1 N α i α j y i y j ( x i ⋅ x j ) + ∑ i = 1 N α i − 1 4 C ∑ i = 1 N ( α i + μ i ) 2 max_\alpha \quad -\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_{i} \alpha_{j} y_{i} y_{j}\left(x_{i} \cdot x_{j}\right)+\sum_{i=1}^{N} \alpha_{i}-\frac{1}{4C}\sum_{i=1}^{N} (\alpha_i+\mu_i)^2 maxα21i=1Nj=1Nαiαjyiyj(xixj)+i=1Nαi4C1i=1N(αi+μi)2
s . t ∑ i = 1 N α i y i = 0 s.t\sum_{i=1}^{N} \alpha_{i} y_{i}=0 s.ti=1Nαiyi=0
α i , μ i > 0 i = 1 , 2... N \alpha_{i}, \mu_{i} > 0\qquad i=1,2...N αi,μi>0i=1,2...N

习题7.4

题目:
证明内积的正整数幂函数:
K ( x , z ) = ( x ⋅ z ) p K(x, z)=(x \cdot z)^{p} K(x,z)=(xz)p
是正定核函数,这里 p p p是正整数, x , z ϵ R n x,z\epsilon R_{n} x,zϵRn
解答:
要证明 K ( x , z ) K(x,z) K(x,z)为正定核,有两种想法,一种是证明根据公式(7.26),证明 K ( x , z ) K(x,z) K(x,z)满足 K ( x , z ) = ϕ ( x ) ⋅ ϕ ( z ) K(x,z)=\phi(x)\cdot \phi(z) K(x,z)=ϕ(x)ϕ(z)一种是根据定理7.5,证明对任意 x i ϵ X , i = 1 , 2... m , K ( x , z ) x_i\epsilon X, i=1,2...m, K(x,z) xiϵX,i=1,2...m,K(x,z)对应的 G r a m Gram Gram矩阵 K = [ K ( x i , x j ) ] m ∗ m K = [K(x_i, x_j)]_{m*m} K=[K(xi,xj)]mm为半正定矩阵。
我用数学归纳法证明 K ( x , z ) = ϕ ( x ) ⋅ ϕ ( z ) K(x,z)=\phi(x)\cdot \phi(z) K(x,z)=ϕ(x)ϕ(z)
(1) p = 1 p=1 p=1 K ( x , z ) = x ⋅ z K(x,z)=x\cdot z K(x,z)=xz,取 ϕ 1 ( x ) = x \phi_1(x)=x ϕ1(x)=x,满足条件

(2)假设 p = k p=k p=k时, K ( x , z ) K(x,z) K(x,z) 为正定核,即有 K ( x , z ) = ϕ k ( x ) ⋅ ϕ k ( z ) K(x,z)=\phi_k(x)\cdot \phi_k(z) K(x,z)=ϕk(x)ϕk(z)

(3)那么当 p = k + 1 p=k+1 p=k+1
K ( x , z ) = ( x ⋅ z ) k ( x ⋅ z ) = ϕ k ( x ) ⋅ ϕ k ( z ) ( x ⋅ z ) K(x,z) \\= (x\cdot z)^{k}(x\cdot z)\\=\phi_k(x)\cdot \phi_k(z)(x\cdot z) K(x,z)=(xz)k(xz)=ϕk(x)ϕk(z)(xz)
现在假设 ϕ k ( x ) = ( f 1 ( x ) , f 2 ( x ) . . . , f m ( x ) ) T , x = ( x 1 , x 2 . . . x n ) , z = ( z 1 , z 2 . . . z n ) T \phi_k(x)=(f_1(x),f_2(x)...,f_m(x))^T,x=(x_1,x_2...x_n),z=(z_1,z_2...z_n)^T ϕk(x)=(f1(x),f2(x)...,fm(x))T,x=(x1,x2...xn),z=(z1,z2...zn)T
则有
K ( x , z ) = ( f 1 ( x ) f 1 ( z ) + f 2 ( x ) f 2 ( z ) + . . . + f m ( x ) f m ( z ) ) ( x 1 z 1 + x 2 z 2 + . . . + x n z n ) = f 1 ( x ) f 1 ( z ) ( x 1 z 1 + x 2 z 2 + . . . + x n z n ) + f 2 ( x ) f 2 ( z ) ( x 1 z 1 + x 2 z 2 + . . . + x n z n ) + . . . + f m ( x ) f m ( z ) ( x 1 z 1 + x 2 z 2 + . . . + x n z n ) K(x,z)\\=(f_1(x)f_1(z)+f_2(x)f_2(z)+...+f_m(x)f_m(z))(x_1z_1+x_2z_2+...+x_nz_n)\\=f_1(x)f_1(z)(x_1z_1+x_2z_2+...+x_nz_n)+f_2(x)f_2(z)(x_1z_1+x_2z_2+...+x_nz_n)+...+f_m(x)f_m(z)(x_1z_1+x_2z_2+...+x_nz_n) K(x,z)=(f1(x)f1(z)+f2(x)f2(z)+...+fm(x)fm(z))(x1z1+x2z2+...+xnzn)=f1(x)f1(z)(x1z1+x2z2+...+xnzn)+f2(x)f2(z)(x1z1+x2z2+...+xnzn)+...+fm(x)fm(z)(x1z1+x2z2+...+xnzn)
那么我们就可以取
ϕ k + 1 ( x ) = ( f 1 ( x ) x 1 , f 1 ( x ) x 2 . . . f 1 ( x ) x n , f 2 ( x ) x 1 , f 2 ( x ) x 2 , . . . , f 2 ( x ) x n , . . . . . . f m ( x ) x 1 , f m ( x ) x 2 , . . . f m ( x ) x n ) T \phi_{k+1}(x)=(f_1(x)x_1, f_1(x)x_2...f_1(x)x_n,f_2(x)x_1,f_2(x)x_2,...,f_2(x)x_n,......f_m(x)x_1,f_m(x)x_2,...f_m(x)x_n)^T ϕk+1(x)=(f1(x)x1,f1(x)x2...f1(x)xn,f2(x)x1,f2(x)x2,...,f2(x)xn,......fm(x)x1,fm(x)x2,...fm(x)xn)T
K ( x , z ) = ( x ⋅ z ) k + 1 = ϕ k + 1 ( x ) ⋅ ϕ k + 1 ( z ) K(x,z) = (x\cdot z)^{k+1}=\phi_{k+1}(x)\cdot \phi_{k+1}(z) K(x,z)=(xz)k+1=ϕk+1(x)ϕk+1(z)
得证


贴一个先前用半正定矩阵证明的方法,不过这个解法有点问题(关于用证明矩阵半正定的证明方法,有读者有好的想法可以在评论区一起讨论)
根据定理7.5,正定核函数的充要条件是对任意 x i ϵ X , i = 1 , 2... m , K ( x , z ) x_i\epsilon X, i=1,2...m, K(x,z) xiϵX,i=1,2...m,K(x,z)对应的 G r a m Gram Gram矩阵 K = [ K ( x i , x j ) ] m ∗ m K = [K(x_i, x_j)]_{m*m} K=[K(xi,xj)]mm为半正定矩阵
对任意 c 1 , c 2 , . . . , c m ϵ R c_1,c_2,...,c_m\epsilon R c1,c2,...,cmϵR
∑ i , j = 1 m c i c j K ( x i , x j ) = ∑ i , j = 1 m c i c j ( x i ⋅ x j ) p = ∑ i = 1 m ( c i x i ⋅ c j x j ) ( x i ⋅ x j ) p − 1 \sum^m_{i,j=1}c_{i}c_jK(x_{i},x_j)\\=\sum^m_{i,j=1}c_{i}c_j(x_i\cdot x_j)^p\\=\sum^m_{i=1}(c_ix_i\cdot c_jx_j)(x_{i}\cdot x_{j})^{p-1} i,j=1mcicjK(xi,xj)=i,j=1mcicj(xixj)p=i=1m(cixicjxj)(xixj)p1
先来看前面一部分 ( c i x i ⋅ c j x j ) (c_ix_i\cdot c_jx_j) (cixicjxj)
∑ i = 1 m ( c i x i ⋅ c j x j ) = ( ∑ i = 1 m c i x i ) ⋅ ( ∑ j = 1 m c j x j ) = ∣ ∣ ∑ i = 1 m c i x i ∣ ∣ 2 ≥ 0 \sum^m_{i=1}(c_ix_i\cdot c_jx_j)\\=(\sum^m_{i=1}c_ix_i)\cdot (\sum^m_{j=1}c_jx_j)\\=||\sum^m_{i=1}c_ix_i||^2\geq 0 i=1m(cixicjxj)=(i=1mcixi)(j=1mcjxj)=i=1mcixi20
再看后面一部分 ( x i ⋅ x j ) p − 1 p ≥ 0 (x_{i}\cdot x_{j})^{p-1}\quad p\geq 0 (xixj)p1p0
对于
∑ i = 1 m ( x i ⋅ x j ) = ( ∑ i = 1 m x i ) ⋅ ( ∑ i = 1 m x i ) = ∣ ∣ ∑ i = 1 m x i ∣ ∣ 2 ≥ 0 \sum^m_{i=1}(x_{i}\cdot x_{j})\\=(\sum^m_{i=1}x_{i})\cdot (\sum^m_{i=1}x_{i})\\=||\sum^m_{i=1}x_{i}||^2\quad \geq 0 i=1m(xixj)=(i=1mxi)(i=1mxi)=i=1mxi20
而对于
∑ i = 1 m a i b i s . t ∑ i = 1 m a i , ∑ i = 1 m b i ≥ 0 \sum_{i=1}^ma_ib_i\\s.t\sum^m_{i=1}a_i,\sum^m_{i=1}b_i\quad \geq0 i=1maibis.ti=1mai,i=1mbi0
一定有 ∑ i = 1 m a i b i ≥ 0 \sum_{i=1}^ma_ib_i\geq0 i=1maibi0
所以有 ∑ i = 1 m ( c i x i ⋅ c j x j ) ( x i ⋅ x j ) p − 1 ≥ 0 \sum^m_{i=1}(c_ix_i\cdot c_jx_j)(x_{i}\cdot x_{j})^{p-1}\geq0 i=1m(cixicjxj)(xixj)p10

  • 14
    点赞
  • 119
    收藏
    觉得还不错? 一键收藏
  • 2
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 2
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值