Note: the solutions to the earlier exercises are still incomplete and will be added later.
7.3
$$
\begin{aligned}
\min_{w,b,\xi} \quad & \frac{1}{2}\|w\|^2 + C\sum_{i=1}^N \xi_i^2 \\
\text{s.t.} \quad & y_i(w \cdot x_i + b) \ge 1 - \xi_i, \quad i=1,2,\ldots,N \\
& \xi_i \ge 0, \quad i=1,2,\ldots,N
\end{aligned}
$$
The corresponding Lagrangian, with multipliers $\alpha_i \ge 0$ and $\gamma_i \ge 0$, is

$$
L(w,b,\xi,\alpha,\gamma) = \frac{1}{2}\|w\|^2 + C\sum_{i=1}^N \xi_i^2 + \sum_{i=1}^N \alpha_i\bigl(1 - \xi_i - y_i(w \cdot x_i + b)\bigr) - \sum_{i=1}^N \gamma_i \xi_i
$$
Setting the partial derivatives of $L$ with respect to $w$, $b$, and $\xi_i$ to zero (the stationarity part of the KKT conditions) gives

$$
\begin{aligned}
\frac{\partial L}{\partial w} &= w - \sum_{i=1}^N \alpha_i y_i x_i = 0 \\
\frac{\partial L}{\partial b} &= -\sum_{i=1}^N \alpha_i y_i = 0 \\
\frac{\partial L}{\partial \xi_i} &= 2C\xi_i - \alpha_i - \gamma_i = 0
\end{aligned}
$$
Therefore

$$
\begin{aligned}
w &= \sum_{i=1}^N \alpha_i y_i x_i \\
\sum_{i=1}^N \alpha_i y_i &= 0 \\
2C\xi_i &= \alpha_i + \gamma_i
\end{aligned}
$$
Substituting these relations back into the Lagrangian, and using $\sum_{i=1}^N \alpha_i y_i = 0$ together with $C\xi_i^2 = \frac{1}{2}(\alpha_i+\gamma_i)\xi_i$, gives

$$
\begin{aligned}
\min_{w,b,\xi} L(w,b,\xi,\alpha,\gamma)
&= \frac{1}{2}\|w\|^2 + C\sum_{i=1}^N \xi_i^2 + \sum_{i=1}^N \alpha_i - \sum_{i=1}^N (\alpha_i+\gamma_i)\xi_i - \sum_{i=1}^N \alpha_i y_i\, w \cdot x_i - \sum_{i=1}^N \alpha_i y_i b \\
&= -\frac{1}{2}\sum_{i=1}^N \sum_{j=1}^N \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) + \sum_{i=1}^N \alpha_i - \frac{1}{2}\sum_{i=1}^N (\alpha_i+\gamma_i)\xi_i \\
&= -\frac{1}{2}\sum_{i=1}^N \sum_{j=1}^N \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) + \sum_{i=1}^N \alpha_i - \frac{1}{2}\sum_{i=1}^N (\alpha_i+\gamma_i)\frac{\alpha_i+\gamma_i}{2C} \\
&= -\frac{1}{2}\sum_{i=1}^N \sum_{j=1}^N \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) + \sum_{i=1}^N \alpha_i - \frac{1}{4C}\sum_{i=1}^N (\alpha_i+\gamma_i)^2
\end{aligned}
$$
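The substitution can be sanity-checked numerically. The sketch below is not part of the original solution: the data `X`, `y`, the multiplier values, `C = 2.0`, and the helper names `lagrangian` and `dual_objective` are all made up for illustration. It evaluates the Lagrangian at the stationary point $w = \sum_i \alpha_i y_i x_i$, $\xi_i = (\alpha_i+\gamma_i)/(2C)$ and compares the result with the closed-form expression on the last line of the derivation.

```python
import numpy as np

def lagrangian(w, b, xi, alpha, gamma, X, y, C):
    """L(w, b, xi, alpha, gamma) for the squared-slack soft-margin problem."""
    margins = y * (X @ w + b)
    return (0.5 * w @ w
            + C * np.sum(xi ** 2)
            + np.sum(alpha * (1.0 - xi - margins))
            - np.sum(gamma * xi))

def dual_objective(alpha, gamma, X, y, C):
    """Closed form obtained after eliminating w, b and xi."""
    Z = y[:, None] * X                 # rows are y_i * x_i
    G = Z @ Z.T                        # G[i, j] = y_i y_j (x_i . x_j)
    return (-0.5 * alpha @ G @ alpha
            + alpha.sum()
            - np.sum((alpha + gamma) ** 2) / (4.0 * C))

# Illustrative values only; alpha is chosen so that sum_i alpha_i y_i = 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
y = np.array([1.0, 1.0, -1.0, -1.0])
alpha = np.array([0.3, 0.7, 0.4, 0.6])
gamma = np.array([0.1, 0.0, 0.2, 0.5])
C = 2.0

w_star = (alpha * y) @ X               # w = sum_i alpha_i y_i x_i
xi_star = (alpha + gamma) / (2.0 * C)  # from 2 C xi_i = alpha_i + gamma_i

# b is arbitrary here because its coefficient sum_i alpha_i y_i vanishes.
value_at_stationary_point = lagrangian(w_star, 1.7, xi_star, alpha, gamma, X, y, C)
closed_form = dual_objective(alpha, gamma, X, y, C)
print(np.isclose(value_at_stationary_point, closed_form))
```

Any value of `b` gives the same result, which mirrors the way the term $\sum_i \alpha_i y_i b$ drops out of the derivation.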
The dual problem is therefore the maximization of this expression over the multipliers. Because the objective still contains $\gamma$, the maximization is over both $\alpha$ and $\gamma$:

$$
\begin{aligned}
\max_{\alpha,\gamma} \quad & W(\alpha,\gamma) = -\frac{1}{2}\sum_{i=1}^N \sum_{j=1}^N \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) + \sum_{i=1}^N \alpha_i - \frac{1}{4C}\sum_{i=1}^N (\alpha_i+\gamma_i)^2 \\
\text{s.t.} \quad & \sum_{i=1}^N \alpha_i y_i = 0 \\
& \alpha_i \ge 0, \quad \gamma_i \ge 0, \quad i=1,2,\ldots,N
\end{aligned}
$$
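One further simplification, which goes beyond the original solution, is worth noting. Since $\alpha_i \ge 0$ and $\gamma_i \ge 0$, the term $-\frac{1}{4C}\sum_{i=1}^N(\alpha_i+\gamma_i)^2$ is non-increasing in each $\gamma_i$, so the maximum is attained at $\gamma_i = 0$. Using $y_i^2 = 1$ and writing $\delta_{ij}$ for the Kronecker delta (notation introduced here), the dual can then be stated in terms of $\alpha$ alone:

$$
\begin{aligned}
\max_{\alpha} \quad & -\frac{1}{2}\sum_{i=1}^N \sum_{j=1}^N \alpha_i \alpha_j y_i y_j \left( x_i \cdot x_j + \frac{\delta_{ij}}{2C} \right) + \sum_{i=1}^N \alpha_i \\
\text{s.t.} \quad & \sum_{i=1}^N \alpha_i y_i = 0 \\
& \alpha_i \ge 0, \quad i=1,2,\ldots,N
\end{aligned}
$$

In this form the problem looks like a hard-margin dual whose Gram matrix has $\frac{1}{2C}$ added to its diagonal, and there is no upper bound on $\alpha_i$.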
7.4
We proceed by induction on $p$.
Base case $p=1$: $K(x,z) = x \cdot z$, so we can take $\phi_1(x) = x$.
Inductive hypothesis: for $p=k$, suppose $K(x,z) = (x \cdot z)^k = \phi_k(x) \cdot \phi_k(z)$ for some feature map $\phi_k$.
Inductive step $p=k+1$: $K(x,z) = (x \cdot z)^{k+1} = (x \cdot z)^k (x \cdot z) = \bigl(\phi_k(x) \cdot \phi_k(z)\bigr)(x \cdot z)$.
Write $\phi_k(x) = (f_1(x), f_2(x), \ldots, f_m(x))^T$ and $x = (x_1, x_2, \ldots, x_n)^T$.
Then

$$
\begin{aligned}
K(x,z) &= \bigl(f_1(x)f_1(z) + f_2(x)f_2(z) + \ldots + f_m(x)f_m(z)\bigr)(x_1 z_1 + x_2 z_2 + \ldots + x_n z_n) \\
&= f_1(x)f_1(z)(x_1 z_1 + \ldots + x_n z_n) + f_2(x)f_2(z)(x_1 z_1 + \ldots + x_n z_n) + \ldots + f_m(x)f_m(z)(x_1 z_1 + \ldots + x_n z_n) \\
&= (f_1(x)x_1)(f_1(z)z_1) + (f_1(x)x_2)(f_1(z)z_2) + \ldots + (f_1(x)x_n)(f_1(z)z_n) \\
&\quad + (f_2(x)x_1)(f_2(z)z_1) + \ldots + (f_2(x)x_n)(f_2(z)z_n) \\
&\quad + \ldots + (f_m(x)x_1)(f_m(z)z_1) + \ldots + (f_m(x)x_n)(f_m(z)z_n) \\
&:= \phi_{k+1}(x) \cdot \phi_{k+1}(z)
\end{aligned}
$$

where

$$
\phi_{k+1}(x) = \bigl(f_1(x)x_1, f_1(x)x_2, \ldots, f_1(x)x_n, f_2(x)x_1, \ldots, f_2(x)x_n, \ldots, f_m(x)x_1, \ldots, f_m(x)x_n\bigr)^T
$$
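As a small worked instance of this construction (an illustration added here, assuming $n=2$ and $k=1$), take $\phi_1(x) = (x_1, x_2)^T$. The recursion then gives

$$
\phi_2(x) = (x_1 x_1,\; x_1 x_2,\; x_2 x_1,\; x_2 x_2)^T
$$

and

$$
\phi_2(x) \cdot \phi_2(z) = x_1^2 z_1^2 + 2 x_1 x_2 z_1 z_2 + x_2^2 z_2^2 = (x_1 z_1 + x_2 z_2)^2 = (x \cdot z)^2
$$

so the feature map for $p=2$ is four-dimensional, with the cross term appearing twice.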
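By induction, for every positive integer $p$ there is a finite-dimensional map $\phi_p$ with $K(x,z) = \phi_p(x) \cdot \phi_p(z)$. Hence $K$ is symmetric, and for any points $u_1, \ldots, u_r$ and real coefficients $c_1, \ldots, c_r$ we have $\sum_{s,t} c_s c_t K(u_s, u_t) = \bigl\| \sum_s c_s \phi_p(u_s) \bigr\|^2 \ge 0$, so every Gram matrix is positive semidefinite. Therefore $K(x,z) = (x \cdot z)^p$ is a positive definite kernel.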
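The recursive construction of the feature map can also be checked numerically. In the sketch below, which is an illustration and not part of the original proof, the step from $\phi_k$ to $\phi_{k+1}$ is implemented with `np.kron`, since the Kronecker product of two vectors lists exactly the products $f_j(x)\,x_l$ in the order used above. The helper name `feature_map`, the random test points, and the choice $p=3$ are assumptions made for the example.

```python
import numpy as np

def feature_map(x, p):
    """Return phi_p(x), built recursively as phi_{k+1}(x) = kron(phi_k(x), x)."""
    x = np.asarray(x, dtype=float)
    phi = x.copy()                     # phi_1(x) = x
    for _ in range(p - 1):
        phi = np.kron(phi, x)          # entries f_j(x) * x_l, j-major order
    return phi

rng = np.random.default_rng(0)
x, z = rng.normal(size=3), rng.normal(size=3)
p = 3

lhs = feature_map(x, p) @ feature_map(z, p)   # phi_p(x) . phi_p(z)
rhs = (x @ z) ** p                            # (x . z)^p
print(np.isclose(lhs, rhs))

# Gram matrix of the kernel on a few random points; its eigenvalues
# should be non-negative up to floating-point round-off.
points = rng.normal(size=(6, 3))
gram = (points @ points.T) ** p
min_eig = np.linalg.eigvalsh(gram).min()
print(min_eig >= -1e-8)
```

The explicit feature vector has $n^p$ entries, so this check is only practical for small $n$ and $p$; evaluating $(x \cdot z)^p$ directly avoids building it.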