Perceptron Learning Algorithm
1. Condition
The sample set is linearly separable.
2. Principle
Find a hyperplane (a straight line in two dimensions) that separates the two classes of samples:
$$h(x_{i1},x_{i2},\cdots,x_{id}) = \text{sign}\!\left(\sum_{j=1}^d w_j x_{ij} - \theta\right),\qquad i=1,2,\cdots,n$$
Here $w_j$ can be viewed as the weight of a biological neuron's synapse, $x_{ij}$ as the stimulus it receives, and $\theta$ as the firing threshold: the neuron fires when $\sum_{j=1}^d w_j x_{ij} > \theta$ and is inhibited when $\sum_{j=1}^d w_j x_{ij} < \theta$. This analogy is why the method is called the perceptron learning algorithm.
Let $x_{i0}=1$ and $w_0=-\theta$, $i=1,2,\cdots,n$, i.e. $\vec x_i = \begin{bmatrix} 1 & x_{i1} & x_{i2} & \cdots & x_{id} \end{bmatrix}^T$ and $\vec w = \begin{bmatrix} -\theta & w_1 & w_2 & \cdots & w_d \end{bmatrix}^T$. Then $h(\vec x_i)=\text{sign}(\vec w^T \vec x_i)$.
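In code, this vectorized decision rule is just the sign of a dot product. Below is a minimal NumPy sketch; the function name `predict` and the convention of prepending a constant-1 column are illustrative assumptions, not from the original text:

```python
import numpy as np

def predict(w, X):
    """Perceptron hypothesis h(x) = sign(w^T x).

    w : (d+1,) weight vector with w[0] = -theta (the bias term).
    X : (n, d) sample matrix; a column of ones is prepended so the
        threshold is absorbed into the weight vector (x_i0 = 1).
    """
    X_aug = np.hstack([np.ones((X.shape[0], 1)), X])
    return np.sign(X_aug @ w)
```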
- Construct a loss function.
- $$L(h) = \sum_{i=1}^n \mathbb{I}\left(h(\vec x_i) \neq y_i\right)$$
  This is the number of samples misclassified under the current hypothesis; however, the function is discontinuous, so it is hard to minimize with analytic methods.
- $$L(\vec w) = -\sum_{\vec x_i \in \mathcal{M}} y_i\, \vec w^T \vec x_i$$
  where $\mathcal{M}$ is the set of samples misclassified by the current $\vec w$.
  When a sample is misclassified, $y_i$ and $\vec w^T \vec x_i$ have opposite signs, so every term of the sum is positive; hence $L(\vec w) \ge 0$, and it equals $0$ exactly when no sample is misclassified. (Both losses, and the training loop they lead to, are sketched in code after this list.)
- Find the hypothesis $h$ at which the loss attains its minimum value of $0$, i.e. a $\vec w$ that classifies every sample correctly.
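To make the two losses concrete, here is a small sketch under the same assumptions as above (NumPy, labels in $\{-1,+1\}$, inputs already augmented with the constant-1 column; the function names are illustrative):

```python
import numpy as np

def zero_one_loss(w, X_aug, y):
    """L(h): count of misclassified samples -- discontinuous in w."""
    return int(np.sum(np.sign(X_aug @ w) != y))

def perceptron_loss(w, X_aug, y):
    """L(w): -sum of y_i * w^T x_i over misclassified samples only.

    Non-negative and piecewise linear in w; equals 0 exactly when
    every sample is classified correctly.
    """
    margins = y * (X_aug @ w)              # > 0 iff correctly classified
    return float(-np.sum(margins[margins < 0]))
```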
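Minimizing $L(\vec w)$ by stochastic gradient descent yields the classic perceptron update: for a misclassified sample $i$, the gradient of $-y_i\,\vec w^T\vec x_i$ with respect to $\vec w$ is $-y_i\vec x_i$, so the step (with unit learning rate) is $\vec w \leftarrow \vec w + y_i\,\vec x_i$. A minimal training-loop sketch; the name `pla_train` and the `max_epochs` safeguard are illustrative assumptions:

```python
import numpy as np

def pla_train(X, y, max_epochs=1000):
    """Perceptron learning algorithm.

    X : (n, d) samples, y : (n,) labels in {-1, +1}.
    Returns a weight vector w of length d+1 with w[0] = -theta.
    """
    X_aug = np.hstack([np.ones((X.shape[0], 1)), X])  # x_i0 = 1
    w = np.zeros(X_aug.shape[1])
    for _ in range(max_epochs):
        mis = np.flatnonzero(np.sign(X_aug @ w) != y)
        if mis.size == 0:            # L(h) = 0: every sample correct
            break
        i = mis[0]                   # any misclassified sample will do
        w += y[i] * X_aug[i]         # w <- w + y_i * x_i
    return w
```

For example, on the separable toy set `X = np.array([[2., 3.], [1., 1.], [-1., -2.], [-2., -1.]])` with `y = np.array([1, 1, -1, -1])`, `pla_train(X, y)` converges after a single update. Termination in finitely many steps is guaranteed only under the linear-separability condition from section 1.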