李航统计学习方法之感知机学习（含感知机原始形式和对偶形式Python代码实现）

最新推荐文章于 2024-01-23 11:06:37 发布

代码随想随记

最新推荐文章于 2024-01-23 11:06:37 发布

阅读量2.2k

点赞数

分类专栏： python 文章标签：机器学习感知机

本文链接：https://blog.csdn.net/woai8339/article/details/82911650

版权

python 专栏收录该内容

75 篇文章

订阅专栏

感知机

感知机基本介绍

感知机是一个线性二分模型，输出取值为{-1, 1}。是判别模型。
　　感知机是为了求解一个超平面，该超平面能够将特征空间里的实例分解为正例和负例。设超平面方程为y=w*x+b，因此，引入基于误分类点的损失函数。如果损失函数为误分类点个数，则该损失函数不是w和b的连续可导函数，不利于优化。我们考虑另一个选择：误分类点距离该超平面的距离。
　　假设空间任意一点 $x_0,y_0)$ 到平面 $y = w * x + b$ 的距离可以写成： $\frac {1} {||w||} * |w*x_0+b|$ ，其中， $∣ ∣ w ∣ ∣$ 表示 $w$ 的二范数。那么误分类点到该平面距离为 $y_i*(w*x_i+b) > 0$ ,这是因为对于误分类点，对于 $y_i=1$ 有 $w*x_i+b<0$ , $y_i=-1$ 有 $w*x_i+b>0$ 。
　　因此，对于误分类点 $x_i, y_i)$ ，该点到平面的距离为 $\frac {1} {||w||}*[-y_i*(w*x_i)]$ 。假设误分类点总共为 $m$ 个，则误所有误分类点到分解面的距离为 $\sum_{i=1}^{m} \frac {1} {||w||}*[-y_i*(w*x_i)]$ ，如果不考虑 $\frac{1}{||w||}$ ，那么损失函数即为： $\sum_{i=1}^{m} -y_i*(w*x_i)$

感知机原始形式

感知机有原始形式和对偶形式，我们先看原始形式。
　　原始形式如下：
　　感知机原始形式

感知机算法如下：
感知机算法原始算法
可以在步骤(2)之前定义一个boolean变量，判断当前 $w$ 和 $b$ 的情况下是否存在误分类点，误分类点判断逻辑为:遍历所有点，判断 $y_i*(w*x_i+b)<=0$ 是否成立。
例题2.1的python 版本代码实现：

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.utils import shuffle


# In[266]:

#　examole 2.1
# python版本
x = np.array([[3,3], [4,3], [1,1]])
y = [1, 1, -1]


# In[306]:

"""
np.dot([2,2],[1,1])
4
np.dot([2,2],[[1],[1]])
array([4])
"""
w = [0 ,0]
b = 0
yita = 1


# In[307]:

#是否还存在误分类点
def isHasMisclassification(x, y, w, b):
    misclassification = False
    ct = 0
    misclassification_index = 0 
    for i in range(0, len(y)):
        if y[i]*(np.dot(w, x[i]) + b) <= 0:
            ct += 1
            misclassification_index = i
    if ct>0:
        misclassification = True
    return misclassification, misclassification_index

# In[308]:

# 更新系数w, b
def update(x, y, w, b, i):
    w = w + y[i]*x[i]
    b = b + y[i]
    return w, b


# In[309]:

#更新迭代
import random
def optimization(x, y, w, b):
    misclassification, misclassification_index = isHasMisclassification(x, y, w, b)
    while misclassification:
        print ("误分类的点：", misclassification_index)
        w, b = update(x, y, w, b, misclassification_index)
        print ("采用误分类点 {} 更新后的权重为:w是 {} , b是 {} ".format(misclassification_index, w, b))
        misclassification, misclassification_index = isHasMisclassification(x, y, w, b)
    return w, b


# In[310]:

optimization(x, y, w, b)


# In[311]:

w, b = optimization(x, y, w, b)


# In[312]:

w,b

感知机对偶形式

在这里插入图片描述
感知机对偶形式的算法流程为：

对偶形式代码为：

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.utils import shuffle


# In[266]:

#　examole 2.2 (对偶形式)
# python版本
x = np.array([[3,3], [4,3], [1,1]])
x_transpose = x.T
g = np.dot(x, x_transpose)
y = [1, 1, -1]


# In[306]:

"""
np.dot([2,2],[1,1])
4
np.dot([2,2],[[1],[1]])
array([4])
"""
alfa = np.array([0, 0, 0])
b = 0
yita = 1


# In[307]:

#是否还存在误分类点
def isHasMisclassification(y, g, b):
    misclassification = False
    ct = 0
    misclassification_index = 0
    for i in range(0, len(y)):
        sum1 = 0
        for j in range(0, len(y)):
            sum1 += (alfa[j]*y[j]*g[j][i] + b)
        if y[i]*sum1 <= 0:
            ct += 1
            misclassification_index = i
    if ct > 0:
        misclassification = True
    return misclassification, misclassification_index

# In[308]:

# 更新系数alfa, b
def update(y, alfa, yita, b, i):
    alfa[i] = alfa[i] + yita
    b = b + yita*y[i]
    return alfa, b


# In[309]:

#更新迭代
import random
def optimization(y, alfa, b, yita):
    misclassification, misclassification_index = isHasMisclassification(y, g, b)
    while misclassification:
        print ("误分类的第{}点{}：".format(misclassification_index, x[misclassification_index]))
        alfa, b = update(y, alfa, yita, b, misclassification_index)
        print ("采用第{}误分类点 {} 更新后的权重为:alfa是 {} , b是 {} ".format(misclassification_index, x[misclassification_index], alfa, b))
        misclassification, misclassification_index = isHasMisclassification(y, g, b)
    return alfa, b


# In[310]:

optimization(y, alfa, b, yita)


# In[311]:

alfa, b = optimization(y, alfa, b, yita)


# In[312]:
#w=sum(alfa_i*y_i*x_i)
alfa_y = np.multiply(list(alfa),y)
w = np.dot(alfa_y,x)
b = np.dot(alfa, y)
print("w是{},b是{}".format(w, b))

此次运行结果是：

误分类的第2点[1 1]：
采用第2误分类点 [1 1] 更新后的权重为:alfa是 [0 0 1] , b是 -1 
误分类的第1点[4 3]：
采用第1误分类点 [4 3] 更新后的权重为:alfa是 [0 1 1] , b是 0 
误分类的第2点[1 1]：
采用第2误分类点 [1 1] 更新后的权重为:alfa是 [0 1 2] , b是 -1 
误分类的第2点[1 1]：
采用第2误分类点 [1 1] 更新后的权重为:alfa是 [0 1 3] , b是 -2 
误分类的第1点[4 3]：
采用第1误分类点 [4 3] 更新后的权重为:alfa是 [0 2 3] , b是 -1 
误分类的第2点[1 1]：
采用第2误分类点 [1 1] 更新后的权重为:alfa是 [0 2 4] , b是 -2 
误分类的第2点[1 1]：
采用第2误分类点 [1 1] 更新后的权重为:alfa是 [0 2 5] , b是 -3 
误分类的第2点[1 1]：
采用第2误分类点 [1 1] 更新后的权重为:alfa是 [0 2 6] , b是 -1 
w是[2 0],b是-4

即：分割面为 $y=sign(2*x_1-4)$ ，分别带入第0/1/2三个点 $(3, 3), (4, 3), (1, 1)$ 验证y为 $(s i g n (2), s i g n (4), s i g n (- 2))$ ，即label为 $(1, 1, - 1)$