Chapter 2: The Perceptron

Author: 澳大利亚有群羊

Published 2022-09-13 19:41:36

Column: Statistical Learning Methods. Tags: machine learning, artificial intelligence, deep learning

Article link: https://blog.csdn.net/weixin_42030574/article/details/126827876

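The perceptron is a linear model for binary classification. Its description covers the input space, output space, model structure, parameter space, and hypothesis space. Perceptron learning aims to find a separating hyperplane that linearly divides the training data. To do so, it introduces a loss function based on misclassification and minimizes that loss with gradient descent to obtain the perceptron model.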

I. The Perceptron Model

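The perceptron maps an input x to an output in {+1, -1} through f(x) = sign(w · x + b), where w is the weight vector and b is the bias.

Geometrically, the linear equation w · x + b = 0 defines a hyperplane S in the feature space, with w as its normal vector and b as its intercept.
The hyperplane S divides the feature space into two parts: the positive class, whose instances have output +1, and the negative class, whose instances have output -1. For this reason S is called the separating hyperplane.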
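
As a minimal sketch of the model described above, the following function evaluates f(x) = sign(w · x + b) for a batch of points. The function name `predict`, the example weights, and the convention of assigning a score of exactly 0 to +1 are assumptions made for illustration; they do not come from the article's code.

```python
import numpy as np

def predict(X, w, b):
    """Return +1 or -1 for each row of X using f(x) = sign(w . x + b)."""
    scores = np.dot(X, w) + b
    # Assumption: points lying exactly on the hyperplane are assigned to +1
    return np.where(scores >= 0, 1, -1)

# Hypothetical hyperplane x1 + x2 - 3 = 0 and three sample points
w = np.array([1.0, 1.0])
b = -3.0
X_demo = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
labels = predict(X_demo, w, b)
print(labels)
```

The first two points lie on the positive side of the hyperplane and the third on the negative side, so they receive different labels.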

II. The Perceptron Learning Strategy

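The perceptron requires the training dataset to be linearly separable; otherwise the learning algorithm does not converge.

If the training dataset is assumed to be linearly separable, the goal is to find a separating hyperplane that completely divides the instance points into the positive and negative classes.
To obtain such a hyperplane, the model parameters w and b must be determined, which requires a learning strategy.
In other words, a suitable loss function must be defined for the perceptron.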
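
The article's formula image is not available, so the following is a sketch of the loss function as it is usually derived in standard treatments of the perceptron. The symbols M (the set of misclassified points) and the norm notation are assumed here rather than taken from the original.

```latex
% Distance from a point x_0 to the hyperplane S: w . x + b = 0
\[
  d(x_0, S) = \frac{1}{\lVert w \rVert}\,\lvert w \cdot x_0 + b \rvert
\]
% For a misclassified point (x_i, y_i) we have  -y_i (w . x_i + b) > 0,
% so the total distance of all misclassified points to S is
\[
  -\frac{1}{\lVert w \rVert} \sum_{x_i \in M} y_i \,(w \cdot x_i + b)
\]
% Dropping the factor 1/||w|| gives the perceptron loss
\[
  L(w, b) = -\sum_{x_i \in M} y_i \,(w \cdot x_i + b)
\]
```

The loss is non-negative and equals zero exactly when no point is misclassified, which is why minimizing it drives the hyperplane toward a separating one.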

III. The Perceptron Learning Algorithm

The perceptron learning algorithm minimizes the loss function with stochastic gradient descent. It has a primal form and a dual form, and it is simple and easy to implement.

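1. Primal Form

The perceptron learning algorithm is driven by misclassification and uses stochastic gradient descent. First, an initial hyperplane w0, b0 is chosen arbitrarily, and then gradient descent is used to minimize the objective function step by step. The minimization does not descend along the gradient of all misclassified points in M (the set of misclassified points) at once. Instead, each step randomly selects one misclassified point and descends along its gradient.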
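
The update rule implied by this description can be sketched as follows, assuming the usual perceptron loss over the misclassified set M and a learning rate written as \eta (this corresponds to `l_rate` in the code; the symbol itself is an assumption).

```latex
% Assumed loss over the set M of misclassified points
\[
  L(w, b) = -\sum_{x_i \in M} y_i \,(w \cdot x_i + b)
\]
% Gradients with respect to w and b
\[
  \nabla_w L(w, b) = -\sum_{x_i \in M} y_i x_i ,
  \qquad
  \nabla_b L(w, b) = -\sum_{x_i \in M} y_i
\]
% Stochastic step on a single misclassified point (x_i, y_i),
% i.e. one with  y_i (w . x_i + b) <= 0
\[
  w \leftarrow w + \eta\, y_i x_i ,
  \qquad
  b \leftarrow b + \eta\, y_i
\]
```

Each step moves the hyperplane toward the misclassified point, reducing its distance to the hyperplane until the point ends up on the correct side.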

The code is as follows:

import pandas as pd
import numpy as np
from sklearn.datasets import load_iris
import matplotlib.pyplot as plt
from timeit import default_timer as timer
# %matplotlib inline  # Jupyter-only magic; uncomment when running in a notebook

# Load iris into a DataFrame with short column names
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['label'] = iris.target
df.columns = ['sepal length', 'sepal width', 'petal length', 'petal width', 'label']

# Rows 0-49 are class 0 and rows 50-99 are class 1
plt.scatter(df[:50]['sepal length'], df[:50]['sepal width'], label='0')
plt.scatter(df[50:100]['sepal length'], df[50:100]['sepal width'], label='1')
plt.xlabel('sepal length')
plt.ylabel('sepal width')
plt.legend()

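[Figure: scatter plot of sepal length against sepal width for classes 0 and 1]
The plot shows how the two selected classes of the iris dataset are distributed over the features [sepal length, sepal width].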

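# Keep the first 100 rows (classes 0 and 1) and the columns sepal length, sepal width, label
data = np.array(df.iloc[:100, [0, 1, -1]])
X, y = data[:, :-1], data[:, -1]
y = np.array([1 if i == 1 else -1 for i in y])  # label 1 becomes +1 (positive), label 0 becomes -1 (negative)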

Using the primal form of the perceptron:

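# Primal form of the perceptron
# Assumes linearly separable, two-class data
class Model:
    def __init__(self):
        self.w = None  # weight vector, set to ones in fit() once the feature count is known
        self.b = 0
        self.l_rate = 0.1

    def sign(self, x, w, b):
        # Note: despite the name, this returns the raw score w.x + b, not its sign
        y = np.dot(x, w) + b
        return y

    # Stochastic gradient descent
    def fit(self, X_train, y_train):
        self.w = np.ones(X_train.shape[1], dtype=np.float32)
        converged = False
        # Loops until a full pass makes no mistakes; it never ends if the data is not linearly separable
        while not converged:
            wrong_count = 0
            for d in range(len(X_train)):
                X = X_train[d]
                y = y_train[d]
                # y * (w.x + b) <= 0 means the point is misclassified
                if y * self.sign(X, self.w, self.b) <= 0:
                    self.w = self.w + self.l_rate * np.dot(y, X)
                    self.b = self.b + self.l_rate * y
                    wrong_count += 1

            if wrong_count == 0:
                converged = True
        return 'Perceptron model trained successfully!'

    def score(self, X_test, y_test):
        # Fraction of points with y * (w.x + b) > 0, i.e. classification accuracy
        return np.mean(y_test * self.sign(X_test, self.w, self.b) > 0)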
tic = timer()
perceptron = Model()
perceptron.fit(X, y)

x_points = np.linspace(4, 7, 10)
print(x_points, perceptron.b)
# Separating line w[0]*x + w[1]*y + b = 0: slope -w[0]/w[1], intercept -b/w[1]
y_ = -(perceptron.w[0] * x_points + perceptron.b) / perceptron.w[1]
plt.plot(x_points, y_)

plt.scatter(df[:50]['sepal length'], df[:50]['sepal width'], label='0')
plt.scatter(df[50:100]['sepal length'], df[50:100]['sepal width'], label='1')
plt.xlabel('sepal length')
plt.ylabel('sepal width')
plt.legend()
toc = timer()
print('Elapsed time:', toc - tic)  # the timed region covers both training and plotting

[Figure: the learned separating line drawn over the scatter plot of the two classes]
Elapsed time: 1.1410311999999294
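
To make the primal updates easier to follow by hand, here is a self-contained sketch on a three-point toy dataset. The dataset, the function name `fit_perceptron`, the zero initialization, the learning rate of 1, and the `max_epochs` cap are all assumptions chosen for illustration. Like the class above, the sketch sweeps the points in order rather than sampling a misclassified point at random.

```python
import numpy as np

def fit_perceptron(X, y, l_rate=1.0, max_epochs=1000):
    """Primal perceptron with cyclic sweeps; returns (w, b, converged)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(max_epochs):
        wrong_count = 0
        for x_i, y_i in zip(X, y):
            if y_i * (np.dot(w, x_i) + b) <= 0:  # misclassified point
                w = w + l_rate * y_i * x_i
                b = b + l_rate * y_i
                wrong_count += 1
        if wrong_count == 0:
            return w, b, True   # a full pass with no mistakes
    return w, b, False          # epoch budget exhausted without a clean pass

# Hypothetical toy data: two positive points and one negative point
X_toy = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y_toy = np.array([1, 1, -1])
w_toy, b_toy, converged = fit_perceptron(X_toy, y_toy)
print(w_toy, b_toy, converged)
```

The `max_epochs` cap is a design choice that keeps the loop from running forever when the data turns out not to be linearly separable.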

2. Dual Form

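In the dual form, w and b are expressed as linear combinations of the training instances: w = Σ_i α_i y_i x_i and b = Σ_i α_i y_i, where α_i grows each time instance i is misclassified and used for an update.

The steps of the dual algorithm are as follows. Initialize α = 0 and b = 0. Pick a training instance (x_i, y_i). If y_i (Σ_j α_j y_j x_j · x_i + b) ≤ 0, update α_i ← α_i + η and b ← b + η y_i. Repeat until no instance is misclassified. Because the instances enter only through inner products, these can be computed once in advance and stored in the Gram matrix G = [x_i · x_j].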

The code is as follows:

# Dual form of the perceptron

class Model2:
    def __init__(self, X):
        self.a = np.zeros(X.shape[0])  # alpha, one coefficient per training instance
        self.l = 0.1                   # learning rate
        self.g = np.dot(X, X.T)        # Gram matrix of pairwise inner products
        self.b = 0

    # Gradient descent (X_train must be the same data the Gram matrix was built from)
    def train(self, X_train, y_train):
        flag = False
        while not flag:
            wrong_count = 0
            for d in range(len(X_train)):
                y = y_train[d]
                # sum_j a[j] * y[j] * (x_j . x_d), read from the Gram matrix
                total = np.dot(self.a * y_train, self.g[:, d])

                if y * (total + self.b) <= 0:
                    self.a[d] += self.l
                    self.b += self.l * y
                    wrong_count += 1
            if wrong_count == 0:
                flag = True

        # Recover w = sum_i a[i] * y[i] * x_i so the hyperplane can be plotted
        self.w = np.dot(self.a * y_train, X_train)
        return 'Dual perceptron trained successfully!'

tic = timer()
perceptron2 = Model2(X)
perceptron2.train(X, y)

x_points = np.linspace(4, 7, 10)
print(x_points, perceptron2.b)
# Separating line w[0]*x + w[1]*y + b = 0: slope -w[0]/w[1], intercept -b/w[1]
y_ = -(perceptron2.w[0] * x_points + perceptron2.b) / perceptron2.w[1]
plt.plot(x_points, y_)

plt.scatter(df[:50]['sepal length'], df[:50]['sepal width'], label='0')
plt.scatter(df[50:100]['sepal length'], df[50:100]['sepal width'], label='1')
plt.xlabel('sepal length')
plt.ylabel('sepal width')
plt.legend()
toc = timer()
print('Elapsed time:', toc - tic)  # the timed region covers both training and plotting
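
As a small sanity check on the dual form, the sketch below runs the same update on a three-point toy dataset and recovers w from the learned coefficients. The dataset, the function name `fit_dual_perceptron`, the learning rate of 1, and the `max_epochs` cap are assumptions for illustration and are not part of the code above.

```python
import numpy as np

def fit_dual_perceptron(X, y, l_rate=1.0, max_epochs=1000):
    """Dual perceptron with cyclic sweeps; returns (alpha, b, w)."""
    n = X.shape[0]
    alpha = np.zeros(n)
    b = 0.0
    gram = X @ X.T  # gram[j, i] = x_j . x_i, computed once up front
    for _ in range(max_epochs):
        wrong_count = 0
        for i in range(n):
            # Decision value for x_i expressed through inner products only
            if y[i] * (np.dot(alpha * y, gram[:, i]) + b) <= 0:
                alpha[i] += l_rate
                b += l_rate * y[i]
                wrong_count += 1
        if wrong_count == 0:
            break  # a full pass with no mistakes
    w = np.dot(alpha * y, X)  # w = sum_i alpha_i * y_i * x_i
    return alpha, b, w

# Hypothetical toy data: two positive points and one negative point
X_toy = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y_toy = np.array([1, 1, -1])
alpha_toy, b_dual, w_dual = fit_dual_perceptron(X_toy, y_toy)
print(alpha_toy, b_dual, w_dual)
```

Each alpha_i divided by the learning rate counts how often point i triggered an update, so points that were never misclassified keep a coefficient of zero and do not contribute to w.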