Statistical Learning Methods Notes (2): The Perceptron


困宝宝是智障


Article link: https://blog.csdn.net/crayz/article/details/77857013


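Before reading this chapter I thought, "it's just the perceptron, I can skip it." After reading it carefully, though, I found several things I had not known before.

Let me first summarize the perceptron briefly within the "model, strategy, algorithm" framework introduced in the previous chapter.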

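Model

Let the input space be X ⊆ R^n (n-dimensional feature vectors) and the output space be Y = {+1, -1} (the class of an instance).

The following function from the input space to the output space, f(x) = sign(w·x + b), is called the perceptron. Here w (the weight vector) and b (the bias) are the parameters of the perceptron model.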
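As a minimal sketch of the model above, the decision function can be written in a few lines of NumPy. The function name perceptron_predict and the demo values of w, b and the points are made up for illustration; sign(0) is mapped to +1 here, which is an assumption about the convention.

```python
import numpy as np

def perceptron_predict(w, b, X):
    """Return sign(w·x + b) for each row of X, mapping a score of 0 to +1."""
    scores = X @ w + b
    return np.where(scores >= 0, 1, -1)

# Hypothetical parameters and points, chosen only for illustration.
w = np.array([1.0, -1.0])
b = -0.5
X_demo = np.array([[2.0, 0.0],   # score  1.5 -> +1
                   [0.0, 2.0],   # score -2.5 -> -1
                   [0.5, 0.0]])  # score  0.0 -> +1 (on the boundary)
labels = perceptron_predict(w, b, X_demo)
print(labels)
```

The hyperplane w·x + b = 0 splits the plane into two half-spaces, and the sign of the score tells which side a point falls on.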

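Strategy

Deriving the learning strategy here amounts to deriving the loss function.

The distance from any point x0 in the input space R^n to the decision hyperplane w·x + b = 0 is (1/||w||)·|w·x0 + b| (the point-to-line distance formula, generalized to a hyperplane).

For a misclassified sample (xi, yi) we have -yi(w·xi + b) > 0, so |w·xi + b| = -yi(w·xi + b). Summing over the set M of misclassified points, their total distance to the hyperplane is -(1/||w||)·∑_{xi∈M} yi(w·xi + b).

Dropping the factor 1/||w|| gives the perceptron loss function: L(w,b) = -∑_{xi∈M} yi(w·xi + b).

The learning strategy of the perceptron is to pick, from the hypothesis space, the model parameters w, b that minimize this loss function.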
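The loss can be evaluated directly from its definition. The sketch below is only illustrative: the function name perceptron_loss and the three-point toy set are assumptions, and points lying exactly on the hyperplane are counted as misclassified (they contribute 0 to the loss either way).

```python
import numpy as np

def perceptron_loss(w, b, X, y):
    """L(w, b) = -sum of y_i (w·x_i + b) over the misclassified points."""
    margins = y * (X @ w + b)
    misclassified = margins <= 0  # assumption: boundary points count as errors
    return float(-np.sum(margins[misclassified])), int(misclassified.sum())

# Hypothetical toy data: two positive points and one negative point.
X_toy = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y_toy = np.array([1, 1, -1])

# w = (1, 0), b = 0 misclassifies only the negative point (1, 1).
loss_bad, n_bad = perceptron_loss(np.array([1.0, 0.0]), 0.0, X_toy, y_toy)
# w = (1, 1), b = -3 separates the three points, so the loss is zero.
loss_good, n_good = perceptron_loss(np.array([1.0, 1.0]), -3.0, X_toy, y_toy)
print(loss_bad, n_bad, loss_good, n_good)
```

The loss is non-negative and reaches zero exactly when no point is misclassified, which is why minimizing it drives the hyperplane toward a separating one.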

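Algorithm

1. Primal form:

w and b are optimized with stochastic gradient descent.

With the set M of misclassified points held fixed, the gradient with respect to w is ∇w L(w,b) = -∑_{xi∈M} yi·xi

and the gradient with respect to b is ∇b L(w,b) = -∑_{xi∈M} yi

Let m be the learning rate. Stochastic gradient descent does not sum over all of M; it picks one misclassified point (xi, yi) at a time and updates:

w <- w + m·yi·xi

b <- b + m·yi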
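The update rule can be traced on a tiny example. This is a sketch under stated assumptions, not the implementation used later in this post: the name train_primal, the zero initialization, the "first misclassified point in scan order" selection rule, the max_updates cap and the toy data are all choices made here for illustration.

```python
import numpy as np

def train_primal(X, y, eta=1.0, max_updates=1000):
    """Primal perceptron: update on one misclassified point at a time."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(max_updates):
        for xi, yi in zip(X, y):
            if yi * (np.dot(w, xi) + b) <= 0:   # misclassified (or on the boundary)
                w += eta * yi * xi              # w <- w + m * yi * xi
                b += eta * yi                   # b <- b + m * yi
                break                           # rescan from the first sample
        else:
            return w, b                         # a full pass found no error
    raise RuntimeError("no separating hyperplane found within max_updates")

# Hypothetical toy data: two positive points and one negative point.
X_toy = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y_toy = np.array([1, 1, -1])
w_primal, b_primal = train_primal(X_toy, y_toy)
print(w_primal, b_primal)
```

Each update moves the hyperplane toward the misclassified point. A different selection order or initialization can end at a different, equally valid separating hyperplane.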

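2. Dual form:

The dual form follows from the primal algorithm: the original model f(x) = sign(w·x + b) is rewritten as the equivalent f(x) = sign(∑j αj·yj·xj·x + b).

Here αj = nj·m, where nj is the number of times sample xj was misclassified (and therefore triggered an update) during the iteration, assuming w and b start at 0; with m = 1, αj is simply that count. For example, α1 counts the updates triggered by x1. Solving for w thus becomes solving for α, with w = ∑j αj·yj·xj and b = ∑j αj·yj.

To speed up the computation, the inner products xi·xj are precomputed and stored in a Gram matrix, since the training samples enter the dual form only through these inner products.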
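The dual form is not implemented in this post, so here is a minimal sketch of it, assuming zero initialization and the same "first misclassified point" selection rule; the name train_dual, the max_updates cap and the toy data are hypothetical.

```python
import numpy as np

def train_dual(X, y, eta=1.0, max_updates=1000):
    """Dual perceptron: learn alpha and b using only inner products."""
    n = len(X)
    gram = X @ X.T                  # gram[j, i] = xj · xi, computed once
    alpha = np.zeros(n)
    b = 0.0
    for _ in range(max_updates):
        for i in range(n):
            score = np.sum(alpha * y * gram[:, i]) + b
            if y[i] * score <= 0:   # xi is misclassified
                alpha[i] += eta     # alpha_i <- alpha_i + m
                b += eta * y[i]     # b <- b + m * yi
                break
        else:
            return alpha, b         # a full pass found no error
    raise RuntimeError("no separating hyperplane found within max_updates")

# Hypothetical toy data: two positive points and one negative point.
X_toy = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y_toy = np.array([1, 1, -1])
alpha, b_dual = train_dual(X_toy, y_toy)
w_dual = (alpha * y_toy) @ X_toy    # recover w = sum_j alpha_j * yj * xj
print(alpha, b_dual, w_dual)
```

With eta = 1 each alpha entry is the number of updates its sample triggered, and the recovered w matches what the primal updates would produce under the same selection order.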

Implementation:

This post implements a primal-form perceptron in Python, using the Iris dataset.

Libraries used:

import numpy as np
import pandas as pd  # pandas is used to read the data
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

The perceptron class:

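class Perceptron(object):
    def __init__(self, eta):
        self.eta = eta  # learning rate
        # _w[0] is the bias b, _w[1:] are the weights of the two features
        self._w = np.random.random(size=3) * 10

    def net_input(self, X):
        """Compute the net input w·x + b."""
        return np.dot(X, self._w[1:]) + self._w[0]

    def predict(self, X):
        """Return the class label (+1 or -1); also works on a whole matrix, which the plotting code uses."""
        return np.where(self.net_input(X) >= 0.0, 1, -1)

    def fit(self, X, y):
        # Repeat until a full pass finds no misclassified sample.
        # This loop only terminates if the data are linearly separable.
        while True:
            exist_error = False
            for xi, yi in zip(X, y):
                if yi != self.predict(xi):
                    # update on the first misclassified sample, then rescan
                    self._w[1:] += self.eta * yi * xi
                    self._w[0] += self.eta * yi
                    exist_error = True
                    break
            if not exist_error:
                return self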

Main program (reads the dataset and draws the plots):

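df = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data',header=None)  # the data could also be fetched with the requests package
print(df.tail())  # print the last five rows to check the format of the Iris dataset
 
"""Take the first 100 samples, which are exactly the Setosa and Versicolor samples.
Versicolor is labeled 1 and Setosa is labeled -1. As features we take the two
dimensions sepal length and petal length, then visualize the data in a scatter plot."""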
 
y = df.iloc[0:100,4].values
y = np.where(y == 'Iris-setosa',-1,1)
X = df.iloc[0:100,[0,2]].values  # column 0: sepal length, column 2: petal length
plt.scatter(X[:50,0],X[:50,1],color = 'red',marker='o',label='setosa')
plt.scatter(X[50:100,0],X[50:100,1],color='blue',marker='x',label='versicolor')
plt.xlabel('sepal length [cm]')
plt.ylabel('petal length [cm]')
plt.legend(loc='upper left')
plt.show()
 
# Train the perceptron model.
# fit() returns only once no training sample is misclassified, so it
# terminates here because these two classes are linearly separable.
ppn=Perceptron(eta=0.1)
ppn.fit(X,y)

print(ppn._w)
# plot the decision boundary (the separating hyperplane)
def plot_decision_region(X,y,classifier,resolution=0.02):
    #setup marker generator and color map
    markers=('s','x','o','^','v')
    colors=('red','blue','lightgreen','gray','cyan')
    cmap=ListedColormap(colors[:len(np.unique(y))])
     
    #plot the decision surface
    x1_min,x1_max=X[:,0].min()-1,X[:,0].max()+1
    x2_min,x2_max=X[:,1].min()-1,X[:,1].max()+1              
     
    xx1,xx2=np.meshgrid(np.arange(x1_min,x1_max,resolution),
                        np.arange(x2_min,x2_max,resolution))
    Z=classifier.predict(np.array([xx1.ravel(),xx2.ravel()]).T)
    Z=Z.reshape(xx1.shape)
     
    plt.contour(xx1,xx2,Z,alpha=0.4,cmap=cmap)
    plt.xlim(xx1.min(),xx1.max())
    plt.ylim(xx2.min(),xx2.max())
     
    #plot class samples
    for idx,cl in enumerate(np.unique(y)):
        plt.scatter(x=X[y==cl,0],y=X[y==cl,1],alpha=0.8,color=cmap(idx), marker=markers[idx],label=cl)
 
plot_decision_region(X,y,classifier=ppn)
plt.xlabel('sepal length [cm]')
plt.ylabel('petal length [cm]')
plt.legend(loc='upper left')
plt.show()
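To watch the training converge, one option is an epoch-based variant that records how many updates each pass over the data needs. This is a sketch and not the Perceptron class above: the name fit_with_history, the zero initialization, the max_epochs cap and the toy data are assumptions made for illustration.

```python
import numpy as np

def fit_with_history(X, y, eta=1.0, max_epochs=100):
    """Epoch-based perceptron that records the number of updates per epoch."""
    w = np.zeros(X.shape[1])
    b = 0.0
    errors_per_epoch = []
    for _ in range(max_epochs):
        errors = 0
        for xi, yi in zip(X, y):
            if yi * (np.dot(w, xi) + b) <= 0:   # misclassified sample
                w += eta * yi * xi
                b += eta * yi
                errors += 1
        errors_per_epoch.append(errors)
        if errors == 0:                         # a clean pass: converged
            break
    return w, b, errors_per_epoch

# Hypothetical toy data: two positive points and one negative point.
X_toy = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y_toy = np.array([1, 1, -1])
w_hist, b_hist, history = fit_with_history(X_toy, y_toy)
print(history)
```

Passing the returned list to plt.plot gives a curve of updates per epoch; on linearly separable data it ends at 0, while on non-separable data the loop stops at max_epochs without ever reaching 0.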
