Machine Learning Notes 1: Perceptron

Latest recommended article published on 2022-10-21 16:10:56

我有两颗糖


Category: Machine Learning

Article link: https://blog.csdn.net/qq_41140138/article/details/118561067


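1. Perceptron

Model
The perceptron is a linear model for binary classification that assigns a class to an input instance based on its feature vector $x$: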

$f(x)=\operatorname{sign}(w \cdot x+b)$

The perceptron model corresponds to the separating hyperplane $w \cdot x+b=0$ in the input space (feature space).

Strategy
The learning strategy of the perceptron is to minimize the loss function:

$\min _{w, b} L(w, b)=-\sum_{x_{i} \in M} y_{i}\left(w \cdot x_{i}+b\right)$

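Here $M$ is the set of misclassified points. The loss function corresponds to the total distance from the misclassified points to the separating hyperplane, with the factor $1/\|w\|$ dropped.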

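Method
The perceptron learning algorithm optimizes the loss function by stochastic gradient descent, and it has a primal form and a dual form. When the training set is linearly separable, the algorithm converges. In that case there are infinitely many solutions, and the one that is found may differ with the initial values or with the order in which points are visited.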
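The dual form is only mentioned above, so here is a minimal sketch of it under the usual textbook formulation, which is an assumption and not taken from this note: each sample gets a counter `alpha[i]`, the weights are recovered as $w=\sum_i \alpha_i y_i x_i$, and inner products are read from a precomputed Gram matrix. The function name `fit_dual`, the `max_epochs` guard, and the three sample points are hypothetical choices for illustration.

```python
import numpy as np

def fit_dual(X, y, lr=1.0, max_epochs=1000):
    """Dual-form perceptron sketch: learn one coefficient alpha[i] per sample."""
    n = len(X)
    alpha = np.zeros(n)
    b = 0.0
    gram = X @ X.T  # gram[j, i] = x_j . x_i, computed once
    for _ in range(max_epochs):
        mistakes = 0
        for i in range(n):
            # w . x_i written as sum_j alpha_j * y_j * (x_j . x_i)
            score = np.sum(alpha * y * gram[:, i]) + b
            if y[i] * score <= 0:  # misclassified (or on the boundary)
                alpha[i] += lr
                b += lr * y[i]
                mistakes += 1
        if mistakes == 0:  # a full pass without errors: stop
            break
    w = (alpha * y) @ X  # recover the primal weights
    return w, b, alpha

# Illustrative, linearly separable toy data (not from the iris set)
X_toy = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y_toy = np.array([1, 1, -1])
w_toy, b_toy, alpha_toy = fit_dual(X_toy, y_toy)
print(w_toy, b_toy, alpha_toy)
```

Visiting the points in a different order can give a different `alpha`, and therefore a different hyperplane, which is the non-uniqueness described above.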

2. Binary classification model

Model
$f(x)=\operatorname{sign}(w \cdot x+b)$

$\operatorname{sign}(x)=\left\{\begin{array}{ll}{+1,} & {x \geqslant 0} \\ {-1,} & {x<0}\end{array}\right.$
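As a small illustration of the decision function, the sketch below evaluates $f(x)$ for a few points. The helper name `predict`, the weights `w`, the bias `b`, and the sample points are made-up values for illustration and do not come from this note.

```python
import numpy as np

def predict(X, w, b):
    """Return sign(w . x + b) for each row of X, with sign(0) = +1."""
    scores = X @ w + b
    return np.where(scores >= 0, 1, -1)

# Hypothetical parameters and points, chosen only for illustration
w = np.array([1.0, -1.0])
b = -0.5
X_demo = np.array([
    [2.0, 1.0],   # score  0.5 -> +1
    [1.0, 2.0],   # score -1.5 -> -1
    [1.5, 1.0],   # score  0.0 -> +1 (the boundary counts as +1)
])
print(predict(X_demo, w, b))
```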

Strategy
Given the training set:

$T=\left\{\left(x_{1}, y_{1}\right),\left(x_{2}, y_{2}\right), \cdots,\left(x_{N}, y_{N}\right)\right\}$

Define the perceptron loss function $L(w, b)$, where $M$ is the set of misclassified points:

$L(w, b)=-\sum_{x_{i} \in M} y_{i}\left(w \cdot x_{i}+b\right)$
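To make the loss concrete, the following sketch computes $L(w, b)$ on three toy points. The function name `perceptron_loss` and the data are assumptions made for illustration; a point is treated as misclassified when $y_i(w \cdot x_i+b) \le 0$.

```python
import numpy as np

def perceptron_loss(X, y, w, b):
    """L(w, b) = -sum of y_i * (w . x_i + b) over the misclassified points."""
    margins = y * (X @ w + b)
    misclassified = margins <= 0  # the set M
    return float(np.sum(-margins[misclassified]))

# Illustrative toy data (not from the iris set)
X_demo = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y_demo = np.array([1, 1, -1])

# With b = -1 the third point is on the wrong side, so the loss is positive
loss_bad = perceptron_loss(X_demo, y_demo, np.array([1.0, 1.0]), -1.0)
# With b = -3 every point is classified correctly, so M is empty
loss_good = perceptron_loss(X_demo, y_demo, np.array([1.0, 1.0]), -3.0)
print(loss_bad, loss_good)
```

The loss is zero exactly when no point is misclassified, which is why the training loop can stop as soon as a full pass finds no errors.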

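Algorithm

Stochastic gradient descent (SGD)

Pick a misclassified point $(x_i, y_i)$ at random and take one gradient descent step on it with learning rate $\eta$: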

$\left\{\begin{array}{ll}{w = w + \eta y_{i}x_{i}} \\ {b = b + \eta y_{i}}\end{array}\right.$

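When an instance is misclassified, that is, when it lies on the wrong side of the separating hyperplane, $w$ and $b$ are adjusted so that the hyperplane moves toward that misclassified point, until the point is classified correctly.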
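A single update can be sketched as follows. The function name `sgd_step`, the starting values, and the sample point are hypothetical; the update itself is the rule stated above.

```python
import numpy as np

def sgd_step(w, b, x_i, y_i, lr=1.0):
    """Apply one perceptron update if (x_i, y_i) is misclassified."""
    if y_i * (np.dot(w, x_i) + b) <= 0:
        w = w + lr * y_i * x_i
        b = b + lr * y_i
    return w, b

# Start from zero parameters; the point (3, 3) with label +1 has margin 0
w0, b0 = np.zeros(2), 0.0
x_i, y_i = np.array([3.0, 3.0]), 1

w1, b1 = sgd_step(w0, b0, x_i, y_i)  # misclassified, so w and b change
w2, b2 = sgd_step(w1, b1, x_i, y_i)  # now correct, so nothing changes
print(w1, b1)
print(w2, b2)
```

After the first step the point has a positive margin, so the second call leaves the parameters untouched.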

3. Implementing the perceptron

First, import the libraries we need:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

3.1 Dataset preprocessing

We can use the iris dataset that ships with sklearn, as follows:

iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['label'] = iris.target	# add a label column
df.columns = [
	'sepal length',
	'sepal width',
	'petal length',
	'petal width',
	'label'
]
print(df)
print(df.shape)	# (150, 5)
# Count how many samples each label value has
print(df.label.value_counts())

# Draw a scatter plot of the three classes
plt.scatter(df[:50]['sepal length'], 
	df[:50]['sepal width'], label='0')
plt.scatter(df[50:100]['sepal length'], 
	df[50:100]['sepal width'], label='1')
plt.scatter(df[100:150]['sepal length'],
	df[100:150]['sepal width'], label='2')
plt.xlabel('sepal length')
plt.ylabel('sepal width')
plt.legend()
plt.show()

The output is:

     sepal length  sepal width  petal length  petal width  label
0             5.1          3.5           1.4          0.2      0
1             4.9          3.0           1.4          0.2      0
2             4.7          3.2           1.3          0.2      0
..            ...          ...           ...          ...    ...
147           6.5          3.0           5.2          2.0      2
148           6.2          3.4           5.4          2.3      2
149           5.9          3.0           5.1          1.8      2
[150 rows x 5 columns]

(150, 5)

0    50
1    50
2    50
Name: label, dtype: int64

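The code above also produces a scatter plot of sepal length against sepal width for the three classes.

From this data we take the training inputs X and labels y: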

# First 100 rows (labels 0 and 1): sepal length, sepal width, label
data = np.array(df.iloc[:100, [0, 1, -1]])
X, y = data[:, :-1], data[:, -1]
y = np.where(y == 1, 1, -1)	# map label 1 to +1 and label 0 to -1

We now have 100 iris samples, each with two features, sepal length and sepal width, and two classes, labeled +1 and -1.

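3.2 Building the model

Following the theory above, we can build a binary classifier with stochastic gradient descent, where fit() is the training function: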

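class Model:
	def __init__(self, lr=0.1):
		self.W = None	# weight vector, created in fit()
		self.b = 0.0	# bias
		self.lr = lr	# learning rate (eta)

	def sign(self, X, W, b):
		# sign(w . x + b), with sign(0) = +1 as defined in section 2
		y = np.dot(X, W) + b
		return 1 if y >= 0 else -1

	def fit(self, X_train, y_train):
		# Start from zero weights, one per feature
		self.W = np.zeros(X_train.shape[1], dtype=np.float64)
		success = False
		# Repeat passes until a full pass has no misclassified point;
		# this terminates only if the data is linearly separable
		while not success:
			wrong_count = 0
			for i in range(len(X_train)):
				X = X_train[i]
				y = y_train[i]
				# A point is misclassified when y * (w . x + b) <= 0
				if y * (np.dot(X, self.W) + self.b) <= 0:
					self.W += self.lr * y * X
					self.b += self.lr * y
					wrong_count += 1
			if wrong_count == 0:
				success = True
		return 'Perceptron Model'

	def score(self):
		# Not implemented in this note
		pass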
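The score() method is left empty in this note. One common choice for such a method is classification accuracy; the sketch below shows that idea as a standalone function. The name `accuracy`, its signature, and the demo values are assumptions for illustration, not the author's implementation.

```python
import numpy as np

def accuracy(W, b, X, y):
    """Fraction of samples whose predicted sign matches the label in {+1, -1}."""
    pred = np.where(X @ W + b >= 0, 1, -1)
    return float(np.mean(pred == y))

# Hypothetical parameters and labeled points, chosen only for illustration
W_demo = np.array([1.0, 1.0])
b_demo = -3.0
X_demo = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0], [0.0, 4.0]])
y_demo = np.array([1, 1, -1, -1])

# Scores are 3, 4, -1, 1, so only the last point is predicted wrongly
acc = accuracy(W_demo, b_demo, X_demo, y_demo)
print(acc)
```

On training data that a converged perceptron separates, this value would be 1.0, so accuracy is more informative on held-out samples.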

3.3 Classification results

Train the model defined above on the data, then plot the resulting decision boundary:

perceptron = Model()
res = perceptron.fit(X, y)
print(res)	# prints the string returned by fit()

# Decision boundary: W[0] * x1 + W[1] * x2 + b = 0, solved for x2
x_ = np.linspace(4, 7, 10)
y_ = -(perceptron.W[0] * x_ + perceptron.b) / perceptron.W[1]
plt.plot(x_, y_)

# 'o' draws markers only; the color comes from the keyword argument
plt.plot(data[:50, 0], data[:50, 1], 'o', color='blue', label='0')
plt.plot(data[50:100, 0], data[50:100, 1], 'o', color='orange', label='1')
plt.xlabel('sepal length')
plt.ylabel('sepal width')
plt.legend()
plt.show()
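The line plotted by the code above comes from solving $w_0 x_1 + w_1 x_2 + b = 0$ for $x_2$. The sketch below checks that relation numerically. The function name `boundary_x2` and the parameter values are hypothetical, and the formula assumes that the second weight is nonzero.

```python
import numpy as np

def boundary_x2(W, b, x1):
    """Solve W[0] * x1 + W[1] * x2 + b = 0 for x2 (requires W[1] != 0)."""
    return -(W[0] * x1 + b) / W[1]

# Hypothetical parameters, chosen only for illustration
W_demo = np.array([2.0, -4.0])
b_demo = 1.0

x1 = np.linspace(4, 7, 10)
x2 = boundary_x2(W_demo, b_demo, x1)

# Every (x1, x2) pair on the line should give w . x + b = 0
residual = W_demo[0] * x1 + W_demo[1] * x2 + b_demo
print(np.allclose(residual, 0.0))
```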

The resulting plot shows the fitted line together with the two classes.

4. sklearn

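sklearn's Perceptron model is a classifier based on stochastic gradient descent (SGD); its parameters are listed in the sklearn documentation.

The code below uses sklearn's Perceptron model: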

from sklearn.linear_model import Perceptron

clf = Perceptron(
	fit_intercept=True,
	max_iter=1000,
	shuffle=True,
	tol=None)
clf.fit(X, y)

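W = clf.coef_[0]	# weight vector, shape (2,)
b = clf.intercept_[0]	# scalar bias

# Decision boundary: W[0] * x1 + W[1] * x2 + b = 0, solved for x2
x_ = np.linspace(4, 7, 10)
y_ = -(W[0] * x_ + b) / W[1]
plt.plot(x_, y_)

# 'o' draws markers only; the color comes from the keyword argument
plt.plot(data[:50, 0], data[:50, 1], 'o', color='blue', label='0')
plt.plot(data[50:100, 0], data[50:100, 1], 'o', color='orange', label='1')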
plt.xlabel('sepal length')
plt.ylabel('sepal width')
plt.legend()
plt.show()