Statistical Machine Learning 02: Perceptron

Notes on 《统计学习方法》 (Statistical Learning Methods): The Perceptron

Preface

Content List

  • 1. The Perceptron Model

  • 2. Perceptron Learning Strategy

  • 3. Perceptron Learning Algorithm

    • 3.1 Original Form

    • 3.2 Convergence of the Algorithm

    • 3.3 Dual Form

  • 4. Summary

1. The Perceptron Model

Prerequisites for Applying the Model

For a perceptron to be applicable, the problem must first have a linearly separable feature space, and it must be a binary classification problem, i.e. the samples are divided into the two classes {+1, -1}.

The function from the input space to the output space is the perceptron function:

$$
f(x) = \mathrm{sign}(w \cdot x + b)
$$

where $\mathrm{sign}(\cdot)$ returns $+1$ for a non-negative argument and $-1$ otherwise.

Here $w$ and $b$ are the model parameters: $w$ is the weight vector (weight) and $b$ is the bias.

<p align="center">图1</p>

The hypothesis space of the perceptron model is the set of all linear classification models defined on the feature space, i.e. the function set $$ \{ f \mid f(x) = w \cdot x + b \} $$
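
As a minimal illustration (not from the original notes), the decision function can be written in a few lines of NumPy; the function name `perceptron_predict` and the example hyperplane below are made up for the sketch:

```python
import numpy as np

def perceptron_predict(x, w, b):
    """Perceptron decision function f(x) = sign(w·x + b).

    x and w are length-n vectors, b is a scalar; returns +1 or -1
    (sign(0) is mapped to +1 here, matching sign(x) = +1 for x >= 0).
    """
    return 1 if np.dot(w, x) + b >= 0 else -1

# Example: classify two points against the hyperplane x1 + x2 - 1 = 0
w, b = np.array([1.0, 1.0]), -1.0
print(perceptron_predict(np.array([2.0, 0.5]), w, b))   # +1 (above the line)
print(perceptron_predict(np.array([-1.0, 0.0]), w, b))  # -1 (below the line)
```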

In the perceptron's definition, the linear equation $w \cdot x + b = 0$ corresponds to a hyperplane $S$ in the feature space, and samples lying on the two sides of this hyperplane are assigned to the two classes. In Figure 1 below, for example, the red points form one class and the blue points the other; their features are simply their coordinates.

<p align="center">11142837-31e4844d63c2478e8f978af1ebd59512.png

As a supervised learning method, perceptron learning derives the perceptron model from a training set, i.e. it estimates the model parameters $w$ and $b$; here $x$ and $y$ denote the feature vector and the class label (also called the target). The learned model can then classify new input samples.

2. Perceptron Learning Strategy

The perceptron is a simple linear model for binary classification, and it requires the samples to be linearly separable. What does that mean? For example, in a two-dimensional plane, if a single straight line can perfectly separate the +1 class from the -1 class, then that sample space is linearly separable. Figure 1 is linearly separable, while the samples in Figure 2 are not, and the perceptron cannot handle that case. Every problem in this chapter therefore rests on the premise that the problem space is linearly separable.

<p align="center">11142837-b7136dbe52314559b5db681519065d4c.png

To set things up, assume that for the dataset
$$
T = \{ (x_1, y_1), (x_2, y_2), \dots, (x_N, y_N) \}
$$
the separating hyperplane satisfies $w \cdot x_i + b > 0$ for every instance $i$ with $y_i = +1$, and $w \cdot x_i + b < 0$ for every instance $i$ with $y_i = -1$.
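
A small sketch of this condition, assuming a toy dataset and a candidate $(w, b)$ chosen only for illustration: the check reduces to $y_i (w \cdot x_i + b) > 0$ for every sample, which is equivalent to the two signed conditions above.

```python
import numpy as np

def separates(w, b, X, y):
    """True if w·x_i + b > 0 whenever y_i = +1 and < 0 whenever y_i = -1,
    i.e. y_i * (w·x_i + b) > 0 for every sample."""
    return bool(np.all(y * (X @ w + b) > 0))

# Toy data: two positive and two negative points in the plane
X = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0], [0.5, 2.0]])
y = np.array([1, 1, -1, -1])
print(separates(np.array([1.0, 1.0]), -3.0, X, y))  # True:  x1 + x2 - 3 = 0 separates them
print(separates(np.array([1.0, 0.0]),  0.0, X, y))  # False: x1 > 0 for every point
```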

First, the distance from an arbitrary point $x_0$ in the input space $R^n$ to the hyperplane $S$ is:

$$
\frac{1}{\|w\|} \, |w \cdot x_0 + b|
$$

Here $\|w\|$ is the $L_2$ norm of $w$.
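
A quick numeric check of this formula (the point and hyperplane below are arbitrary examples, not from the original notes):

```python
import numpy as np

def distance_to_hyperplane(x0, w, b):
    """Distance from x0 to the hyperplane w·x + b = 0: |w·x0 + b| / ||w||_2."""
    return abs(np.dot(w, x0) + b) / np.linalg.norm(w)

# Distance from the origin to the line x1 + x2 - 2 = 0 is |-2| / sqrt(2) = sqrt(2) ≈ 1.414
print(distance_to_hyperplane(np.array([0.0, 0.0]), np.array([1.0, 1.0]), -2.0))
```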

For a misclassified sample $(x_i, y_i)$, the assumption above gives

$$
-y_i (w \cdot x_i + b) > 0
$$

so the distance from a misclassified point to the hyperplane $S$ can be written as:

$$
-\frac{1}{\|w\|} y_i (w \cdot x_i + b)
$$

Let $M$ be the set of points misclassified by the hyperplane $S$. The total distance from all misclassified points to $S$ is then:

$$
-\frac{1}{\|w\|} \sum_{x_i \in M} y_i (w \cdot x_i + b)
$$
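
Dropping the constant factor $\frac{1}{\|w\|}$, the book takes $L(w, b) = -\sum_{x_i \in M} y_i (w \cdot x_i + b)$ as the loss function that perceptron learning minimizes; Section 3 turns this into an algorithm. As a rough preview, here is a hedged sketch assuming stochastic gradient descent with learning rate $\eta = 1$ on a toy dataset similar to the book's worked example; `perceptron_loss` and `sgd_step` are illustrative names, not the book's notation:

```python
import numpy as np

def perceptron_loss(w, b, X, y):
    """L(w, b) = -sum over misclassified points of y_i * (w·x_i + b)."""
    margins = y * (X @ w + b)
    return float(np.sum(np.maximum(0.0, -margins)))

def sgd_step(w, b, x_i, y_i, eta=1.0):
    """Update on one misclassified point: w <- w + eta*y_i*x_i,  b <- b + eta*y_i."""
    return w + eta * y_i * x_i, b + eta * y_i

# Toy run: positive points (3,3), (4,3) and negative point (1,1)
X = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y = np.array([1, 1, -1])
w, b = np.zeros(2), 0.0
updated = True
while updated:                                   # repeat passes until no point is misclassified
    updated = False
    for x_i, y_i in zip(X, y):
        if y_i * (np.dot(w, x_i) + b) <= 0:      # misclassified (or exactly on the hyperplane)
            w, b = sgd_step(w, b, x_i, y_i)
            updated = True
print(w, b, perceptron_loss(w, b, X, y))         # here: [1. 1.] -3.0 0.0
```

Note that the loop checks the misclassification condition $y_i (w \cdot x_i + b) \le 0$ directly rather than testing the loss, since $L$ can be zero while points still lie exactly on the hyperplane (e.g. at the initial $w = 0$, $b = 0$).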

3. Perceptron Learning Algorithm

3.1 Original Form

3.2 Convergence of the Algorithm

3.3 Dual Form

4. Summary
