Python Perceptron
At HSR, I’m currently enrolled in a course about neural networks and machine learning. One of the simplest forms of a neural network model is the perceptron.
Background Information
A perceptron classifier is a simple model of a neuron. It has different inputs ($x_1$…$x_n$) with different weights ($w_1$…$w_n$).
The weighted sum $s$ of these inputs is then passed through a step function $f$ (usually a Heaviside step function).
To make things cleaner, here's the computation written out:
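$$s = \sum_{i=1}^{n} w_i x_i, \qquad f(s) = \begin{cases} 0 & \text{if } s < 0 \\ 1 & \text{if } s \geq 0 \end{cases}$$

This is the standard perceptron formulation and matches the `unit_step` function defined below.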
Python!
Here’s a simple version of such a perceptron using Python and NumPy. It will take two inputs and learn to act like the logical OR function. First, let’s import some libraries we need:
```python
from random import choice
from numpy import array, dot, random
```
Then let’s create the step function. In reference to Mathematica, I’ll call this function unit_step.
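```python
unit_step = lambda x: 0 if x < 0 else 1
```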
Next we need to map the possible input to the expected output. The first two entries of the NumPy array in each tuple are the two input values. The second element of the tuple is the expected result. And the third entry of the array is a “dummy” input (also called the bias) which is needed to move the threshold (also known as the decision boundary) up or down as needed by the step function. Its value is always 1, so that its influence on the result can be controlled by its weight.
```python
training_data = [
    (array([0, 0, 1]), 0),
    (array([0, 1, 1]), 1),
    (array([1, 0, 1]), 1),
    (array([1, 1, 1]), 1),
]
```
As you can see, this training sequence maps exactly to the definition of the OR function:
| A | B | A OR B |
|---|---|--------|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 1 |
Next we’ll choose three random numbers between 0 and 1 as the initial weights:
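```python
w = random.rand(3)
```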
Now on to some variable initializations. The errors list is only used to store the error values so that they can be plotted later on; if you don't want to do any plotting, you can just leave it out. The eta variable controls the learning rate, and n specifies the number of learning iterations.
```python
errors = []
eta = 0.2
n = 100
```
In order to find the ideal values for the weights w, we try to reduce the error magnitude to zero. In this simple case n = 100 iterations are enough; for a bigger and possibly “noisier” set of input data much larger numbers should be used.
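```python
for i in xrange(n):
    # pick a random sample from the training data
    x, expected = choice(training_data)
    # weighted sum of inputs and weights
    result = dot(w, x)
    # compare the thresholded result with the expected value
    error = expected - unit_step(result)
    errors.append(error)
    # nudge the weights in the direction that reduces the error
    w += eta * error * x
```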
First we get a random input set from the training data. Then we calculate the dot product (sometimes also called scalar product or inner product) of the input and weight vectors. This is our (scalar) result, which we can compare to the expected value. If the expected value is bigger, we need to increase the weights; if it's smaller, we need to decrease them. This correction factor is calculated in the last line, where the error is multiplied with the learning rate (eta) and the input vector (x). It is then added to the weights vector, in order to improve the results in the next iteration.
And that’s already everything we need in order to train the perceptron! It has now “learned” to act like a logical OR function:
```python
for x, _ in training_data:
    result = dot(x, w)
    print("{}: {} -> {}".format(x[:2], result, unit_step(result)))
```
```
[0 0]: -0.0714566687173 -> 0
[0 1]: 0.829739696273 -> 1
[1 0]: 0.345454042997 -> 1
[1 1]: 1.24665040799 -> 1
```
If you’re interested, you can also plot the errors, which is a great way to visualize the learning process:
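The plotting code isn't reproduced here; a minimal sketch using matplotlib (an assumption on my part, any plotting library works) could look like this:

```python
import matplotlib.pyplot as plt

plt.plot(errors)        # one error value (-1, 0 or 1) per iteration
plt.ylim(-1.5, 1.5)
plt.xlabel("iteration")
plt.ylabel("error")
plt.show()
```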
It’s easy to see that the errors stabilize around the 60th iteration. If you’re not convinced that the errors are gone for good, you can re-run the training with an iteration count of 500 or more and plot the errors:
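Concretely, such a re-run is just the same training loop with a fresh weight vector and a larger n (a sketch reusing the names defined above):

```python
# Sketch: retrain from scratch with more iterations, then plot again
n = 500
w = random.rand(3)
errors = []
for i in xrange(n):
    x, expected = choice(training_data)
    error = expected - unit_step(dot(w, x))
    errors.append(error)
    w += eta * error * x

plt.plot(errors)
plt.ylim(-1.5, 1.5)
plt.show()
```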
You could also try to change the training sequence in order to model an AND, NOR or NOT function. Note that it’s not possible to model an XOR function using a single perceptron like this, because the two classes (0 and 1) of an XOR function are not linearly separable. In that case you would have to use multiple layers of perceptrons (which is basically a small neural network).
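For example, AND uses the same inputs with different expected outputs; a hypothetical training sequence for it would be:

```python
# Training sequence for a logical AND (only the expected outputs change):
training_data = [
    (array([0, 0, 1]), 0),
    (array([0, 1, 1]), 0),
    (array([1, 0, 1]), 0),
    (array([1, 1, 1]), 1),
]
```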
Wrap Up
Here’s the entire code:
```python
from random import choice
from numpy import array, dot, random

unit_step = lambda x: 0 if x < 0 else 1

training_data = [
    (array([0, 0, 1]), 0),
    (array([0, 1, 1]), 1),
    (array([1, 0, 1]), 1),
    (array([1, 1, 1]), 1),
]

w = random.rand(3)
errors = []
eta = 0.2
n = 100

for i in xrange(n):
    x, expected = choice(training_data)
    result = dot(w, x)
    error = expected - unit_step(result)
    errors.append(error)
    w += eta * error * x

for x, _ in training_data:
    result = dot(x, w)
    print("{}: {} -> {}".format(x[:2], result, unit_step(result)))
```
If you have any questions, or if you’ve discovered an error (which is easily possible as I’ve just learned about this stuff), feel free to leave a comment below.
Translated from: https://www.pybloggers.com/2013/03/programming-a-perceptron-in-python/