Adaptive Linear Element (ADALINE)

Adaptive Linear Element (ADALINE) vs. Perceptron



When the problem is not linearly separable, the perceptron fails to converge.
ADALINE overcomes this difficulty by finding a best-fit approximation to the target.

We have training pairs (X(k), d(k)), k = 1, 2, …, K, where K is the number of training samples. The training error measures the difference between the outputs of the ADALINE and the desired targets:

E(W) = (1/2) Σ_{k=1}^{K} (d(k) − y(k))²,  where y(k) = W^T X(k)

The smaller E(W) is, the closer the approximation.
We need to find W, based on the given training set, that minimizes the error E(W)
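
To make the objective concrete, here is a minimal sketch in Python/NumPy of computing E(W); the array names X, d, W and the sample values are illustrative, not from the original text.

import numpy as np

def training_error(W, X, d):
    """E(W) = 1/2 * sum_k (d(k) - W^T X(k))^2 over the K training samples."""
    y = X @ W                         # linear outputs y(k) = W^T X(k)
    return 0.5 * np.sum((d - y) ** 2)

# K = 3 samples; the first input column is a constant 1 acting as the bias input
X = np.array([[1.0,  0.5, -0.2],
              [1.0, -1.0,  0.3],
              [1.0,  0.8,  0.9]])
d = np.array([0.4, -0.6, 1.0])        # desired targets d(k)
W = np.array([0.1, -0.1, 0.05])       # candidate weights
print(training_error(W, X, d))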


The Gradient Descent Rule


The gradient of E is a vector whose components are the partial derivatives of E with respect to each of the weights wi:

∇E(W) = [∂E/∂w0, ∂E/∂w1, …, ∂E/∂wn]^T

The gradient specifies the direction of steepest increase in E; the negative of the gradient gives the direction of steepest decrease.
The gradient descent training rule is

W ← W − η ∇E(W),  i.e.,  Δwi = −η ∂E/∂wi

where η is the learning rate.
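
As an illustrative one-dimensional example (not from the original text): for E(w) = (w − 2)² we have ∂E/∂w = 2(w − 2); starting from w = 0 with η = 0.2, one step gives w ← 0 − 0.2 × (−4) = 0.8, which moves w toward the minimum at w = 2.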


ADALINE weight updating using the gradient descent rule

For the sum-of-squared-errors E(W) defined above, the partial derivatives are

∂E/∂wi = −Σ_{k=1}^{K} (d(k) − y(k)) xi(k)

so the weight update becomes

Δwi = η Σ_{k=1}^{K} (d(k) − y(k)) xi(k)
Gradient descent training procedure

Initialise each wi to a small value, e.g., in the range (−1, 1), and choose a learning rate, e.g., η = 0.2

Until the termination condition is met, do:
compute the output y(k) = W^T X(k) for every training sample;
compute Δwi = η Σ_{k=1}^{K} (d(k) − y(k)) xi(k) for each weight;
update wi ← wi + Δwi.
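
A minimal sketch of this batch procedure in Python/NumPy; the function and parameter names (train_adaline_batch, eta, tol, max_epochs) and the error threshold are illustrative assumptions, not from the original text.

import numpy as np

def train_adaline_batch(X, d, eta=0.2, max_epochs=1000, tol=1e-4, seed=0):
    """Batch gradient descent: one weight update per pass over all K samples."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1.0, 1.0, size=X.shape[1])  # small random initial weights
    for epoch in range(max_epochs):
        y = X @ W                      # y(k) = W^T X(k) for all K samples at once
        err = d - y                    # per-sample errors d(k) - y(k)
        E = 0.5 * np.sum(err ** 2)     # training error E(W)
        if E < tol:                    # terminate when the error is small enough
            break
        W += eta * (X.T @ err)         # delta wi = eta * sum_k (d(k)-y(k)) * xi(k)
    return W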

Stochastic (Incremental) Gradient Descent

Also called the online mode, Least Mean Square (LMS), Widrow-Hoff, or Delta Rule

Initialise each wi to a small value, e.g., in the range (−1, 1), and choose a learning rate, e.g., η = 0.01 (should be smaller than in batch mode)
Until the termination condition is met, do:
for each training sample (X(k), d(k)), compute y(k) = W^T X(k) and immediately update wi ← wi + η (d(k) − y(k)) xi(k).
(Batch mode vs. online mode: batch mode updates W once after the whole training set has been processed; online mode updates W once for every training sample.)
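
For contrast, a minimal sketch of the online (LMS) variant, under the same illustrative naming assumptions as the batch sketch above; note that it also reshuffles the presentation order every epoch:

import numpy as np

def train_adaline_online(X, d, eta=0.01, max_epochs=1000, tol=1e-4, seed=0):
    """Online (stochastic) mode: W is updated after every single sample."""
    rng = np.random.default_rng(seed)
    K = X.shape[0]
    W = rng.uniform(-1.0, 1.0, size=X.shape[1])
    for epoch in range(max_epochs):
        order = rng.permutation(K)          # new presentation order each epoch
        for k in order:
            err = d[k] - X[k] @ W           # error for this one sample
            W += eta * err * X[k]           # immediate update: wi += eta*err*xi(k)
        E = 0.5 * np.sum((d - X @ W) ** 2)  # error over the whole set per epoch
        if E < tol:
            break
    return W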

Training is an iterative process; the training samples have to be used repeatedly.
Assuming we have K training samples [(X(k), d(k)), k = 1, 2, …, K], an epoch is the presentation of all K samples for training once.
First epoch: Present training samples: (X(1), d(1)), (X(2), d(2)), … (X(K), d(K))
Second epoch: Present training samples: (X(K), d(K)), (X(K-1), d(K-1)), … (X(1), d(1))
Note that the order of sample presentation can (and normally should) differ between epochs.
Normally, training will take many epochs to complete
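
In the online sketch above, the rng.permutation(K) call produces exactly this kind of fresh presentation order at the start of every epoch; for K = 3, for example, one epoch might use the order (1, 3, 2) and the next (2, 1, 3).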

Termination of Training 
There are normally two ways to terminate training:

When a pre-set number of training epochs is reached
When the error is smaller than a pre-set value
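
The two criteria are often combined into a single check; a minimal sketch (the names epoch, E, max_epochs, and tol are illustrative):

def should_stop(epoch, E, max_epochs=1000, tol=1e-4):
    """Stop on a preset epoch budget OR when the error falls below a preset value."""
    return epoch >= max_epochs or E < tol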
