•Adaptive Linear Element (ADALINE) vs Perceptron
–When the problem is not linearly separable, the perceptron will fail to converge
–ADALINE can overcome this difficulty by finding a best-fit approximation to the target.
•We have training pairs (X(k), d(k)), k = 1, 2, …, K, where K is the number of training samples. The training error specifies the difference between the output of the ADALINE and the desired target:
E(W) = (1/2) Σk (d(k) − y(k))², where y(k) = W·X(k) is the ADALINE output
•The smaller E(W) is, the closer the approximation
•We need to find the W, based on the given training set, that minimizes the error E(W)
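As a concrete illustration, here is a minimal NumPy sketch of this error computation (the names adaline_error, X, d, and W are illustrative, not from the original slides):

```python
import numpy as np

def adaline_error(W, X, d):
    """Sum-of-squared-errors E(W) over all K training samples.

    X: (K, n) matrix of input vectors, d: (K,) desired targets,
    W: (n,) weight vector. The ADALINE output is the linear sum y = X @ W.
    """
    y = X @ W                          # linear outputs y(k) = W . X(k)
    return 0.5 * np.sum((d - y) ** 2)  # E(W) = 1/2 sum_k (d(k) - y(k))^2
```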
The Gradient Descent Rule
–The gradient of E is a vector whose components are the partial derivatives of E with respect to each of the weights wi: ∇E(W) = [∂E/∂w0, ∂E/∂w1, …, ∂E/∂wn]
–The gradient specifies the direction of steepest increase in E; the negative of the gradient gives the direction of steepest decrease.
•The gradient descent training rule is
W ← W − η∇E(W), i.e., Δwi = −η ∂E/∂wi
where η is the learning rate
•ADALINE weight updating using the gradient descent rule: differentiating E(W) gives ∂E/∂wi = −Σk (d(k) − y(k)) xi(k), so the update is
Δwi = η Σk (d(k) − y(k)) xi(k)
•Gradient descent training procedure
–Initialise each wi to a small value, e.g., in the range (−1, 1), and choose a learning rate, e.g., η = 0.2
–Until the termination condition is met, Do
•Compute the output y(k) for every training sample, accumulate the gradient over all K samples, then update each weight: wi ← wi + η Σk (d(k) − y(k)) xi(k) (see the sketch below)
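A minimal batch-mode training sketch in NumPy, under the definitions above (the function name, seed, and fixed epoch count are illustrative assumptions, not from the original slides):

```python
import numpy as np

def train_adaline_batch(X, d, eta=0.2, epochs=100):
    """Batch gradient descent for ADALINE.

    The weight vector is updated once per epoch, using the gradient
    accumulated over all K training samples.
    """
    rng = np.random.default_rng(0)
    W = rng.uniform(-1, 1, size=X.shape[1])   # small initial weights in (-1, 1)
    for _ in range(epochs):                   # termination here: fixed epoch count
        y = X @ W                             # outputs for all K samples at once
        W += eta * X.T @ (d - y)              # Δw_i = η Σ_k (d(k) − y(k)) x_i(k)
    return W
```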
Stochastic (Incremental) Gradient Descent
•Also called online mode, Least Mean Square (LMS), Widrow-Hoff, and Delta Rule
–Initialise each wi to a small value, e.g., in the range (−1, 1), and choose a learning rate, e.g., η = 0.01 (should be smaller than in batch mode)
–Until the termination condition is met, Do
•For each training sample (X(k), d(k)), compute y(k) and immediately update each weight: wi ← wi + η (d(k) − y(k)) xi(k)
(Batch mode vs online mode: batch mode presents (computes over) the whole training set before updating W once; online mode updates W after every single training sample.)
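A corresponding online-mode (LMS) sketch, again with illustrative names and a fixed epoch count; note the smaller learning rate and the per-sample update:

```python
import numpy as np

def train_adaline_lms(X, d, eta=0.01, epochs=100):
    """Stochastic (incremental) gradient descent: the Widrow-Hoff / LMS rule.

    Weights are updated after every individual training sample, which is why
    the learning rate is kept smaller than in batch mode.
    """
    rng = np.random.default_rng(0)
    W = rng.uniform(-1, 1, size=X.shape[1])
    for _ in range(epochs):
        for x_k, d_k in zip(X, d):            # one weight update per sample
            y_k = x_k @ W
            W += eta * (d_k - y_k) * x_k      # w_i ← w_i + η (d(k) − y(k)) x_i(k)
    return W
```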
•Training is an iterative process; training samples have to be used repeatedly
•Assuming we have K training samples [(X(k), d(k)), k = 1, 2, …, K], an epoch is the presentation of all K samples for training once
–First epoch: present training samples (X(1), d(1)), (X(2), d(2)), …, (X(K), d(K))
–Second epoch: present training samples (X(K), d(K)), (X(K−1), d(K−1)), …, (X(1), d(1))
–Note that the order in which the training samples are presented can (and normally should) differ between epochs.
•Normally, training will take many epochs to complete
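One common way to vary the presentation order between epochs is a random shuffle; a minimal sketch of the epoch loop (the permutation-based shuffle is an illustrative choice, not prescribed by the slides):

```python
import numpy as np

def lms_epochs_shuffled(X, d, eta=0.01, epochs=100):
    """LMS training where each epoch presents all K samples in a fresh random order."""
    rng = np.random.default_rng(0)
    W = rng.uniform(-1, 1, size=X.shape[1])
    for _ in range(epochs):
        for k in rng.permutation(len(X)):     # new presentation order each epoch
            y_k = X[k] @ W
            W += eta * (d[k] - y_k) * X[k]
    return W
```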
Termination of Training
•To terminate training, there are normally two ways:
–When a pre-set number of training epochs is reached
–When the error is smaller than a pre-set value
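A sketch combining both stopping criteria with the batch rule above (max_epochs and error_threshold are illustrative parameter names):

```python
import numpy as np

def train_until_converged(X, d, eta=0.2, max_epochs=1000, error_threshold=0.01):
    """Batch ADALINE training that stops on either termination criterion:
    a pre-set epoch limit, or the error E(W) falling below a pre-set value."""
    rng = np.random.default_rng(0)
    W = rng.uniform(-1, 1, size=X.shape[1])
    for epoch in range(max_epochs):           # criterion 1: epoch limit reached
        y = X @ W
        error = 0.5 * np.sum((d - y) ** 2)    # current E(W)
        if error < error_threshold:           # criterion 2: error small enough
            break
        W += eta * X.T @ (d - y)
    return W
```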