Crash Course in Multilayer Perceptrons

Artificial neural networks are a fascinating area of study, although they can be intimidating when just getting started. There is a lot of specialized terminology used when describing the data structures and algorithms used in the field. In this lesson you will get a crash course in the terminology and processes used in the field of Multilayer Perceptron artificial neural networks. After completing this lesson you will know:

  • The building blocks of neural networks, including neurons, weights and activation functions.
  • How the building blocks are used in layers to create networks.
  • How networks are trained from example data.

6.1 Crash Course Overview

We are going to cover a lot of ground in this lesson. Here is an idea of what is ahead:

  1. Multilayer Perceptrons.
  2. Neurons, Weights and Activations.
  3. Networks of Neurons.
  4. Training Networks.

6.2 Multilayer Perceptrons

The field of artificial neural networks is often just called Neural Networks or Multilayer Perceptrons, after perhaps the most useful type of neural network.

A Perceptron is a single-neuron model that was a precursor to larger neural networks. Neural networks as a field of study investigate how simple models of biological brains can be used to solve difficult computational tasks, like the predictive modeling tasks we see in machine learning.

6.3 Neurons

The building blocks for neural networks are artificial neurons.

6.3.1 Neuron Weights

Each neuron has a bias, which can be thought of as an input that always has the value 1.0 and which must also be weighted.

Weights are often initialized to small random values, such as values in the range 0 to 0.3, although more complex initialization schemes can be used. As with linear regression, larger weights indicate increased complexity and fragility of the model. It is desirable to keep weights in the network small, and regularization techniques can be used to achieve this.
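As a concrete sketch, the computation a single neuron performs can be written in a few lines of plain Python; the weights, bias, and inputs below are illustrative values, not values from any particular network:

```python
import math

def neuron_output(inputs, weights, bias):
    # Weighted sum of the inputs plus the bias (the bias acts as a
    # weight on a constant input of 1.0).
    activation = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Pass the sum through a sigmoid activation function (section 6.3.2).
    return 1.0 / (1.0 + math.exp(-activation))

# Illustrative only: two inputs with small initial weights.
print(neuron_output(inputs=[0.5, 0.8], weights=[0.2, 0.1], bias=0.05))
```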

6.3.2 Activation

The weighted inputs are summed and passed through an activation function, sometimes called a transfer function.

An activation function is a simple mapping of the summed weighted input to the output of the neuron. It governs the threshold at which the neuron is activated and the strength of the output signal.

Nonlinear functions are used, like the logistic function, also called the sigmoid function, which outputs a value between 0 and 1 with an s-shaped distribution, and the hyperbolic tangent function, also called tanh, which outputs the same distribution over the range -1 to +1.
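A minimal sketch of these two activation functions in plain Python:

```python
import math

def sigmoid(x):
    # Logistic (sigmoid) function: s-shaped, outputs in the range (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    # Hyperbolic tangent: the same s-shape over the range (-1, 1).
    return math.tanh(x)

for x in (-2.0, 0.0, 2.0):
    print(x, round(sigmoid(x), 3), round(tanh(x), 3))
```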

6.4 Networks of Neurons

Neurons are arranged into networks of neurons. A row of neurons is called a layer and one network can have multiple layers. The architecture of the neurons in the network is often called the network topology.

6.4.1 Input or Visible Layers

The bottom layer that takes input from your dataset is called the visible layer, because it is the exposed part of the network. Often a neural network is drawn with a visible layer with one neuron per input value or column in your dataset.

6.4.2 Hidden Layers

Layers after the input layer are called hidden layers, because they are not directly exposed to the input. The simplest network structure is to have a single neuron in the hidden layer that directly outputs the value.

6.4.3 Output Layers

The final hidden layer is called the output layer and it is responsible for outputting a value or vector of values that correspond to the format required for the problem.

  • A regression problem may have a single output neuron and the neuron may have no activation function.
  • A binary classification problem may have a single output neuron and use a sigmoid activation function to output a value between 0 and 1 to represent the probability of predicting a value for the primary class. This can be turned into a crisp class value by using a threshold of 0.5 and snapping values less than the threshold to 0, otherwise to 1.
  • A multiclass classification problem may have multiple neurons in the output layer, one for each class (e.g. three neurons for the three classes in the famous iris flowers classification problem). In this case a softmax activation function may be used to output a probability of the network predicting each of the class values. Selecting the output with the highest probability can be used to produce a crisp class value. All three configurations are sketched in code after this list.
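To make this concrete, here is a minimal Keras sketch of the three output-layer configurations; the hidden layer size and the 8-column input shape are illustrative assumptions, not values from a specific dataset:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Regression: a single output neuron with no (i.e. linear) activation.
regression = Sequential([
    Dense(10, activation='relu', input_shape=(8,)),
    Dense(1),
])

# Binary classification: a single sigmoid output between 0 and 1.
binary = Sequential([
    Dense(10, activation='relu', input_shape=(8,)),
    Dense(1, activation='sigmoid'),
])

# Multiclass classification: one neuron per class with softmax, e.g.
# three neurons for the three classes in the iris flowers problem.
multiclass = Sequential([
    Dense(10, activation='relu', input_shape=(8,)),
    Dense(3, activation='softmax'),
])
```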

6.5 Training Networks

Once configured, the neural network needs to be trained on your dataset.

6.5.1 Data Preparation

You must first prepare your data for training on a neural network. Data must be numerical. If you have categorical data, such as a sex attribute with the values male and female, you can convert it to a real-valued representation called a one hot encoding. This same one hot encoding can be used on the output variable in classification problems with more than one class.
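A minimal sketch of one hot encoding in plain Python; the sex attribute values are taken from the example above:

```python
def one_hot_encode(values):
    # Map each distinct category to a column index.
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    encoded = []
    for v in values:
        row = [0.0] * len(categories)
        row[index[v]] = 1.0  # 1.0 in the column for this category
        encoded.append(row)
    return encoded

print(one_hot_encode(['male', 'female', 'female', 'male']))
# [[0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
```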

6.5.2 Stochastic Gradient Descent

The classical and still preferred training algorithm for neural networks is called stochastic gradient descent. One row of data is exposed to the network at a time, the network processes it to produce an output, an error is calculated by comparing that output to the expected value, and the weights are updated to reduce the error.

6.5.3 Weight Updates

The weights in the network can be updated from the errors calculated for each training example and this is called online learning.

Alternatively, the errors can be saved up across all of the training examples and the network can be updated at the end. This is called batch learning and is often more stable.

The amount that weights are updated is controlled by a configuration parameter called the learning rate. It is also called the step size and controls the step, or change, made to network weights for a given error.
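As a sketch, the basic update for a single weight looks like the following; the gradient value here is a placeholder for whatever error gradient backpropagation produces for that weight:

```python
learning_rate = 0.01  # illustrative step size

def update_weight(weight, gradient):
    # Move the weight a small step against the error gradient.
    return weight - learning_rate * gradient

print(update_weight(weight=0.25, gradient=0.8))  # 0.242
```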

The update equation can be complemented with additional configuration terms that you can set.

  • Momentum is a term that incorporates the properties from the previous weight update to allow the weights to continue to change in the same direction even when there is less error being calculated.
  • Learning Rate Decay is used to decrease the learning rate over epochs, allowing the network to make large changes to the weights at the beginning and smaller fine-tuning changes later in the training schedule. Both terms are sketched in code below.
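A minimal sketch of both terms, assuming a simple momentum formulation and a per-epoch multiplicative decay; all coefficient values are illustrative:

```python
learning_rate = 0.1
momentum = 0.9   # fraction of the previous update carried forward
decay = 0.99     # multiplicative learning rate decay per epoch

weight, velocity = 0.25, 0.0
for epoch in range(3):
    gradient = 0.5  # placeholder gradient for illustration
    # Blend the previous update direction with the new gradient so the
    # weight keeps moving the same way even as the error shrinks.
    velocity = momentum * velocity - learning_rate * gradient
    weight += velocity
    learning_rate *= decay  # smaller steps later in training
    print(epoch, round(weight, 4), round(learning_rate, 4))
```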

6.5.4 Prediction

Once a neural network has been trained it can be used to make predictions. You can make predictions on test or validation data in order to estimate the skill of the model on unseen data.

Predictions are made by providing the input to the network and performing a forward-pass allowing it to generate an output that you can use as a prediction.
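A minimal sketch of such a forward pass through one hidden layer and a single output neuron, in plain Python; all weights and inputs are illustrative:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward_pass(inputs, hidden_layer, output_neuron):
    # Each neuron is a (weights, bias) pair; each layer is a list of them.
    hidden = [sigmoid(sum(x * w for x, w in zip(inputs, ws)) + b)
              for ws, b in hidden_layer]
    ws, b = output_neuron
    return sigmoid(sum(h * w for h, w in zip(hidden, ws)) + b)

# Illustrative network: 2 inputs -> 2 hidden neurons -> 1 output.
hidden_layer = [([0.2, -0.1], 0.05), ([0.4, 0.3], -0.02)]
output_neuron = ([0.6, -0.5], 0.1)
print(forward_pass([1.0, 0.5], hidden_layer, output_neuron))
```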

6.6 Summary

In this lesson you discovered artificial neural networks for machine learning. You learned:

  • How neural networks are not models of the brain but are instead computational models for solving complex machine learning problems.
  • That neural networks are comprised of neurons that have weights and activation functions.
  • The networks are organized into layers of neurons and are trained using stochastic gradient descent.
  • That it is a good idea to prepare your data before training a neural network model.

6.6.1 Next

You now know the basics of neural network models. In the next section you will develop your very first Multilayer Perceptron model in Keras.
