About Backpropagation

Backpropagation is the classical algorithm for training feedforward artificial neural networks, and it is still used to train large deep learning networks.

After this part, you will know:

  • How to forward-propagate an input to calculate an output
  • How to backpropagate error and train a network
  • How to apply the backpropagation algorithm to a real-world predictive modeling problem.

1.1 Description

  • A brief introduction to the Backpropagation Algorithm
  • The Wheat Seeds dataset

1.1.1 Backpropagation Algorithm

The Backpropagation algorithm is a supervised learning method for multilayer feedforward networks from the field of Artificial Neural Networks. Feedforward neural networks are inspired by the information processing of one or more neural cells, called neurons.

The principle of the backpropagation approach is to model a given function by modifying internal weightings of input signals to produce an expected output signal. The backpropagation algorithm is a method for training the weights in a multilayer feedforward neural network.

A standard network structure consists of:

  • input layer
  • hidden layer
  • output layer

Backpropagation can be used for both classification and regression problems.

1.1.2 Wheat Seeds Dataset

This dataset involves predicting the species of wheat seeds. The baseline performance on the problem is approximately 28%. The file used here is named seeds_dataset.csv. The raw dataset is in tab-separated format, so you must convert it to CSV using a text editor or a spreadsheet program.
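If you prefer to do the conversion programmatically, the short sketch below rewrites the raw tab-delimited file as comma-separated values. The input filename seeds_dataset.txt is an assumption for illustration; adjust it to wherever you saved the raw file.

# Hypothetical helper: convert the raw tab-separated seeds file to CSV.
# The input filename is an assumption; the output matches the tutorial's seeds_dataset.csv.
with open('seeds_dataset.txt', 'r') as infile, open('seeds_dataset.csv', 'w') as outfile:
    for line in infile:
        fields = line.split()  # split on runs of whitespace (handles uneven tabs)
        if fields:
            outfile.write(','.join(fields) + '\n')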

1.2 Tutorial

This tutorial is broken down into 6 parts:

  • Initialize Network
  • Forward-Propagate
  • Backpropagate Error
  • Train Network
  • Predict
  • Wheat Seeds Case Study

1.2.1 Initialize Network

We start with the creation of a new network ready for training. Each neuron has a set of weights that need to be maintained: one weight for each input connection and an additional weight for the bias. We use a dictionary to represent each neuron and store its properties.

Below is a function named initialize_network() that creates a new neural network ready for training.

It accepts three parameters:

  • the number of inputs
  • the number of neurons to have in the hidden layer
  • the number of outputs

# Function to Initialize a Multilayer Perceptron Network
# Initialize a network
def initialize_network(n_inputs, n_hidden, n_outputs):
    network = list()
    hidden_layer = [{'weights':[random() for i in range(n_inputs + 1)]} for i in range(n_hidden)]
    network.append(hidden_layer)
    output_layer = [{'weights':[random() for i in range(n_hidden + 1)]} for i in range(n_outputs)]
    network.append(output_layer)
    return network

Below is a complete example that creates a small network.

# Example of Initializing a Multilayer Perceptron Network
# Example of initializing a network
from random import seed
from random import random

# Initialize a network
def initialize_network(n_inputs,n_hidden,n_outputs):
    network = list()
    hidden_layer = [{'weights':[random() for i in range(n_inputs + 1)]} for i in range(n_hidden)]
    network.append(hidden_layer)
    output_layer = [{'weights':[random() for i in range(n_hidden + 1)]} for i in range(n_outputs)]
    network.append(output_layer)
    return network

# Test initializing a network
seed(1)
network = initialize_network(2, 1, 2)
for layer in network:
    print(layer)

Running the example, you can see that the code prints out each layer one by one. You can see the hidden layer has one neuron with 2 input weights plus the bias. The output layer has 2 neurons, each with 1 weight plus the bias.
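With seed(1), the printed layers should look something like the listing below; note that these are the same weight values reused as the fixed network in the forward-propagation example that follows.

[{'weights': [0.13436424411240122, 0.8474337369372327, 0.763774618976614]}]
[{'weights': [0.2550690257394217, 0.49543508709194095]}, {'weights': [0.4494910647887381, 0.651592972722763]}]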

1.2.2 Forward-Propagate

We can break forward-propagation down into three parts:

  1. Neuron Activation
  2. Neuron Transfer
  3. Forward-Propagation

Neuron Activation

Neuron activation is calculated as the weighted sum of the inputs, much like linear regression.

                                      activation = bias + \sum_{i=1}^{n}weight_{i}\times input_{i}

weight is a network weight

input is an input value

i is the index of a weight or an input

bias is a special weight that has no corresponding input value.

Below is an implementation of this in a function named activate().

# Calculate neuron activation for an input
def activate(weights, inputs):
    activation = weights[-1]
    for i in range(len(weights)-1):
        activation += weights[i] * inputs[i]
    return activation

Next, let's see how to use the neuron activation.
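As a quick, hypothetical check, you can call activate() directly with made-up values (the final weight acts as the bias):

# Hypothetical usage of activate(); the weights and inputs are made-up values
weights = [0.5, -0.2, 0.1]  # two input weights plus a bias
inputs = [1.0, 2.0]
print(activate(weights, inputs))  # 0.5*1.0 + (-0.2)*2.0 + 0.1 = 0.2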

Neuron Transfer

Once a neuron is activated, we need to transfer the activation to see what the neuron's output actually is.

Different transfer functions can be used.

  • It is traditional to use the sigmoid activation function.
  • The tanh (hyperbolic tangent) function can also be used to transfer outputs (see the short sketch after the transfer() function below).

We can transfer an activation value using the sigmoid function as follows:

                                            output = \frac{1}{1 + e^{-activation}}

where e is the base of the natural logarithms (Euler's number). Below is a function named transfer() that implements the sigmoid equation.

# Transfer neuron activation
def transfer(activation):
    return 1.0 / (1.0 + exp(-activation))
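If you wanted to experiment with the tanh transfer mentioned above, a minimal sketch might look like the following. The name transfer_tanh is our own and it is not used in the rest of this tutorial; note that its derivative would be 1 - output^2 rather than the sigmoid derivative used later.

from math import tanh

# Hypothetical alternative transfer using the hyperbolic tangent; outputs lie in (-1, 1)
def transfer_tanh(activation):
    return tanh(activation)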

Forward-Propagation

Forward-propagating an input is straightforward. Below is a function named forward_propagate() that implements forward propagation for a row of data from our dataset with our neural network. The function returns the outputs from the last layer, also called the output layer.

# Forward-propagate input to a network output
def forward_propagate(network, row):
    inputs = row
    for layer in network:
        new_inputs = []
        for neuron in layer:
            activation = activate(neuron['weights'],inputs)
            neuron['output'] = transfer(activation)
            new_inputs.append(neuron['output'])
        inputs = new_inputs
    return inputs

# Example of Forward-Propagating an Input Through a Network
# Example of forward propagating input
from math import exp

# Calculate neuron activation for an input
def activate(weights,inputs):
    activation = weights[-1]
    for i in range(len(weights)-1):
        activation += weights[i] * inputs[i]
    return activation

# Transfer neuron activation
def transfer(activation):
    return 1.0 / (1.0 + exp(-activation))

# Forward propagate input to a network output
def forward_propagate(network, row):
    inputs = row
    for layer in network:
        new_inputs = []
        for neuron in layer:
            activation = activate(neuron['weights'],inputs)
            neuron['output'] = transfer(activation)
            new_inputs.append(neuron['output'])
        inputs = new_inputs
    return inputs

# test forward propagation
network = [[{'weights': [0.13436424411240122, 0.8474337369372327, 0.763774618976614]}],
[{'weights': [0.2550690257394217, 0.49543508709194095]}, {'weights':[0.4494910647887381, 0.651592972722763]}]]
row = [1, 0, None]
output = forward_propagate(network, row)
print(output)

Running the example propagates the input pattern [1, 0] forward through the fixed network and prints the two values produced by the output layer, one for each output neuron.
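With the fixed weights above, the printed output should be approximately:

[0.663, 0.725]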

1.2.3 Backpropagate Error

  1. Transfer Derivative
  2. Error Backpropagation

Transfer Derivative

Given an output value from a neuron, we need to calculate its slope. We are using the sigmoid transfer function, the derivative of which can be calculated as follows:

                                           derivative = output \times (1.0 - output)
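In code, this becomes the transfer_derivative() function that is also used in the complete example later in this section:

# Calculate the derivative of a neuron output
def transfer_derivative(output):
    return output * (1.0 - output)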

Error Backpropagation

                        error = (expected - output)\times transfer\_derivative(output)

expected is the expected output value for the neuron, output is the output value for the neuron, and transfer_derivative() calculates the slope of the neuron's output value. The expected value is the class value itself.

The backpropagated error signal is accumulated and then used to determine the error for a neuron in the hidden layer, as follows:

                     error = (weight_{k}\times error_{j})\times transfer\_derivative(output)

error_{j} is the error signal from the jth neuron in the output layer.

weight_{k} is the weight that connects the kth neuron to the current neuron, and output is the output for the current neuron.

Below is a function backward_propagate_error() that implements this procedure.

# Backpropagate error and store in neurons
def backward_propagate_error(network, expected):
    for i in reversed(range(len(network))):
        layer = network[i]
        errors = list()
        if i != len(network)-1:
            for j in range(len(layer)):
                error = 0.0
                for neuron in network[i + 1]:
                    error += (neuron['weights'][j] * neuron['delta'])
                errors.append(error)
        else:
            for j in range(len(layer)):
                neuron = layer[j]
                errors.append(expected[j] - neuron['output'])
        for j in range(len(layer)):
            neuron = layer[j]
            neuron['delta'] = errors[j] * transfer_derivative(neuron['output'])

To demonstrate, we define a fixed neural network with output values and backpropagate an expected output pattern. The complete example is listed below:

# Example of backpropagating error

# Calculate the derivative of a neuron output
def transfer_derivative(output):
    return output * (1.0 - output)

# Backpropagate error and store in neurons
def backward_propagate_error(network,expected):
    for i in reversed(range(len(network))):
        layer = network[i]
        errors = list()
        if i != len(network)-1:
            for j in range(len(layer)):
                error = 0.0
                for neuron in network[i + 1]:
                    error += (neuron['weights'][j]*neuron['delta'])
                errors.append(error)
        else:
            for j in range(len(layer)):
                neuron = layer[j]
                errors.append(expected[j] - neuron['output'])
        for j in range(len(layer)):
            neuron = layer[j]
            neuron['delta'] = errors[j] * transfer_derivative(neuron['output'])

# test backpropagation of error
network = [[{'output': 0.7105668883115941, 'weights': [0.13436424411240122,0.8474337369372327, 0.763774618976614]}],
[{'output': 0.6213859615555266, 'weights': [0.2550690257394217, 0.49543508709194095]},
{'output': 0.6573693455986976, 'weights': [0.4494910647887381, 0.651592972722763]}]]

expected = [0, 1]
backward_propagate_error(network, expected)
for layer in network:
    print(layer)

1.2.4 Train Network

  • Update Weights
  • Train Network

Update Weights

Network weights are updated as follows:

                             weight = weight + learning\_rate \times error \times input

weight  is a given weight

learning_rate is a parameter that you must specify

error is the error calculated by the backpropagation procedure for the neuron

input is the input value that caused the error.
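As a made-up numeric illustration: with weight = 0.5, learning_rate = 0.1, error = 0.05 and input = 1.2, the updated weight would be 0.5 + 0.1 × 0.05 × 1.2 = 0.506.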

Below is a function named update_weights() that updates the weights for a network given an input row of data.

# Update network weights with error
def update_weights(network,row,l_rate):
    for i in range(len(network)):
        inputs = row[:-1]
        if i != 0:
            inputs = [neuron['output'] for neuron in network[i-1]]
        for neuron in network[i]:
            for j in range(len(inputs)):
                neuron['weights'][j] += l_rate * neuron['delta'] * inputs[j]
            neuron['weights'][-1] += l_rate * neuron['delta']

Train Network

The network is trained using stochastic gradient descent: for a fixed number of epochs, each row of the training data is forward-propagated, the error is backpropagated, and the weights are updated. The sum of the squared error is accumulated and printed for each epoch as a rough indication of learning. Below is a function named train_network() that implements this.

# Train a network for a fixed number of epochs
def train_network(network, train, l_rate, n_epoch,n_outputs):
    for epoch in range(n_epoch):
        sum_error = 0
        for row in train:
            outputs = forward_propagate(network,row)
            expected = [0 for i in range(n_outputs)]
            expected[row[-1]] = 1
            sum_error += sum([(expected[i]-outputs[i])**2 for i in range(len(expected))])
            backward_propagate_error(network,expected)
            update_weights(network,row,l_rate)
        print('>epoch=%d, lrate=%.3f, error=%.3f' % (epoch, l_rate,sum_error))

Below is a small contrived dataset that we can use to test out training our neural network.

X1            X2             Y
2.7810836     2.550537003    0
1.465489372   2.362125076    0
3.396561688   4.400293529    0
1.38807019    1.850220317    0
3.06407232    3.005305973    0
7.627531214   2.759262235    1
5.332441248   2.088626775    1
6.922596716   1.77106367     1
8.675418651   -0.242068655   1
7.673756466   3.508563011    1

# Example of training a network by backpropagation
from math import exp
from random import seed
from random import random

# Initialize a network
def initialize_network(n_inputs, n_hidden,n_outputs):
    network = list()
    hidden_layer = [{'weights':[random() for i in range(n_inputs + 1)]} for i in range(n_hidden)]
    network.append(hidden_layer)
    output_layer = [{'weights':[random() for i in range(n_hidden+ 1)]} for i in range(n_outputs)]
    network.append(output_layer)
    return network

# Calculate neuron activation for an input
def activate(weights, inputs):
    activation = weights[-1]
    for i in range(len(weights)-1):
        activation += weights[i] * inputs[i]
    return activation

# Transfer neuron activation
def transfer(activation):
    return 1.0/ (1.0 + exp(-activation))

# Forward propagate input to a network output
def forward_propagate(network, row):
    inputs = row
    for layer in network:
        new_inputs = []
        for neuron in layer:
            activation = activate(neuron['weights'],inputs)
            neuron['output'] = transfer(activation)
            new_inputs.append(neuron['output'])
        inputs = new_inputs
    return inputs

# Calculate the derivative of a neuron output
def transfer_derivative(output):
    return output * (1.0 - output)

# Backpropagate error and store in neurons
def backward_propagate_error(network,expected):
    for i in reversed(range(len(network))):
        layer = network[i]
        errors = list()
        if i != len(network)-1:
            for j in range(len(layer)):
                error = 0.0
                for neuron in network[i+1]:
                    error += (neuron['weights'][j] * neuron['delta'])
                errors.append(error)
        else:
            for j in range(len(layer)):
                neuron = layer[j]
                errors.append(expected[j] - neuron['output'])
        for j in range(len(layer)):
            neuron = layer[j]
            neuron['delta'] = errors[j]* transfer_derivative(neuron['output'])

# Update network weights with error
def update_weights(network, row, l_rate):
    for i in range(len(network)):
        inputs = row[:-1]
        if i != 0:
            inputs = [neuron['output'] for neuron in network[i - 1]]
        for neuron in network[i]:
            for j in range(len(inputs)):
                neuron['weights'][j] += l_rate * neuron['delta'] * inputs[j]
            neuron['weights'][-1] += l_rate * neuron['delta']
            
# Train a network for a fixed number of epochs
def train_network(network, train, l_rate, n_epoch,n_outputs):
    for epoch in range(n_epoch):
        sum_error = 0
        for row in train:
            outputs = forward_propagate(network, row)
            expected = [0 for i in range(n_outputs)]
            expected[row[-1]] = 1
            sum_error += sum([(expected[i]-outputs[i])**2 for i in range(len(expected))])
            backward_propagate_error(network,expected)
            update_weights(network,row,l_rate)
        print('>epoch=%d, lrate=%.3f, error=%.3f' % (epoch, l_rate, sum_error))
        
# Test training backprop algorithm
seed(1)
dataset = [[2.7810836,2.550537003,0],
[1.465489372,2.362125076,0],
[3.396561688,4.400293529,0],
[1.38807019,1.850220317,0],
[3.06407232,3.005305973,0],
[7.627531214,2.759262235,1],
[5.332441248,2.088626775,1],
[6.922596716,1.77106367,1],
[8.675418651,-0.242068655,1],
[7.673756466,3.508563011,1]]

n_inputs = len(dataset[0]) - 1
n_outputs = len(set([row[-1] for row in dataset]))
network = initialize_network(n_inputs,2,n_outputs)
train_network(network,dataset, 0.5,20,n_outputs)
for layer in network:
    print(layer)

 

1.2.5 Predict

Making predictions with a trained network is straightforward: we forward-propagate an input pattern and use the index of the largest output value as the predicted class (the arg max). Below is a function named predict() that implements this procedure.

# Make a prediction with a network
def predict(network, row):
    outputs = forward_propagate(network,row)
    return outputs.index(max(outputs))

The complete example is listed below:

# Example of making predictions
from math import exp

# Calculate neuron activation for an input
def activate(weights,inputs):
    activation = weights[-1]
    for i in range(len(weights)-1):
        activation += weights[i] * inputs[i]
    return activation

# Transfer neuron activation
def transfer(activation):
    return 1.0 / (1.0 + exp(-activation))

# Forward propagate input to a network output
def forward_propagate(network, row):
    inputs = row
    for layer in network:
        new_inputs = []
        for neuron in layer:
            activation = activate(neuron['weights'],inputs)
            neuron['output'] = transfer(activation)
            new_inputs.append(neuron['output'])
        inputs = new_inputs
    return inputs

# Make a prediction with a network
def predict(network, row):
    outputs = forward_propagate(network, row)
    return outputs.index(max(outputs))

# Test making predictions with the network
dataset = [[2.7810836,2.550537003,0],
[1.465489372,2.362125076,0],
[3.396561688,4.400293529,0],
[1.38807019,1.850220317,0],
[3.06407232,3.005305973,0],
[7.627531214,2.759262235,1],
[5.332441248,2.088626775,1],
[6.922596716,1.77106367,1],
[8.675418651,-0.242068655,1],
[7.673756466,3.508563011,1]]
network = [[{'weights': [-1.482313569067226, 1.8308790073202204, 1.078381922048799]},{'weights': [0.23244990332399884, 0.3621998343835864, 0.40289821191094327]}],
[{'weights': [2.5001872433501404, 0.7887233511355132, -1.1026649757805829]}, {'weights':[-2.429350576245497, 0.8357651039198697, 1.0699217181280656]}]]
for row in dataset:
    prediction = predict(network, row)
    print('Expected=%d, Got=%d' % (row[-1], prediction))

1.2.6 Wheat Seeds Case Study

This case study applies the algorithm to the Wheat Seeds dataset and uses the following helper functions:

  • load_csv() to load the file
  • str_column_to_float() to convert string numbers to floats
  • str_column_to_int() to convert the class column to integer values
  • evaluate_algorithm() to evaluate the algorithm with cross-validation
  • accuracy_metric() to calculate the accuracy of predictions

A new function named back_propagation() manages the application of the Backpropagation algorithm, first initializing a network, training it on the training dataset, and then using the trained network to make predictions on the test dataset.

The complete example is listed below:

# Backprop on the Seeds Dataset
from random import seed
from random import randrange
from random import random
from csv import reader
from math import exp

# Load a CSV file
def load_csv(filename):
    dataset = list()
    with open(filename,'r') as file:
        csv_reader = reader(file)
        for row in csv_reader:
            if not row:
                continue
            dataset.append(row)
    return dataset

# Convert string column to float
def str_column_to_float(dataset, column):
    for row in dataset:
        row[column] = float(row[column].strip())
        
# Convert string column to integer
def str_column_to_int(dataset, column):
    class_values = [row[column] for row in dataset]
    unique = set(class_values)
    lookup = dict()
    for i,value in enumerate(unique):
        lookup[value] = i
    for row in dataset:
        row[column] = lookup[row[column]]
    return lookup

# Find the min and max values for each column
def dataset_minmax(dataset):
    return [[min(column),max(column)] for column in zip(*dataset)]

# Rescale dataset columns to the range 0 - 1
def normalize_dataset(dataset, minmax):
    for row in dataset:
        for i in range(len(row)-1):
            row[i] = (row[i] - minmax[i][0]) / (minmax[i][1] - minmax[i][0])

# Split a dataset into k folds
def cross_validation_split(dataset, n_folds):
    dataset_split = list()
    dataset_copy = list(dataset)
    fold_size = int(len(dataset) / n_folds)
    for i in range(n_folds):
        fold = list()
        while len(fold) < fold_size:
            index = randrange(len(dataset_copy))
            fold.append(dataset_copy.pop(index))
        dataset_split.append(fold)
    return dataset_split

# Calculate accuracy percentage
def accuracy_metric(actual,predicted):
    correct = 0
    for i in range(len(actual)):
        if actual[i] == predicted[i]:
            correct += 1
    return correct / float(len(actual)) * 100.0
    
# Evaluate an algorithm using a cross validation split
def evaluate_algorithm(dataset, algorithm,n_folds,*args):
    folds = cross_validation_split(dataset,n_folds)
    scores = list()
    for fold in folds:
        train_set = list(folds)
        train_set.remove(fold)
        train_set = sum(train_set,[])
        test_set = list()
        for row in fold:
            row_copy = list(row)
            test_set.append(row_copy)
            row_copy[-1] = None
        predicted = algorithm(train_set, test_set, *args)
        actual = [row[-1] for row in fold]
        accuracy = accuracy_metric(actual,predicted)
        scores.append(accuracy)
    return scores

# Calculate neuron activation for an input
def activate(weights, inputs):
    activation = weights[-1]
    for i in range(len(weights)-1):
        activation += weights[i] * inputs[i]
    return activation

# Transfer neuron activation
def transfer(activation):
    return 1.0 / (1.0 + exp(-activation))

# Forward propagate input to a network output
def forward_propagate(network,row):
    inputs = row
    for layer in network:
        new_inputs = []
        for neuron in layer:
            activation = activate(neuron['weights'],inputs)
            neuron['output'] = transfer(activation)
            new_inputs.append(neuron['output'])
        inputs = new_inputs
    return inputs

# Backpropagate error and store in neurons
def backward_propagate_error(network, expected):
    for i in reversed(range(len(network))):
        layer = network[i]
        errors = list()
        if i != len(network)-1:
            for j in range(len(layer)):
                error = 0.0
                for neuron in network[i + 1]:
                    error += (neuron['weights'][j]*neuron['delta'])
                errors.append(error)
        else:
            for j in range(len(layer)):
                neuron = layer[j]
                errors.append(expected[j] - neuron['output'])
        for j in range(len(layer)):
            neuron = layer[j]
            neuron['delta'] = errors[j] * transfer_derivative(neuron['output'])
            
# Update network weights with error
def update_weights(network,row,l_rate):
    for i in range(len(network)):
        inputs = row[:-1]
        if i != 0:
            inputs = [neuron['output'] for neuron in network[i-1]]
        for neuron in network[i]:
            for j in range(len(inputs)):
                neuron['weights'][j] += l_rate * neuron['delta'] * inputs[j]
            neuron['weights'][-1] += l_rate * neuron['delta']
                
# Train a network for a fixed number of epochs
def train_network(network, train,l_rate,n_epoch,n_outputs):
    for i in range(n_epoch):
        for row in train:
            forward_propagate(network,row)
            expected = [0 for i in range(n_outputs)]
            expected[row[-1]] = 1
            backward_propagate_error(network, expected)
            update_weights(network,row,l_rate)
            
# Initialize a network
def initialize_network(n_inputs,n_hidden,n_outputs):
    network = list()
    hidden_layer = [{'weights':[random() for i in range(n_inputs + 1)]} for i in range(n_hidden)]
    network.append(hidden_layer)
    output_layer = [{'weights':[random() for i in range(n_hidden + 1)]} for i in range(n_outputs)]
    network.append(output_layer)
    return network

# Make a prediction with a network
def predict(network, row):
    outputs = forward_propagate(network,row)
    return outputs.index(max(outputs))

# Backpropagation Algorithm with Stochastic Gradient Descent
def back_propagation(train,test,l_rate,n_epoch,n_hidden):
    n_inputs = len(train[0]) -1
    n_outputs = len(set([row[-1] for row in train]))
    network = initialize_network(n_inputs,n_hidden,n_outputs)
    train_network(network, train, l_rate, n_epoch, n_outputs)
    predictions = list()
    for row in test:
        prediction = predict(network,row)
        predictions.append(prediction)
    return (predictions)

# Test Backprop on Seed dataset
seed(1)
# load and prepare data
filename = 'seeds_dataset.csv'
dataset = load_csv(filename)
for i in range(len(dataset[0])-1):
    str_column_to_float(dataset,i)
# convert class column to integers
str_column_to_int(dataset,len(dataset[0])-1)
# normalize input variables
minmax = dataset_minmax(dataset)
normalize_dataset(dataset,minmax)
# evaluate algorithm
n_folds = 5
l_rate = 0.3
n_epoch = 500
n_hidden = 5
scores = evaluate_algorithm(dataset, back_propagation, n_folds, l_rate, n_epoch, n_hidden)
print('Scores: %s' % scores)
print('Mean Accuracy: %.3f%%' % (sum(scores)/float(len(scores))))
