The Perceptron algorithm is the simplest type of artificial neural network. It is a model of a single neuron that can be used for two-class classification problems and provides the foundation for later developing much larger networks.
After completing this tutorial, you will know:
- How to train the network weights for the Perceptron.
- How to make predictions with the Perceptron.
- How to implement the Perceptron algorithm for a real-world classification problem.
1.1 Description
This section provides a brief introduction to the Perceptron algorithm and the Sonar dataset to which we will later apply it.
1.1.1 Perceptron Algorithm
The Perceptron is inspired by the information processing of a single neural cell called a neuron.
The Perceptron receives input signals from the examples of training data, which are weighted and combined in a linear equation called the activation.
The activation is then transformed into an output value or prediction using a transfer function, such as the step transfer function.
In this way, the Perceptron is a classification algorithm for problems with two classes (0 and 1) where a linear equation (like a line or hyperplane) can be used to separate the two classes. It is closely related to linear regression and logistic regression that make predictions in a similar way (e.g. a weighted sum of inputs). The weights of the Perceptron algorithm must be estimated from your training data using stochastic gradient descent.
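Written out, the activation described above is a weighted sum of the inputs plus a bias term, and the step transfer function simply thresholds it at zero:

activation = bias + sum(weight_i * x_i)
prediction = 1.0 if activation >= 0.0 else 0.0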
1.1.2 Stochastic Gradient Descent
The Perceptron algorithm uses stochastic gradient descent to update the weights. Each iteration of gradient descent, the weights (w) are updated using the equation:

w = w + learning_rate * (expected - predicted) * x

Where:
- w is the weight being optimized.
- learning_rate is a learning rate that you must configure.
- (expected - predicted) is the prediction error for the model on the training data, attributed to the weight.
- x is the input value.
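For example, with illustrative numbers (not drawn from the dataset used later): given a learning rate of 0.1, an expected value of 1, a prediction of 0, and an input value x = 2.5, a weight of 0.05 would be updated to w = 0.05 + 0.1 * (1 - 0) * 2.5 = 0.3.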
1.2 Tutorial
This tutorial is broken down into 3 parts:
1. Making Predictions
2. Training Network Weights
3. Sonar Case Study
1.2.1 Making Predictions
The first step is to develop a function that can make predictions. Below is a function named predict() that predicts an output value for a row given a set of weights. The first weight is always the bias, as it is standalone and not responsible for a specific input value.
# Make a prediction with weights
def predict(row, weights):
    activation = weights[0]
    for i in range(len(row)-1):
        activation += weights[i + 1] * row[i]
    return 1.0 if activation >= 0.0 else 0.0
We can contrive a small dataset to test our prediction function.
X1 X2 Y
2.7810836 2.550537003 0
1.465489372 2.362125076 0
3.396561688 4.400293529 0
1.38807019 1.850220317 0
3.06407232 3.005305973 0
7.627531214 2.759262235 1
5.332441248 2.088626775 1
6.922596716 1.77106367 1
8.675418651 -0.242068655 1
7.673756466 3.508563011 1
A scatter plot of this dataset, using a different color for each of the two class values, shows that the classes are linearly separable.

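If you want to reproduce such a plot, below is a minimal sketch using matplotlib (the plotting code is an addition, not part of the original listing), reusing the dataset as a list of [X1, X2, Y] rows:

# Scatter plot of the contrived dataset, colored by class value
import matplotlib.pyplot as plt

dataset = [[2.7810836, 2.550537003, 0],
    [1.465489372, 2.362125076, 0],
    [3.396561688, 4.400293529, 0],
    [1.38807019, 1.850220317, 0],
    [3.06407232, 3.005305973, 0],
    [7.627531214, 2.759262235, 1],
    [5.332441248, 2.088626775, 1],
    [6.922596716, 1.77106367, 1],
    [8.675418651, -0.242068655, 1],
    [7.673756466, 3.508563011, 1]]
for row in dataset:
    # class 0 plotted in blue, class 1 in red
    plt.scatter(row[0], row[1], color='blue' if row[2] == 0 else 'red')
plt.xlabel('X1')
plt.ylabel('X2')
plt.show()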
We can also use previously prepared weights to make predictions for this dataset. Putting this all together, we can test our predict() function below.
# Example of making predictions
# Make a prediction with weights
def predict(row, weights):
    activation = weights[0]
    for i in range(len(row)-1):
        activation += weights[i + 1] * row[i]
    return 1.0 if activation >= 0.0 else 0.0

# test predictions
dataset = [[2.7810836, 2.550537003, 0],
    [1.465489372, 2.362125076, 0],
    [3.396561688, 4.400293529, 0],
    [1.38807019, 1.850220317, 0],
    [3.06407232, 3.005305973, 0],
    [7.627531214, 2.759262235, 1],
    [5.332441248, 2.088626775, 1],
    [6.922596716, 1.77106367, 1],
    [8.675418651, -0.242068655, 1],
    [7.673756466, 3.508563011, 1]]
weights = [-0.1, 0.20653640140000007, -0.23418117710000003]
for row in dataset:
    prediction = predict(row, weights)
    print("Expected=%d, Predicted=%d" % (row[-1], prediction))
Running this example prints the expected and predicted class value for each row; the prepared weights classify all ten examples correctly:

Expected=0, Predicted=0
Expected=0, Predicted=0
Expected=0, Predicted=0
Expected=0, Predicted=0
Expected=0, Predicted=0
Expected=1, Predicted=1
Expected=1, Predicted=1
Expected=1, Predicted=1
Expected=1, Predicted=1
Expected=1, Predicted=1
Now we are ready to implement stochastic gradient descent to optimize our weight values.
1.2.2 Training Network Weights
We can estimate the weight values for our training data using stochastic gradient descent. Stochastic gradient descent requires two parameters:
- Learning Rate: Used to limit the amount each weight is corrected each time it is updated.
- Epochs: The number of times to run through the training data while updating the weights.
These, along with the training data, will be the arguments to the function. There are 3 loops we need to perform in the function:
1. Loop over each epoch.
2. Loop over each row in the training data for an epoch.
3. Loop over each weight and update it for a row in an epoch.

# Estimate Perceptron weights using stochastic gradient descent
def train_weights(train, l_rate, n_epoch):
    weights = [0.0 for i in range(len(train[0]))]
    for epoch in range(n_epoch):
        sum_error = 0.0
        for row in train:
            prediction = predict(row, weights)
            error = row[-1] - prediction
            sum_error += error**2
            weights[0] = weights[0] + l_rate * error
            for i in range(len(row)-1):
                weights[i + 1] = weights[i + 1] + l_rate * error * row[i]
        print('>epoch=%d, lrate=%.3f, error=%.3f' % (epoch, l_rate, sum_error))
    return weights
You can see that we also keep track of the sum of the squared error (a positive value) each epoch so that we can print out a nice message each outer loop. We can test this function on the same small contrived dataset from above.
# Example of training weights
# Make a prediction with weights
def predict(row, weights):
    activation = weights[0]
    for i in range(len(row)-1):
        activation += weights[i + 1] * row[i]
    return 1.0 if activation >= 0.0 else 0.0

# Estimate Perceptron weights using stochastic gradient descent
def train_weights(train, l_rate, n_epoch):
    weights = [0.0 for i in range(len(train[0]))]
    for epoch in range(n_epoch):
        sum_error = 0.0
        for row in train:
            prediction = predict(row, weights)
            error = row[-1] - prediction
            sum_error += error**2
            weights[0] = weights[0] + l_rate * error
            for i in range(len(row)-1):
                weights[i + 1] = weights[i + 1] + l_rate * error * row[i]
        print('>epoch=%d, lrate=%.3f, error=%.3f' % (epoch, l_rate, sum_error))
    return weights

# Calculate weights
dataset = [[2.7810836, 2.550537003, 0],
    [1.465489372, 2.362125076, 0],
    [3.396561688, 4.400293529, 0],
    [1.38807019, 1.850220317, 0],
    [3.06407232, 3.005305973, 0],
    [7.627531214, 2.759262235, 1],
    [5.332441248, 2.088626775, 1],
    [6.922596716, 1.77106367, 1],
    [8.675418651, -0.242068655, 1],
    [7.673756466, 3.508563011, 1]]
l_rate = 0.1
n_epoch = 5
weights = train_weights(dataset, l_rate, n_epoch)
print(weights)
We use a learning rate of 0.1 and train the model for only 5 epochs, or 5 exposures of the weights to the entire training dataset. Running the example prints a message each epoch with the sum squared error for that epoch and the final set of weights.
>epoch=0, lrate=0.100, error=2.000
>epoch=1, lrate=0.100, error=1.000
>epoch=2, lrate=0.100, error=0.000
>epoch=3, lrate=0.100, error=0.000
>epoch=4, lrate=0.100, error=0.000
[-0.1, 0.20653640140000007, -0.23418117710000003]

Note that the error drops to zero by the third epoch, and that the final weights are exactly the previously prepared weights we used in the prediction example above.
1.2.3 Sonar Case Study
In this section, we apply the Perceptron algorithm to the Sonar dataset. We will use the predict() and train_weights() functions created above to train the model, plus a new perceptron() function to tie them together, and evaluate it with k-fold cross-validation. The example assumes that a CSV copy of the dataset is in the current working directory with the filename sonar.all-data.csv. Below is the complete example.
# Perceptron Algorithm on the Sonar Dataset
from random import seed
from random import randrange
from csv import reader

# Load a CSV file
def load_csv(filename):
    dataset = list()
    with open(filename, 'r') as file:
        csv_reader = reader(file)
        for row in csv_reader:
            if not row:
                continue
            dataset.append(row)
    return dataset

# Convert string column to float
def str_column_to_float(dataset, column):
    for row in dataset:
        row[column] = float(row[column].strip())

# Convert string column to integer
def str_column_to_int(dataset, column):
    class_values = [row[column] for row in dataset]
    unique = set(class_values)
    lookup = dict()
    for i, value in enumerate(unique):
        lookup[value] = i
    for row in dataset:
        row[column] = lookup[row[column]]
    return lookup

# Split a dataset into k folds
def cross_validation_split(dataset, n_folds):
    dataset_split = list()
    dataset_copy = list(dataset)
    fold_size = int(len(dataset) / n_folds)
    for i in range(n_folds):
        fold = list()
        while len(fold) < fold_size:
            index = randrange(len(dataset_copy))
            fold.append(dataset_copy.pop(index))
        dataset_split.append(fold)
    return dataset_split

# Calculate accuracy percentage
def accuracy_metric(actual, predicted):
    correct = 0
    for i in range(len(actual)):
        if actual[i] == predicted[i]:
            correct += 1
    return correct / float(len(actual)) * 100.0

# Evaluate an algorithm using a cross validation split
def evaluate_algorithm(dataset, algorithm, n_folds, *args):
    folds = cross_validation_split(dataset, n_folds)
    scores = list()
    for fold in folds:
        train_set = list(folds)
        train_set.remove(fold)
        train_set = sum(train_set, [])
        test_set = list()
        for row in fold:
            row_copy = list(row)
            test_set.append(row_copy)
            row_copy[-1] = None
        predicted = algorithm(train_set, test_set, *args)
        actual = [row[-1] for row in fold]
        accuracy = accuracy_metric(actual, predicted)
        scores.append(accuracy)
    return scores

# Make a prediction with weights
def predict(row, weights):
    activation = weights[0]
    for i in range(len(row)-1):
        activation += weights[i + 1] * row[i]
    return 1.0 if activation >= 0.0 else 0.0

# Estimate Perceptron weights using stochastic gradient descent
def train_weights(train, l_rate, n_epoch):
    weights = [0.0 for i in range(len(train[0]))]
    for epoch in range(n_epoch):
        for row in train:
            prediction = predict(row, weights)
            error = row[-1] - prediction
            weights[0] = weights[0] + l_rate * error
            for i in range(len(row)-1):
                weights[i + 1] = weights[i + 1] + l_rate * error * row[i]
    return weights

# Perceptron Algorithm with Stochastic Gradient Descent
def perceptron(train, test, l_rate, n_epoch):
    predictions = list()
    weights = train_weights(train, l_rate, n_epoch)
    for row in test:
        prediction = predict(row, weights)
        predictions.append(prediction)
    return predictions

# Test the Perceptron algorithm on the sonar dataset
seed(1)
# load and prepare data
filename = 'sonar.all-data.csv'
dataset = load_csv(filename)
for i in range(len(dataset[0])-1):
    str_column_to_float(dataset, i)
# convert string class to integers
str_column_to_int(dataset, len(dataset[0])-1)
# evaluate algorithm
n_folds = 3
l_rate = 0.01
n_epoch = 500
scores = evaluate_algorithm(dataset, perceptron, n_folds, l_rate, n_epoch)
print('Scores: %s' % scores)
print('Mean Accuracy: %.3f%%' % (sum(scores)/float(len(scores))))
A k value of 3 was used for cross-validation, giving each fold 208/3 = 69.3, or just under 70, records to be evaluated upon each iteration. A learning rate of 0.01 and 500 training epochs were chosen with a little experimentation. You can try your own configurations and see if you can beat my score. Running this example prints the scores for each of the 3 cross-validation folds, then prints the mean classification accuracy. We can see that the accuracy is about 73%, higher than the baseline value of just over 50%.
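The baseline mentioned above is the accuracy of always predicting the majority class (the Zero Rule algorithm), which works out to just over 53% on the Sonar dataset (111 of the 208 examples are mines). Below is a minimal sketch of how you might compute it (an addition, not part of the original listing), assuming the dataset has already been loaded and prepared as above:

# Zero Rule baseline: accuracy of always predicting the most common class
from collections import Counter

class_counts = Counter(row[-1] for row in dataset)
majority_count = max(class_counts.values())
print('Baseline Accuracy: %.3f%%' % (majority_count / float(len(dataset)) * 100.0))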