Derivation Example
Forward Propagation
Compute each layer's activations, then the loss, i.e. the deviation between the output layer and the ground truth. This example uses Sigmoid as the activation function and MSE as the loss function.
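Written out for the 2-input, 3-hidden-unit, 2-output network used in the code below (σ is the sigmoid, Y the target vector, N the number of output units), a single forward pass computes:

$$
\begin{aligned}
Z_1 &= W_1 X + b_1, \qquad A_1 = \sigma(Z_1),\\
Z_2 &= W_2 A_1 + b_2, \qquad A_2 = \sigma(Z_2),\\
\text{Loss} &= \frac{1}{N}\sum_{i=1}^{N}\left(Y_i - A_{2,i}\right)^2, \qquad \sigma(z) = \frac{1}{1 + e^{-z}}.
\end{aligned}
$$

These equations restate exactly what the forward and mse_loss functions in the listing compute.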
Backpropagation
Apply the chain rule repeatedly, working from the output layer backward, to obtain the derivatives of the loss with respect to each layer's weights W and biases b. Update W and b with these derivatives, then repeat the forward and backward passes until the loss converges.
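For this two-layer network the chain rule gives the gradients below, which are exactly what the backward function computes (⊙ denotes elementwise multiplication; with N = 2 outputs, the factor 2/N from the MSE derivative cancels, leaving −(Y − A₂)). Because each layer's bias is a single scalar shared by all of its neurons, the code averages the per-neuron bias gradients over the layer, hence the 1/2 and 1/3 factors:

$$
\begin{aligned}
\delta_2 &= \frac{\partial L}{\partial Z_2} = -(Y - A_2)\odot A_2\odot(1 - A_2),\\
\frac{\partial L}{\partial W_2} &= \delta_2 A_1^{\mathsf{T}}, \qquad
\frac{\partial L}{\partial b_2} = \frac{1}{2}\sum_i \delta_{2,i},\\
\delta_1 &= \frac{\partial L}{\partial Z_1} = \left(W_2^{\mathsf{T}}\delta_2\right)\odot A_1\odot(1 - A_1),\\
\frac{\partial L}{\partial W_1} &= \delta_1 X^{\mathsf{T}}, \qquad
\frac{\partial L}{\partial b_1} = \frac{1}{3}\sum_i \delta_{1,i}.
\end{aligned}
$$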
Implementation
The Python implementation is as follows:
# -*- coding: UTF-8 -*-
import numpy as np
from matplotlib import pyplot as plt

# Activation function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Loss function: mean squared error
def mse_loss(out, label):
    N = out.shape[0]
    loss = np.sum(np.square(label - out)) / N
    return loss

# Forward pass
def forward(X, W1, b1, W2, b2):
    Z1 = np.dot(W1, X) + b1
    A1 = sigmoid(Z1)
    Z2 = np.dot(W2, A1) + b2
    A2 = sigmoid(Z2)
    return Z1, A1, Z2, A2

# Backward pass (W2 is passed in explicitly so the hidden-layer
# gradient uses the current weights rather than a stale global)
def backward(Y, A2, A1, X, W2):
    dZ2 = -(Y - A2) * A2 * (1 - A2)
    dW2 = np.dot(dZ2, A1.T)
    db2 = 1 / 2 * np.sum(dZ2)  # scalar bias shared by the 2 output neurons: average
    dZ1 = np.dot(W2.T, dZ2) * A1 * (1 - A1)
    dW1 = np.dot(dZ1, X.T)
    db1 = 1 / 3 * np.sum(dZ1)  # scalar bias shared by the 3 hidden neurons: average
    return dW2, db2, dW1, db1

# Gradient descent update
def gradient_descent(W1, b1, W2, b2, dW1, db1, dW2, db2, alpha):
    W1 = W1 - alpha * dW1
    b1 = b1 - alpha * db1
    W2 = W2 - alpha * dW2
    b2 = b2 - alpha * db2
    return W1, b1, W2, b2

# Train the network
def NNTraining(X, W1, b1, W2, b2, Y, alpha, iterations):
    L = np.zeros((iterations, 1))
    for i in range(iterations):
        # Forward pass
        Z1, A1, Z2, A2 = forward(X, W1, b1, W2, b2)
        # Loss
        L[i] = mse_loss(A2, Y)
        # Backward pass
        dW2, db2, dW1, db1 = backward(Y, A2, A1, X, W2)
        # Gradient descent
        W1, b1, W2, b2 = gradient_descent(W1, b1, W2, b2, dW1, db1, dW2, db2, alpha)
        print("iterations:", i, "loss:", L[i], "result:", A2[0], A2[1])
    return L

if __name__ == '__main__':
    # Input
    X = np.array([[0.05], [0.10]])
    # Target output
    Y = np.array([[0.01], [0.99]])
    # Initialize weights and biases
    W1 = np.array([[0.15, 0.20], [0.25, 0.30], [0.05, 0.10]])
    b1 = 0.35  # broadcasting means the scalar bias need not be written as a vector
    W2 = np.array([[0.40, 0.45, 0.26], [0.50, 0.55, 0.13]])
    b2 = 0.6
    # Learning rate
    alpha = 0.5
    # Number of iterations
    iterations = 10000
    L = NNTraining(X, W1, b1, W2, b2, Y, alpha, iterations)
    plt.figure(1)
    plt.plot(L)
    plt.title('Loss curve')
    plt.xlabel('iteration')
    plt.ylabel('loss')
    plt.show()
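As a sanity check (not part of the original listing), the analytic gradients from backward can be compared against central finite differences. numeric_dW2 below is a hypothetical helper, assumed to live in the same file as the functions above; it perturbs one entry of W2 at a time:

# Finite-difference check of dW2 (hypothetical helper, not in the original code)
def numeric_dW2(X, W1, b1, W2, b2, Y, eps=1e-6):
    grad = np.zeros_like(W2)
    for r in range(W2.shape[0]):
        for c in range(W2.shape[1]):
            Wp, Wm = W2.copy(), W2.copy()
            Wp[r, c] += eps
            Wm[r, c] -= eps
            loss_p = mse_loss(forward(X, W1, b1, Wp, b2)[3], Y)
            loss_m = mse_loss(forward(X, W1, b1, Wm, b2)[3], Y)
            grad[r, c] = (loss_p - loss_m) / (2 * eps)  # central difference
    return grad

# Usage: the maximum absolute difference should be close to zero.
# Z1, A1, Z2, A2 = forward(X, W1, b1, W2, b2)
# dW2, db2, dW1, db1 = backward(Y, A2, A1, X, W2)
# print(np.max(np.abs(dW2 - numeric_dW2(X, W1, b1, W2, b2, Y))))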
Training Results
iterations: 9990 loss: [2.14028324e-05] result: [0.01462654] [0.9853739]
iterations: 9991 loss: [2.13984463e-05] result: [0.01462606] [0.98537437]
iterations: 9992 loss: [2.13940613e-05] result: [0.01462559] [0.98537485]
iterations: 9993 loss: [2.13896775e-05] result: [0.01462512] [0.98537532]
iterations: 9994 loss: [2.13852949e-05] result: [0.01462464] [0.98537579]
iterations: 9995 loss: [2.13809135e-05] result: [0.01462417] [0.98537627]
iterations: 9996 loss: [2.13765332e-05] result: [0.01462369] [0.98537674]
iterations: 9997 loss: [2.13721542e-05] result: [0.01462322] [0.98537721]
iterations: 9998 loss: [2.13677763e-05] result: [0.01462275] [0.98537769]
iterations: 9999 loss: [2.13633995e-05] result: [0.01462227] [0.98537816]