This post contains many formulas; since Jianshu does not render math, see my personal blog for the fully rendered version.
import numpy as np
Goal
Implement the forward and backward passes of a multilayer perceptron using numpy.
Building the Layers
Fully Connected Layer
Forward Pass
The forward pass of a fully connected layer is $Y = W \times X + b$, where $Y$ is the output, $W$ the weight matrix, $X$ the input, and $b$ the bias. The activation $f$ is implemented as a separate layer (see the activation-function section below).
Backward Pass
For the backward pass, given the gradient $dY$ flowing back from the next layer, the corresponding formulas are:
$$dX = W^{T} \times dY$$
$$dW = \cfrac{1}{m} dY \times X^{T}$$
$$db = \cfrac{1}{m} \sum dY$$
where $m$ is the batch size.
Implementation
class numpy_fc(object):
    def __init__(self, in_channel, out_channel, optim):
        # small random weights, zero bias
        self.weight = np.float64(np.random.randn(out_channel, in_channel) * 0.1)
        self.bias = np.zeros((out_channel, 1), dtype=np.float64)
        self.in_data = np.zeros((1, in_channel))
        self.out_data = None
        self.weight_grad = None
        self.bias_grad = None
        self.optimizer = optim

    def forward(self, data):
        # cache the input for the backward pass
        self.in_data = data
        self.out_data = np.dot(self.weight, data) + self.bias
        return self.out_data

    def backward(self, grad):
        m = grad.shape[1]  # batch size
        data_grad = np.dot(self.weight.T, grad)                   # dX = W^T x dY
        self.weight_grad = np.dot(grad, self.in_data.T) / m       # dW = (1/m) dY x X^T
        self.bias_grad = np.sum(grad, axis=1, keepdims=True) / m  # db = (1/m) sum dY
        return data_grad

    def step(self):
        # the optimizer is expected to return the update to add (e.g. -lr * grad)
        self.weight += self.optimizer(self.weight_grad)
        self.bias += self.optimizer(self.bias_grad)
Testing
test_fc = numpy_fc(16, 8, None)
test_fc_forward = test_fc.forward(np.random.rand(16, 10))
print(test_fc_forward.shape)
test_fc_back = test_fc.backward(test_fc_forward)
print(test_fc_back.shape)
print(test_fc.weight_grad.shape, test_fc.weight.shape)
print(test_fc.bias_grad.shape, test_fc.bias.shape)
(8, 10)
(16, 10)
(8, 16) (8, 16)
(8, 1) (8, 1)
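Beyond shape checks, the backward formulas can be verified numerically. The sketch below (variable names are mine, not from the post) compares the analytic gradient $dW = dY \times X^{T}$ against central differences for a bare linear map $Y = WX + b$ with loss $L = \sum Y$, so the upstream gradient $dY$ is all ones; the $1/m$ averaging is omitted here since this loss sums rather than averages over the batch.

```python
import numpy as np

# Sanity check: analytic vs. numerical gradient for Y = W @ X + b.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))
b = rng.standard_normal((8, 1))
X = rng.standard_normal((16, 10))

# For loss L = sum(W @ X + b), the upstream gradient dY is all ones.
dY = np.ones((8, 10))
dW_analytic = dY @ X.T                        # dW = dY x X^T
db_analytic = dY.sum(axis=1, keepdims=True)   # db = sum dY over the batch

# Central-difference approximation of dL/dW, entry by entry.
eps = 1e-6
dW_numeric = np.zeros_like(W)
for i in range(W.shape[0]):
    for j in range(W.shape[1]):
        W[i, j] += eps
        lp = np.sum(W @ X + b)
        W[i, j] -= 2 * eps
        lm = np.sum(W @ X + b)
        W[i, j] += eps  # restore
        dW_numeric[i, j] = (lp - lm) / (2 * eps)

print(np.max(np.abs(dW_numeric - dW_analytic)))  # should be very small
```

The same loop pattern works for checking $db$ and $dX$.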
Activation Functions
The sigmoid Function
sigmoid is a common output-layer activation for binary classification. Its forward and backward passes are as follows:
$$ sigmoid(x) = \cfrac{1}{1 + e^{-x}}$$
$$ sigmoid'(x) = sigmoid(x) \cdot (1 - sigmoid(x))$$
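Following the same forward/backward interface as the fully connected layer above, a sigmoid layer can be sketched as follows (the class name is mine). Since sigmoid has no parameters, no step() method is needed; the backward pass just scales the incoming gradient elementwise by $sigmoid(x)(1 - sigmoid(x))$, reusing the cached forward output.

```python
import numpy as np

class numpy_sigmoid(object):
    def __init__(self):
        self.out_data = None

    def forward(self, data):
        # sigmoid(x) = 1 / (1 + e^{-x}); cache the output for backward
        self.out_data = 1.0 / (1.0 + np.exp(-data))
        return self.out_data

    def backward(self, grad):
        # sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)), applied elementwise
        return grad * self.out_data * (1.0 - self.out_data)

test_act = numpy_sigmoid()
out = test_act.forward(np.zeros((8, 10)))
print(out[0, 0])                                   # sigmoid(0) = 0.5
print(test_act.backward(np.ones((8, 10)))[0, 0])   # 0.5 * (1 - 0.5) = 0.25
```

Stacking `numpy_fc` and `numpy_sigmoid` and calling forward/backward in sequence gives a complete pass through the network.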