Chapter 5: Error Backpropagation (summary notes)
Numerical differentiation is simple and easy to implement, but it has the drawback of being computationally expensive. In this chapter we learn a method that computes the gradients of the weight parameters efficiently: error backpropagation.
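For contrast, here is a minimal sketch of the central-difference numerical gradient from the previous chapter (my own illustration, not code from these notes). It calls f twice for every single parameter, which is exactly why it is slow for networks with many weights:

import numpy as np

def numerical_gradient(f, x):
    h = 1e-4
    grad = np.zeros_like(x)
    for i in range(x.size):          # two evaluations of f per element of x
        tmp = x.flat[i]
        x.flat[i] = tmp + h
        fxh1 = f(x)
        x.flat[i] = tmp - h
        fxh2 = f(x)
        grad.flat[i] = (fxh1 - fxh2) / (2 * h)
        x.flat[i] = tmp              # restore the original value
    return grad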
5.1 Computational Graphs
5.1.1 Solving Problems with Computational Graphs
5.1.2 Local Computation
5.1.3 Why Use Computational Graphs
5.2 The Chain Rule
5.2.1 Backward Propagation on a Computational Graph
5.2.2 What the Chain Rule Is
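As a quick worked example (my own note; the book develops this with z = (x + y)^2): if z = t^2 and t = x + y, the chain rule gives dz/dx = (dz/dt)(dt/dx) = 2t * 1 = 2(x + y). A numerical check:

x, y = 2.0, 3.0
analytic = 2 * (x + y)   # chain rule: dz/dx = 2(x + y)
h = 1e-6
numeric = (((x + h) + y) ** 2 - ((x - h) + y) ** 2) / (2 * h)
print(analytic, numeric)  # 10.0 and approximately 10.0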
5.3 Backpropagation
5.3.1 Backpropagation at an Addition Node
5.3.2 Backpropagation at a Multiplication Node
5.4 Implementing Simple Layers
5.4.1 Implementing a Multiplication Layer
The layer class:
class MulLayer:
    def __init__(self):
        self.x = None
        self.y = None

    def forward(self, x, y):
        # store the inputs; backward needs them
        self.x = x
        self.y = y
        out = x * y
        return out

    def backward(self, dout):
        dx = dout * self.y  # swap x and y: each input's gradient uses the other input
        dy = dout * self.x
        return dx, dy
Usage:
apple = 100
apple_num = 2
tax = 1.1

# layers
mul_apple_layer = MulLayer()  # apple price * quantity (also stores its inputs for backward)
mul_tax_layer = MulLayer()    # subtotal * consumption tax (likewise)

# forward
apple_price = mul_apple_layer.forward(apple, apple_num)
price = mul_tax_layer.forward(apple_price, tax)
print(price)  # 220 (up to floating-point rounding in the printed value)

# backward
dprice = 1
dapple_price, dtax = mul_tax_layer.backward(dprice)
dapple, dapple_num = mul_apple_layer.backward(dapple_price)
print(dapple, dapple_num, dtax)  # 2.2 110 200

Each of these derivatives tells us how much the final price changes per unit change in that input: raise the apple price by 1, for example, and the total rises by 2.2.
5.4.2 Implementing an Addition Layer
The layer class (note that it keeps no state: addition contributes a local gradient of 1 to each input, so backward simply passes dout through unchanged):
class AddLayer:
    def __init__(self):
        pass  # nothing to remember: the gradients do not depend on the inputs

    def forward(self, x, y):
        out = x + y
        return out

    def backward(self, dout):
        dx = dout * 1
        dy = dout * 1
        return dx, dy
Usage:
apple = 100        # apple unit price
apple_num = 2
orange = 150       # orange unit price
orange_num = 3
tax = 1.1          # consumption tax

# layers
mul_apple_layer = MulLayer()
mul_orange_layer = MulLayer()
add_apple_orange_layer = AddLayer()
mul_tax_layer = MulLayer()

# forward
apple_price = mul_apple_layer.forward(apple, apple_num)                     # (1)
orange_price = mul_orange_layer.forward(orange, orange_num)                 # (2)
all_price = add_apple_orange_layer.forward(apple_price, orange_price)       # (3)
price = mul_tax_layer.forward(all_price, tax)                               # (4)

# backward, in the reverse order of the forward calls
dprice = 1
dall_price, dtax = mul_tax_layer.backward(dprice)                           # (4)
dapple_price, dorange_price = add_apple_orange_layer.backward(dall_price)   # (3)
dorange, dorange_num = mul_orange_layer.backward(dorange_price)             # (2)
dapple, dapple_num = mul_apple_layer.backward(dapple_price)                 # (1)

print(price)  # 715 (up to floating-point rounding in the printed value)
print(dapple_num, dapple, dorange, dorange_num, dtax)  # 110 2.2 3.3 165 650
5.5 Implementing the Activation Function Layers
5.5.1 ReLU Layer
The layer class:
class Relu:
    def __init__(self):
        self.mask = None

    def forward(self, x):
        self.mask = (x <= 0)   # True wherever x is less than or equal to 0
        out = x.copy()
        out[self.mask] = 0     # zero out those elements
        return out

    def backward(self, dout):
        dout[self.mask] = 0    # no gradient flows where the input was <= 0 (note: modifies dout in place)
        dx = dout
        return dx
Walking through the mask logic:
>>> import numpy as np
>>> x = np.array([[1.0, -0.5], [-2.0, 3.0]])
>>> print(x)
[[ 1.  -0.5]
 [-2.   3. ]]
>>> mask = (x <= 0)
>>> print(mask)
[[False  True]
 [ True False]]
>>> out = x.copy()
>>> print(out)
[[ 1.  -0.5]
 [-2.   3. ]]
>>> out[mask] = 0
>>> print(out)
[[1. 0.]
 [0. 3.]]
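Continuing the same session (my own addition): the backward pass zeroes dout at exactly the positions recorded in mask during forward:

>>> layer = Relu()
>>> out = layer.forward(x)
>>> layer.backward(np.ones_like(x))
array([[1., 0.],
       [0., 1.]])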
5.5.2 Sigmoid Layer
Implementing the Sigmoid layer in Python:
import numpy as np

class Sigmoid:
    def __init__(self):
        self.out = None

    def forward(self, x):
        out = 1 / (1 + np.exp(-x))
        self.out = out   # keep the output; backward is expressed in terms of it
        return out

    def backward(self, dout):
        dx = dout * (1.0 - self.out) * self.out   # sigmoid'(x) = y * (1 - y)
        return dx
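A quick sanity check (my own addition, not from the book's text) that backward matches a numerical derivative, using the identity sigmoid'(x) = y(1 - y):

import numpy as np

layer = Sigmoid()
x = np.array([0.5, -1.0, 2.0])
y = layer.forward(x)
analytic = layer.backward(np.ones_like(x))  # dout = 1
h = 1e-6
sig = lambda v: 1 / (1 + np.exp(-v))
numeric = (sig(x + h) - sig(x - h)) / (2 * h)
print(np.allclose(analytic, numeric))  # True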
5.6 Implementing the Affine/Softmax Layers
5.6.1 Affine Layer
5.6.2 Batch Version of the Affine Layer
The layer class (a simplified version, not the book's official one):
class Affine:
    def __init__(self, W, b):
        self.W = W
        self.b = b
        self.x = None
        self.dW = None
        self.db = None

    def forward(self, x):
        self.x = x
        out = np.dot(x, self.W) + self.b
        return out

    def backward(self, dout):
        dx = np.dot(dout, self.W.T)        # gradient w.r.t. the input
        self.dW = np.dot(self.x.T, dout)   # gradient w.r.t. the weights
        self.db = np.sum(dout, axis=0)     # bias gradient, summed over the batch
        return dx
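A small shape check (my own illustration, with arbitrary sizes): for a batch of N = 2 inputs with 3 features and 4 output units, dx, dW, and db come back with the same shapes as x, W, and b:

import numpy as np

W = np.random.randn(3, 4)
b = np.random.randn(4)
layer = Affine(W, b)

x = np.random.randn(2, 3)                # batch of 2 samples
out = layer.forward(x)                   # shape (2, 4)
dx = layer.backward(np.ones_like(out))
print(dx.shape, layer.dW.shape, layer.db.shape)  # (2, 3) (3, 4) (4,)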
The layer class (extended to handle 4-D input tensors):
class Affine:
    def __init__(self, W, b):
        self.W = W
        self.b = b
        self.x = None
        self.original_x_shape = None
        # gradients of the weight and bias parameters
        self.dW = None
        self.db = None

    def forward(self, x):
        # handle tensor input: flatten everything after the batch dimension
        self.original_x_shape = x.shape
        x = x.reshape(x.shape[0], -1)
        self.x = x
        out = np.dot(self.x, self.W) + self.b
        return out

    def backward(self, dout):
        dx = np.dot(dout, self.W.T)
        self.dW = np.dot(self.x.T, dout)
        self.db = np.sum(dout, axis=0)
        dx = dx.reshape(*self.original_x_shape)  # restore the input's original shape (for tensor input)
        return dx
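With this version (my own illustration), a 4-D input such as a batch of images is flattened to (N, -1) before the dot product, and backward restores the original shape:

import numpy as np

x = np.random.randn(2, 3, 4, 4)          # e.g. (batch, channels, height, width)
W = np.random.randn(3 * 4 * 4, 10)
b = np.zeros(10)
layer = Affine(W, b)

out = layer.forward(x)                   # shape (2, 10)
dx = layer.backward(np.ones_like(out))
print(out.shape, dx.shape)               # (2, 10) (2, 3, 4, 4)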
5.6.3 Softmax-with-Loss Layer
def softmax(a):
    # subtract the max before exponentiating to avoid overflow,
    # and normalize each row separately when a is a batch
    if a.ndim == 2:
        a = a - np.max(a, axis=1, keepdims=True)
        exp_a = np.exp(a)
        return exp_a / np.sum(exp_a, axis=1, keepdims=True)
    exp_a = np.exp(a - np.max(a))
    y = exp_a / np.sum(exp_a)
    return y
# cross-entropy error
def cross_entropy_error(y, t):
    # if y is 1-D (e.g. y = np.array([1, 2, 3]) has shape (3,)), reshape it into a batch of one row
    if y.ndim == 1:
        t = t.reshape(1, t.size)
        y = y.reshape(1, y.size)
    batch_size = y.shape[0]  # number of samples (rows) in y
    return -np.sum(t * np.log(y + 1e-7)) / batch_size
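For example (my own quick check): with a one-hot label selecting the second class and a predicted probability of 0.8 for it, the loss is -log(0.8) ≈ 0.22:

import numpy as np

y = np.array([0.1, 0.8, 0.1])
t = np.array([0, 1, 0])
print(cross_entropy_error(y, t))  # about 0.223 (= -log(0.8))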
class SoftmaxWithLoss:
    def __init__(self):
        self.loss = None  # the loss
        self.y = None     # output of softmax
        self.t = None     # supervised labels (one-hot vectors)

    def forward(self, x, t):
        self.t = t
        self.y = softmax(x)
        self.loss = cross_entropy_error(self.y, self.t)  # cross-entropy error
        return self.loss

    def backward(self, dout=1):
        batch_size = self.t.shape[0]
        dx = (self.y - self.t) / batch_size  # the clean result: (y - t) per sample
        return dx
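Putting it together (my own illustration): for a single sample, backward returns (y - t) / batch_size, the simple difference between the softmax output and the one-hot label that makes pairing softmax with cross-entropy so convenient:

import numpy as np

layer = SoftmaxWithLoss()
x = np.array([[0.3, 2.9, 4.0]])   # raw scores for one sample
t = np.array([[0, 0, 1]])         # one-hot label: correct class is index 2
loss = layer.forward(x, t)
dx = layer.backward()
print(loss)   # about 0.31
print(dx)     # equals (softmax(x) - t) / 1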