[Learning Deep Learning from Scratch with Python 02] Building a Multi-Layer Neural Network Class Using Only NumPy

About the Author

An algorithm engineer who knows insurance inside out, dedicated to promoting sound insurance planning so that every programmer can reach age 35 free of worries. A well-constructed insurance portfolio helps you avoid being impoverished by illness in middle age, with years of hard-earned savings wiped out by medical bills. Interested readers are welcome to send a private message; I am based in Shenzhen and Hong Kong and available for in-person meetings.

Neural Network Class Design

In this section, we will use object-oriented programming to build a simple neural network class that includes forward propagation, loss computation, backpropagation, and parameter updates. This design keeps the network structure clear and makes the code easier to maintain and extend.

1. Class Definition and Initialization

First, we define a neural network class NeuralNetwork and initialize the network's basic parameters.

An example architecture passed in at initialization (each layer is a dict; the implementation below expects string names for the activation and hidden-layer types):

nn_architecture = [
        {"input_dim": 80, "output_dim": 16, "activation": 'relu', "hidden": 'linear'},
        {"input_dim": 16, "output_dim": 8, "activation": 'relu', "hidden": 'linear'},
        {"input_dim": 8, "output_dim": 2, "activation": 'softmax', "hidden": 'linear'},
]
import numpy as np

class NeuralNetwork:
    def __init__(self, nn_architecture, X=None, y=None):
        self.X = X
        self.y = y
        layer_size_list = []
        self.layers = []
        self.activations = []
        self.dropout_layers = []
        # Some layers may omit 'activation', 'hidden', or 'dropout_keep_prob';
        # use .get() with sensible defaults to handle those cases.
        self.nn_architecture = nn_architecture
        for architecture in nn_architecture:
            layer_size_list.append(architecture['output_dim'])
            self.activations.append(architecture.get('activation', 'linear'))
            self.layers.append(architecture.get('hidden', 'linear'))
            self.dropout_layers.append(architecture.get('dropout_keep_prob', 1.0))

        layer_size_list = [nn_architecture[0]['input_dim']] + layer_size_list
        self.layer_count = len(layer_size_list)
        assert self.layer_count - 1 == len(self.activations) == len(self.layers), "their sizes should be the same"

        self.W = [0 for i in range(self.layer_count - 1)]
        self.b = [0 for i in range(self.layer_count - 1)]

        self.dW = [0 for i in range(self.layer_count - 1)]
        self.db = [0 for i in range(self.layer_count - 1)]
        self.Z = [0 for i in range(self.layer_count - 1)]
        self.A = [0 for i in range(self.layer_count - 1)]
        self.dA = [0 for i in range(self.layer_count - 1)]
        self.dZ = [0 for i in range(self.layer_count - 1)]

        # Initialize weights and biases uniformly in [-0.5, 0.5)
        for i in range(self.layer_count - 1):
            self.W[i] = np.random.rand(layer_size_list[i], layer_size_list[i + 1]) - 0.5
            self.b[i] = np.random.rand(1, layer_size_list[i + 1]) - 0.5
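As a quick sanity check, the short sketch below (my own addition, assuming the example nn_architecture above) verifies that each weight matrix has shape (input_dim, output_dim) and each bias row broadcasts correctly:

nn = NeuralNetwork(nn_architecture)
for i, layer in enumerate(nn_architecture):
    # W[i] maps input_dim -> output_dim; b[i] is a single row that broadcasts over samples
    assert nn.W[i].shape == (layer["input_dim"], layer["output_dim"])
    assert nn.b[i].shape == (1, layer["output_dim"])
print("all parameter shapes match the architecture")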

2. Forward Propagation

In a neural network, forward propagation is the process of computing and storing the activations of each layer.
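With the row-major convention used here (one sample per row), the computation implemented by layer_forward and activation_forward below is, for layer $l$:

$$Z^{[l]} = A^{[l-1]} W^{[l]} + b^{[l]}, \qquad A^{[l]} = g^{[l]}(Z^{[l]})$$

where $A^{[0]} = X$ and $g^{[l]}$ is the layer's activation function.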

    def activation_forward(self, function_name):
        def forward_func(x, alpha=0.01):
            if function_name == 'tanh':
                return np.tanh(x)
            elif function_name == 'sigmoid':
                return 1 / (1 + np.exp(-x))
            elif function_name == 'relu':
                return np.maximum(0, x).astype(np.float64)
            elif function_name == 'leaky_relu':
                return np.where(x > 0, x, alpha * x)
            elif function_name == 'softmax':
                # Subtract the row-wise max for numerical stability
                exp_values = np.exp(x - np.max(x, axis=1, keepdims=True))
                return exp_values / np.sum(exp_values, axis=1, keepdims=True)
            else:
                raise ValueError("this type is not supported")
        return forward_func

    def layer_forward(self, function_name):
        def forward(x, w, b):
            if function_name == 'linear':
                return np.dot(x, w) + b
            else:
                raise ValueError("this type is not supported")
        return forward

    def activation_backward(self, function_name):
        def backward(x, alpha=0.01):
            if function_name == 'tanh':
                return 1 - np.tanh(x) ** 2
            elif function_name == 'sigmoid':
                s = 1 / (1 + np.exp(-x))  # more stable than exp(x)/(1+exp(x))**2 for large x
                return s * (1 - s)
            elif function_name == 'relu':
                return (x > 0).astype(np.float64)
            elif function_name == 'leaky_relu':
                return np.where(x > 0, 1, alpha)
            elif function_name == 'softmax':
                # The softmax derivative is folded into the cross-entropy
                # term (dZ = A - Y) in backward(), so return 1 here
                return 1
            elif function_name == 'linear':
                return 1
            else:
                raise ValueError("this type is not supported")
        return backward
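To build confidence in these derivatives, a minimal finite-difference check (my own addition, not part of the class) can compare each backward function against a numerical slope:

nn = NeuralNetwork(nn_architecture)
x = np.random.randn(4, 3)
eps = 1e-6
for name in ['tanh', 'sigmoid', 'relu', 'leaky_relu']:
    f = nn.activation_forward(name)
    analytic = nn.activation_backward(name)(x)
    numeric = (f(x + eps) - f(x - eps)) / (2 * eps)
    # Differences should be tiny except near the kink at x = 0 for the ReLU variants
    print(name, np.max(np.abs(analytic - numeric)))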
        
    def forward(self, X=None):
        """Run a full forward pass and return the output layer's activations.

        Args:
            X: input matrix of shape (n_samples, input_dim);
               defaults to self.X when omitted.

        Returns:
            Activations of the last layer, shape (n_samples, output_dim).
        """
        if X is None:
            X = self.X
        self.Z[0] = self.layer_forward(self.layers[0])(X, self.W[0], self.b[0])
        self.A[0] = self.activation_forward(self.activations[0])(self.Z[0])

        for i, layer, activation in zip(range(1, self.layer_count - 1), self.layers[1:], self.activations[1:]):  # iter count: (n-1)-1
            self.Z[i] = self.layer_forward(layer)(self.A[i - 1], self.W[i], self.b[i])
            self.A[i] = self.activation_forward(activation)(self.Z[i])

        return self.A[-1]

3. Backpropagation

Backpropagation passes the error backward through the network to compute the gradients used to update the weights and biases.
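For a softmax output trained with the cross-entropy loss $L = -\frac{1}{m}\sum_{i}\sum_{k} y_{ik} \log a_{ik}$, the gradients take the standard form implemented below:

$$dZ^{[L]} = A^{[L]} - Y, \qquad dA^{[l]} = dZ^{[l+1]}\,(W^{[l+1]})^{T}, \qquad dZ^{[l]} = dA^{[l]} \odot g'^{[l]}(Z^{[l]})$$

$$dW^{[l]} = \frac{1}{m}\,(A^{[l-1]})^{T} dZ^{[l]}, \qquad db^{[l]} = \frac{1}{m}\sum_{i=1}^{m} dZ^{[l]}_{i,:}$$

where $m$ is the number of samples and $Y$ is the one-hot label matrix.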

    def backward(self, X=None, y=None):
        if X is None:
            X = self.X
        if y is None:
            y = self.y
        num_rows = X.shape[0]
        # For softmax + cross-entropy the output-layer gradient simplifies
        # to dZ = A - Y, where y is the one-hot label matrix
        self.dZ[-1] = self.A[-1] - y
        for i, activation in zip(range(self.layer_count - 1)[::-1][1:], self.activations[::-1][1:]):  # start from the second-to-last layer after reversing
            self.dA[i] = self.dZ[i + 1].dot(self.W[i + 1].T)  # multiplication order matches the row-major sample convention
            self.dZ[i] = self.dA[i] * self.activation_backward(activation)(self.Z[i])

        self.dW[0] = X.T.dot(self.dZ[0]) / num_rows
        self.db[0] = np.sum(self.dZ[0], axis=0, keepdims=True) / num_rows
        for i in range(1, self.layer_count - 1):  # when layer_count == 2, this loop does nothing
            self.dW[i] = (self.A[i - 1]).T.dot(self.dZ[i]) / num_rows  # the 1/m factor keeps the gradient scale independent of batch size
            self.db[i] = np.sum(self.dZ[i], axis=0, keepdims=True) / num_rows
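A network-level gradient check is a good way to verify the backward pass end to end. The sketch below is my own addition; it assumes an instance nn built with one-hot labels, as in the demo at the end of this post, and compares one analytic weight gradient against a central difference of the cross-entropy loss:

def cross_entropy(probs, y_onehot):
    # Mean cross-entropy loss; the small epsilon guards against log(0)
    return -np.mean(np.sum(y_onehot * np.log(probs + 1e-12), axis=1))

nn.forward()
nn.backward()
analytic = nn.dW[0][0, 0]

eps = 1e-5
orig = nn.W[0][0, 0]
nn.W[0][0, 0] = orig + eps
loss_plus = cross_entropy(nn.forward(), nn.y)
nn.W[0][0, 0] = orig - eps
loss_minus = cross_entropy(nn.forward(), nn.y)
nn.W[0][0, 0] = orig  # restore the weight

numeric = (loss_plus - loss_minus) / (2 * eps)
print(analytic, numeric)  # the two should agree to several decimal places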

4. Parameter Update

The parameter update uses gradient descent to adjust the weights and biases according to the gradients computed during backpropagation.
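The update rule is plain gradient descent with learning rate $\eta$:

$$W^{[l]} \leftarrow W^{[l]} - \eta \, dW^{[l]}, \qquad b^{[l]} \leftarrow b^{[l]} - \eta \, db^{[l]}$$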

    def parameter_update(self, learning_rate):
        for i in range(self.layer_count-1):
            self.W[i] -= self.dW[i] * learning_rate
            self.b[i] -= self.db[i] * learning_rate

5. The Complete Model Training Process

Finally, we can train our neural network model by calling these methods in sequence.


    def vanilla_train(self, epochs, learning_rate):  # uses all the data in every epoch (full-batch gradient descent)
        for epoch in range(epochs):
            # Perform a forward pass, a backward pass, and an update for each epoch
            self.forward()
            self.backward()
            self.parameter_update(learning_rate)
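Full-batch updates are fine for a few hundred samples, but on larger datasets mini-batch updates usually converge faster per pass over the data. A minimal mini-batch variant might look like this (my own sketch; batch_size is an assumed extra parameter, not part of the original class):

    def minibatch_train(self, epochs, learning_rate, batch_size=32):
        num_rows = self.X.shape[0]
        for epoch in range(epochs):
            # Shuffle so each epoch visits the batches in a different order
            order = np.random.permutation(num_rows)
            for start in range(0, num_rows, batch_size):
                batch = order[start:start + batch_size]
                X_batch, y_batch = self.X[batch], self.y[batch]
                self.forward(X_batch)
                self.backward(X_batch, y_batch)
                self.parameter_update(learning_rate)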

6. Defining Functions to Inspect Model Performance

    def predict(self, X):
        """Compute predictions with just a forward pass."""
        self.forward(X)
        return np.round(self.A[-1]).astype(np.int64)

    def predict_prob(self, X):
        self.forward(X)
        return self.A[-1]

    def get_prediction_on_multi_class(self, X):
        self.forward(X)
        return np.argmax(self.A[-1], 1)

    def get_accuracy(self, X, Y):
        predictions = self.get_prediction_on_multi_class(X)
        accuracy = np.sum(predictions == Y) / Y.size
        print(f"The present accuracy is {accuracy}")
        return accuracy

Practical Application: Generating Data and Testing the Model

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification

# Create synthetic data with two classes
X, y = make_classification(n_samples=300, n_features=9, n_informative=4, n_redundant=0, n_classes=2,
                           n_clusters_per_class=1, random_state=42)

# Plot the first two of the nine features
plt.figure(figsize=(8, 6))
plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.Paired)
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.title('Synthetic Data with Two Classes')
plt.show()

Since the output layer uses softmax, the labels y must be converted to one-hot encoding before training.
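One minimal way to do the conversion with NumPy is to index an identity matrix by the class labels:

y_onehot = np.eye(2)[y]  # shape (n_samples, 2); row i is the one-hot vector for y[i]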

nn_architecture = [
        {"input_dim": 9, "output_dim": 16, "activation": 'relu', "hidden": 'linear'},
        {"input_dim": 16, "output_dim": 8, "activation": 'relu', "hidden": 'linear'},
        {"input_dim": 8, "output_dim": 2, "activation": 'softmax', "hidden": 'linear'},
]
nn = NeuralNetwork(nn_architecture, X, y_onehot)
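With everything wired up, training and evaluation take two calls. The hyperparameters below (1000 epochs, learning rate 0.1) are illustrative choices, not tuned values; note that get_accuracy takes the original integer labels y, since it compares them against argmax predictions:

nn.vanilla_train(epochs=1000, learning_rate=0.1)
nn.get_accuracy(X, y)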