A Hand-Written BP Neural Network in numpy: Classification

进击的墨菲特

Tags: numpy, neural network classification, pandas, machine learning

Article link: https://blog.csdn.net/x_8efengfan/article/details/135375786

Preface

Those who begin well are many; those who finish well are few.

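1. Problem Description

In the earlier post on hand-writing a BP neural network in numpy (numpy手写BP神经网络) we built a BP neural network of shape 5×10×10×5. That model has two hidden layers and performs multi-class classification with one-hot encoded labels, but it classified poorly: its prediction accuracy was only 27%.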

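1.1 Why the model's prediction accuracy was low

The resource files have been uploaded and can be downloaded.
We want to measure each subject's programming ability with five indicators, A, B, C, D, and E. After normalizing the data, we computed the weight of each indicator with the entropy weight method and the maximizing deviation method, used combination weighting to obtain each subject's composite score, and now assign a programming ability level based on that score. The levels are described in the figure:
[Figure: description of the programming ability levels]
The BP neural network built in the earlier post is shown in the figure. We wanted to feed in the five indicator values and get the subject's programming ability level back directly, but the 5×10×10×5 prediction model was wrong: it turned continuous data into discrete data.
[Figure: the 5×10×10×5 BP neural network]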

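Take the level "weak" as an example: a subject whose composite score lies between 0 and 0.2 is considered weak at programming, but the 5×10×10×5 model treats a subject as weak only when its output is [1,0,0,0,0].
Take the level "average" as an example: a subject whose composite score lies between 0.4 and 0.6 is considered average at programming, but the 5×10×10×5 model treats a subject as average only when its output is [0,0,1,0,0].

In short, the model predicted poorly because the BP neural network was built the wrong way.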

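1.2 Solution

For a classification problem on continuous data, the output layer of the BP neural network needs only one neuron: the network first outputs the subject's composite score, and the score is then classified.
In other words, we turn the classification of continuous data into a regression-plus-classification problem. Starting from the network in 1.1, we reduce the number of output neurons and build the network shown in the figure:
[Figure: the BP neural network with a single output neuron]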
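
The second half of this idea, turning a predicted score into a level, can be sketched as follows. This is a minimal illustration rather than code from the post: the helper name score_to_level is hypothetical, and the boundaries 0.2, 0.4, 0.6 and 0.7 are assumed to be the level boundaries because they are the thresholds used by the accuracy function in section 2.7.

```python
import numpy

# Assumed level boundaries, copied from the thresholds in
# caculate_accuracy_primal (section 2.7).
LEVEL_EDGES = [0.2, 0.4, 0.6, 0.7]

def score_to_level(scores):
    """Map continuous scores in [0, 1] to integer levels 1..5 (hypothetical helper)."""
    scores = numpy.asarray(scores, dtype=float)
    # digitize gives 0 for scores < 0.2, 1 for [0.2, 0.4), ..., 4 for scores >= 0.7
    return numpy.digitize(scores, LEVEL_EDGES) + 1

levels = score_to_level([0.1, 0.3, 0.5, 0.65, 0.9])
print(levels)
```

With this split, the network only has to learn the score; the level boundaries stay outside the model and can be changed without retraining.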

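2. Python Code

2.1 BP neural network workflow

A BP neural network works in four main steps; see the earlier post on the hand-written BP neural network for details.

Forward propagation -> compute the error -> backpropagation -> update the weights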

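2.2 Initializing the parameters

    '''
        input, hidden, output: the number of neurons in the input layer, each hidden layer, and the output layer
    '''
    def __init__(self,input,hidden,output):
        self.weight1 = numpy.random.randn(input,hidden)
        self.weight2 = numpy.random.randn(hidden,hidden)
        self.weight3 = numpy.random.randn(hidden,output)
        # accuracy: number of correct predictions divided by the total number of samples
        self.accuracy = []
        # precision: for one class, samples correctly predicted as that class divided by all samples predicted as that class
        self.precision = []
        # recall: for one class, the fraction of the samples of that class that were predicted correctly
        self.recall = []
        # loss values
        self.loss = []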
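
The precision and recall lists are declared here, but the code in this post never fills them. As a hedged sketch of what the two comments describe, per-level precision and recall could be computed from arrays of integer levels like this. The function name precision_recall and the sample arrays are illustrative assumptions, not part of the original code.

```python
import numpy

def precision_recall(true_levels, pred_levels, level):
    """Precision and recall for one level (hypothetical helper)."""
    true_levels = numpy.asarray(true_levels)
    pred_levels = numpy.asarray(pred_levels)
    # samples that are this level and were also predicted as this level
    true_pos = numpy.sum((pred_levels == level) & (true_levels == level))
    predicted = numpy.sum(pred_levels == level)   # samples predicted as this level
    actual = numpy.sum(true_levels == level)      # samples that really are this level
    # guard against division by zero when a level never occurs
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    return float(precision), float(recall)

# made-up levels for six samples
true_levels = [1, 2, 2, 3, 3, 3]
pred_levels = [1, 2, 3, 3, 3, 2]
print(precision_recall(true_levels, pred_levels, 3))
```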

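2.3 Forward propagation

2.3.1 Activation function: sigmoid

Both the hidden layers and the output layer use the sigmoid activation function (spelled sigmod in the code). It is applied during forward propagation and is defined as:

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

    # sigmoid activation function
    def sigmod(self,x):
        return 1/(1 + numpy.exp(-x))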

2.3.2 Forward propagation code

Note: in forward propagation, both the hidden layers and the output layer use the sigmoid activation function.

    # forward propagation
    def forward(self, data):
        # store the input and output of every layer
        self.hidden1_input = numpy.dot(data, self.weight1)
        self.hidden1_output = self.sigmod(self.hidden1_input)

        self.hidden2_input = numpy.dot(self.hidden1_output,self.weight2)
        self.hidden2_output = self.sigmod(self.hidden2_input)

        self.output_input = numpy.dot(self.hidden2_output,self.weight3)
        self.output_output = self.sigmod(self.output_input)
        return self.output_output
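
To make the array shapes in the forward pass concrete, here is a standalone sketch with a batch of four made-up samples. The free function sigmoid and the variable names are assumptions for illustration; the layer sizes follow the 5, 10, 1 arguments used later in the post.

```python
import numpy

def sigmoid(x):
    return 1 / (1 + numpy.exp(-x))

rng = numpy.random.default_rng(0)
batch = rng.random((4, 5))                # 4 made-up samples, 5 indicators each
weight1 = rng.standard_normal((5, 10))    # input layer -> hidden layer 1
weight2 = rng.standard_normal((10, 10))   # hidden layer 1 -> hidden layer 2
weight3 = rng.standard_normal((10, 1))    # hidden layer 2 -> output layer

hidden1 = sigmoid(batch @ weight1)        # shape (4, 10)
hidden2 = sigmoid(hidden1 @ weight2)      # shape (4, 10)
output = sigmoid(hidden2 @ weight3)       # shape (4, 1), one score per sample
print(hidden1.shape, hidden2.shape, output.shape)
```

Each row of the output is one predicted score, and the final sigmoid keeps it strictly between 0 and 1, the same range as the composite score.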

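2.4 Backpropagation (the most important step)

2.4.1 Derivative of the sigmoid activation function

Backpropagation covers the last three operations from 2.1: compute the error -> backpropagation -> update the weights. The derivative of the sigmoid function is used during backpropagation:

$$\sigma'(x) = \sigma(x)\,\bigl(1 - \sigma(x)\bigr)$$

    # derivative of the sigmoid function; the argument x is the sigmoid output, not the pre-activation input
    def sigmod_derivative(self, x):
        return x * (1 - x)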
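
Because sigmod_derivative is written in terms of the sigmoid output, it has to be called with values that have already passed through the activation, which is how backward uses it. The following standalone check compares that form against a central finite difference; the function names and test points are illustrative assumptions.

```python
import numpy

def sigmoid(x):
    return 1 / (1 + numpy.exp(-x))

def sigmoid_derivative_from_output(s):
    # s must already be sigmoid(x); this mirrors sigmod_derivative above
    return s * (1 - s)

x = numpy.array([-1.5, 0.0, 0.8])         # arbitrary test points
eps = 1e-6
# central finite difference approximation of the derivative at x
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
analytic = sigmoid_derivative_from_output(sigmoid(x))
print(numpy.allclose(numeric, analytic, atol=1e-8))
```

Passing the pre-activation x directly to the derivative function would compute x * (1 - x), which is not the sigmoid derivative.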

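2.4.2 Loss function: squared error

The loss function measures the difference between the model's output and the true value. Based on experience, the BP neural network from 1.2 uses the squared error, half the sum of squared differences, as its loss function:

$$L = \frac{1}{2}\sum_{i}(y_i - p_i)^2$$

y is the true value (label) of a record;
p is the model's output (predicted value) for that record.

    # half the sum of squared errors, used as the loss function
    def loss_mse(self,x,y):
        return 1/2*numpy.sum((x-y)*(x-y))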
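
A small worked example of this loss, with made-up labels and predictions: the differences are -0.1, 0 and 0.2, their squares sum to 0.05, and half of that is 0.025. The standalone function below mirrors the method above.

```python
import numpy

def loss_mse(y, p):
    # half the sum of squared errors, as in the method above
    return 1 / 2 * numpy.sum((y - p) * (y - p))

y = numpy.array([[0.2], [0.5], [0.9]])    # made-up labels
p = numpy.array([[0.3], [0.5], [0.7]])    # made-up predictions
print(loss_mse(y, p))                     # approximately 0.025
```

Note that the value is a sum over samples, not a mean, so it grows with the size of the dataset.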

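2.4.3 Backpropagation code

    # backpropagation
    def backward(self, data, label, learning_ration):
        # error at the output layer: prediction minus label
        output_error = self.output_output - label
        # output-layer error term (combines the error and the activation derivative)
        output_delta = output_error * self.sigmod_derivative(self.output_output)
        # propagate the output-layer error term back to hidden layer 2
        hidden2_error = numpy.dot(output_delta,self.weight3.T) * self.sigmod_derivative(self.hidden2_output)
        # propagate the hidden layer 2 error term back to hidden layer 1
        hidden1_error = numpy.dot(hidden2_error,self.weight2.T) * self.sigmod_derivative(self.hidden1_output)

        # all three error terms are known, so the weights can be updated
        self.weight1 -= numpy.dot(data.T,hidden1_error) * learning_ration
        self.weight2 -= numpy.dot(self.hidden1_output.T, hidden2_error) * learning_ration
        # note: this update uses output_error rather than output_delta, so it leaves out the
        # sigmoid-derivative factor that the exact squared-error gradient contains
        self.weight3 -= numpy.dot(self.hidden2_output.T, output_error) * learning_ration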
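
One way to validate backpropagation formulas is a numerical gradient check. The sketch below is not from the post: it uses standalone functions, a smaller made-up network, and the exact gradients of the squared-error loss, which means the last layer uses the error term multiplied by the sigmoid derivative (output_delta), whereas the weight3 update above uses output_error. All names here are assumptions for illustration.

```python
import numpy

def sigmoid(x):
    return 1 / (1 + numpy.exp(-x))

def forward(data, w1, w2, w3):
    h1 = sigmoid(data @ w1)
    h2 = sigmoid(h1 @ w2)
    out = sigmoid(h2 @ w3)
    return h1, h2, out

def loss(data, label, w1, w2, w3):
    out = forward(data, w1, w2, w3)[2]
    return 0.5 * numpy.sum((out - label) ** 2)

def analytic_grads(data, label, w1, w2, w3):
    # exact gradients of the squared-error loss with respect to each weight matrix
    h1, h2, out = forward(data, w1, w2, w3)
    output_delta = (out - label) * out * (1 - out)
    hidden2_delta = (output_delta @ w3.T) * h2 * (1 - h2)
    hidden1_delta = (hidden2_delta @ w2.T) * h1 * (1 - h1)
    return data.T @ hidden1_delta, h1.T @ hidden2_delta, h2.T @ output_delta

def numeric_grad(f, w, eps=1e-6):
    # central finite differences, perturbing one weight at a time in place
    grad = numpy.zeros_like(w)
    for idx in numpy.ndindex(*w.shape):
        old = w[idx]
        w[idx] = old + eps
        up = f()
        w[idx] = old - eps
        down = f()
        w[idx] = old
        grad[idx] = (up - down) / (2 * eps)
    return grad

rng = numpy.random.default_rng(1)
data = rng.random((6, 5))                 # made-up inputs
label = rng.random((6, 1))                # made-up targets in [0, 1)
w1 = rng.standard_normal((5, 4))
w2 = rng.standard_normal((4, 4))
w3 = rng.standard_normal((4, 1))

g1, g2, g3 = analytic_grads(data, label, w1, w2, w3)
f = lambda: loss(data, label, w1, w2, w3)
print(numpy.allclose(g1, numeric_grad(f, w1), atol=1e-6))
print(numpy.allclose(g2, numeric_grad(f, w2), atol=1e-6))
print(numpy.allclose(g3, numeric_grad(f, w3), atol=1e-6))
```

Since the sigmoid derivative is always positive, the output_error form still moves weight3 in a descent direction; it only changes the step size for each sample.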

2.5 Training the model

Training the model simply repeats forward propagation and backpropagation to find good weight values (this model has no bias terms). Each time a forward and backward pass is run, the model's loss (from the loss function) and its prediction accuracy at that point are recorded.

    # train on the dataset
    def train(self,data,label,learning_ration,epoch):
        for i in range(epoch):
            output = self.forward(data)
            self.backward(data, label, learning_ration)

            loss = self.loss_mse(label,output)
            # loss = self.loss_cross_entropy(label, output)
            self.loss.append(loss)

            accuary = self.caculate_accuracy_primal(output,label)
            self.accuracy.append(accuary)
            # print("accuary:",accuary)
            # self.show_weights()
2.6 Predicting results

Once training is finished the model's parameters are fixed, and a prediction is just one forward pass.

    def predict(self,data):
        return self.forward(data)

2.7 Loss curve and accuracy curve

The loss and the accuracy are both recorded during training (2.5), so all that remains is to plot them with matplotlib.

    def caculate_accuracy_primal(self,actual_label,label):
        actual_label = actual_label.tolist()
        label = label.tolist()
        true_count = 0
        size = len(label)
        for i in range(size):
            al = float(actual_label[i][0])
            l = float(label[i][0])
            # print(f"al is {al} l is {l}")
            if al>=0.7 and l>=0.7:
                true_count+=1
            if al>=0.6 and al<0.7 and l>=0.6 and l<0.7:
                true_count+=1
            if al>=0.4 and al<0.6 and l>=0.4 and l<0.6:
                true_count+=1
            if al>=0.2 and al<0.4 and l>=0.2 and l<0.4:
                true_count+=1
            if al>=0.0 and al<0.2 and l>=0.0 and l<0.2:
                true_count+=1
            # print(f"correct: {true_count}, total: {size}")
        return true_count / size

    def show_loss(self):
        # print(self.loss)
        pyplot.title("LOSS")
        pyplot.xlabel("epoch")
        pyplot.ylabel("loss")
        pyplot.plot(self.loss)
        pyplot.show()

    def show_accuracy(self):
        # print(self.accuracy)
        pyplot.title("Accuracy")
        pyplot.xlabel("epoch")
        pyplot.ylabel("ratio")
        pyplot.plot(self.accuracy)
        pyplot.show()

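Note: the accuracy above is obtained by comparing the model's output (actual value) with the true value (label) according to the programming ability levels from 1.1. A different problem needs its own comparison.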
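
The chain of range checks can also be written in vectorized form. This is a sketch, not the author's method: the name level_accuracy is hypothetical, and it assumes that predictions and labels are never negative, which holds for sigmoid outputs and for scores in [0, 1].

```python
import numpy

LEVEL_EDGES = [0.2, 0.4, 0.6, 0.7]    # thresholds used in caculate_accuracy_primal

def level_accuracy(predicted, label):
    """Fraction of samples whose prediction falls in the same range as the label."""
    predicted_bin = numpy.digitize(numpy.ravel(predicted), LEVEL_EDGES)
    label_bin = numpy.digitize(numpy.ravel(label), LEVEL_EDGES)
    return float(numpy.mean(predicted_bin == label_bin))

# made-up predictions and labels: the second pair falls into different ranges
predicted = numpy.array([[0.15], [0.35], [0.55], [0.72]])
label = numpy.array([[0.10], [0.45], [0.50], [0.90]])
print(level_accuracy(predicted, label))   # 0.75
```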

3. Program Test

3.1 Loading the data

The first 10 records of the dataset are shown in the figure:
[Figure: first 10 records of the dataset]

def load_data_primal():
    df = pandas.read_excel("data2.xlsx")
    data_temp  = df[["A","B","C","D","E"]]
    label_temp = df["最终得分"]  # the "最终得分" column holds the final (composite) score
    data = []
    label = []
    for i in range(df.shape[0]):
        data.append(data_temp.iloc[i].to_list())
        temp = []
        temp.append(label_temp[i])
        label.append(temp)
    data = numpy.array(data)
    label = numpy.array(label)
    return data,label
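
The row-by-row loop can be replaced by a direct conversion with DataFrame.to_numpy. The sketch below assumes the same column names as the post and uses a small in-memory DataFrame with made-up rows in place of data2.xlsx; the helper name frame_to_arrays is hypothetical.

```python
import numpy
import pandas

def frame_to_arrays(df):
    """Split a DataFrame into an (n, 5) input array and an (n, 1) label array."""
    data = df[["A", "B", "C", "D", "E"]].to_numpy(dtype=float)
    # double brackets select a one-column DataFrame, so the result stays 2-D
    label = df[["最终得分"]].to_numpy(dtype=float)
    return data, label

# made-up rows standing in for data2.xlsx
df = pandas.DataFrame({
    "A": [0.1, 0.9], "B": [0.2, 0.8], "C": [0.3, 0.7],
    "D": [0.4, 0.6], "E": [0.5, 0.5], "最终得分": [0.3, 0.7],
})
data, label = frame_to_arrays(df)
print(data.shape, label.shape)   # (2, 5) (2, 1)
```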

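To see more directly how the table data looks in Python, the figure below shows the inputs of the first 10 records, stored as a two-dimensional array.
[Figure: inputs of the first 10 records as a two-dimensional array]
The expected outputs of the first 10 records are stored as a two-dimensional array in the same way.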

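3.2 Splitting the dataset and running the model

The dataset is split as in the earlier post on the hand-written BP neural network. This example has few samples, so the first 75% is used as the training set and the last 25% as the test set for validation.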

def mytest():
    data, label = load_data_primal()
    # print(data[0:10],"\n",label[0:10])
    # split into a training set and a test set
    # note: the slices that end in -1 leave out the last record
    data_train = data[0:int(len(data) * 3 / 4)]
    data_train_train = data_train[0:int(len(data_train) * 3 / 4)]
    data_train_test = data_train[int(len(data_train) * 1 / 4) * (-1):-1]
    data_test = data[int(len(data) * 1 / 4) * (-1):-1]

    label_train = label[0:int(len(label) * 3 / 4)]
    label_train_train = label_train[0:int(len(data_train) * 3 / 4)]
    label_train_test = label_train[int(len(data_train) * 1 / 4) * (-1):-1]
    label_test = label[int(len(label) * 1 / 4) * (-1):-1]

    # create a BP neural network with two hidden layers
    network = BPNet_one_output(5, 10, 1)
    # train the model
    network.train(data_train, label_train, 0.01, 10000)
    network.show_loss()
    network.show_accuracy()
    # predict on the test set
    result = network.predict(data_test)
    print(result)
    print(label_test)
    acc = network.caculate_accuracy_primal(result,label_test)
    print("Accuracy: {:.2f}%".format(acc*100))
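
The test slices in mytest end at -1, so the last record is in neither the training set nor the test set. If every record should be used, a split at a single cut index is one option. The helper below is a hypothetical sketch with made-up arrays, not the split that produced the results reported in this post.

```python
import numpy

def split_train_test(data, label, train_fraction=0.75):
    """Split in order: the first part for training, the rest for testing."""
    cut = int(len(data) * train_fraction)
    return data[:cut], label[:cut], data[cut:], label[cut:]

# made-up arrays: 8 records with 5 inputs and 1 label each
data = numpy.arange(40).reshape(8, 5)
label = numpy.arange(8).reshape(8, 1)
data_train, label_train, data_test, label_test = split_train_test(data, label)
print(len(data_train), len(data_test))   # 6 2
```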

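3.3 Model performance

3.3.1 Loss curve

The loss decreases steadily as training proceeds.
[Figure: loss curve]

3.3.2 Accuracy curve

As training proceeds, the model's prediction accuracy on the training set rises and eventually approaches 100%.
[Figure: accuracy curve on the training set]

3.3.3 Prediction accuracy

The model's prediction accuracy on the test set is 90.91%. Because the initial weights are set randomly, repeated runs will not necessarily give the same result.
[Figure: program output with the prediction accuracy]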
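
Since the weights come from numpy.random.randn, fixing numpy's global seed before the network is constructed makes a run repeatable. A minimal sketch, with 42 as an arbitrary seed chosen for illustration:

```python
import numpy

numpy.random.seed(42)                 # 42 is an arbitrary choice
first = numpy.random.randn(5, 10)

numpy.random.seed(42)                 # same seed, same sequence of draws
second = numpy.random.randn(5, 10)

print(numpy.array_equal(first, second))   # True
```

In the post's code this would mean calling numpy.random.seed once at the start of mytest, before BPNet_one_output is created.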

4. Complete Code

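import numpy
import pandas
from matplotlib import pyplot
'''
    Build a BP neural network with two hidden layers
'''
class BPNet_one_output:
    '''
        input, hidden, output: the number of neurons in the input layer, each hidden layer, and the output layer
    '''
    def __init__(self,input,hidden,output):
        self.weight1 = numpy.random.randn(input,hidden)
        self.weight2 = numpy.random.randn(hidden,hidden)
        self.weight3 = numpy.random.randn(hidden,output)
        # accuracy: number of correct predictions divided by the total number of samples
        self.accuracy = []
        # precision: for one class, samples correctly predicted as that class divided by all samples predicted as that class
        self.precision = []
        # recall: for one class, the fraction of the samples of that class that were predicted correctly
        self.recall = []
        # loss values
        self.loss = []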
    # check whether two lists are identical: returns 1 if they are, 0 otherwise
    def compare(self,list1:list,list2:list):
        if len(list1)!= len(list2):
            return 0
        for i in range(len(list1)):
            if list1[i]!=list2[i]:
                return 0
        return 1

    def caculate_accuracy(self,actual_label,label):
        true_count = 0
        false_count = 0
        result = []
        for i in range(len(actual_label)):
            # one-hot encode each output row
            temp = self.one_hot_encoding(actual_label[i])
            result.append(temp)
        actual_label = result
        size = len(label)
        for i in range(size):
            rs = self.compare(actual_label[i],label[i])
            if rs==1:
                true_count += 1
            else:
                false_count += 1
        # print(f"correct: {true_count}, total: {size}")
        return true_count/size

    def caculate_accuracy_primal(self,actual_label,label):
        actual_label = actual_label.tolist()
        label = label.tolist()
        true_count = 0
        size = len(label)
        for i in range(size):
            al = float(actual_label[i][0])
            l = float(label[i][0])
            # print(f"al is {al} l is {l}")
            if al>=0.7 and l>=0.7:
                true_count+=1
            if al>=0.6 and al<0.7 and l>=0.6 and l<0.7:
                true_count+=1
            if al>=0.4 and al<0.6 and l>=0.4 and l<0.6:
                true_count+=1
            if al>=0.2 and al<0.4 and l>=0.2 and l<0.4:
                true_count+=1
            if al>=0.0 and al<0.2 and l>=0.0 and l<0.2:
                true_count+=1
            # print(f"correct: {true_count}, total: {size}")
        return true_count / size

    def one_hot_encoding(self,data:list):
        max = data[0]
        max_index = 0
        for i in range(len(data)):
            if data[i]>max:
                max = data[i]
                max_index = i
        for i in range(len(data)):
            if i==max_index:
                data[i]=1
            else:
                data[i]=0
        return data

    # sigmoid activation function
    def sigmod(self,x):
        return 1/(1 + numpy.exp(-x))

    # derivative of the sigmoid function; the argument x is the sigmoid output, not the pre-activation input
    def sigmod_derivative(self, x):
        return x * (1 - x)

    # softmax activation function
    def softmax(self,x):
        # computed row by row, one sample per row;
        # the row maximum is subtracted first so the exponentials cannot overflow
        exps = numpy.exp(x - numpy.max(x,axis=1,keepdims=True))
        return exps/numpy.sum(exps,axis=1,keepdims=True)

    def loss_cross_entropy(self,y,p):
        '''
        :param y: true labels
        :param p: predicted labels
        :return: cross-entropy
        '''
        # add a tiny value to avoid log(0)
        min_data = 1e-60
        # return -1 * numpy.sum(y*numpy.log(p+min_data))
        return -numpy.mean(y*numpy.log(p+min_data))

    def loss_cross_entropy_derivative(self,label_true,label_predict):
        return label_true - label_predict

    # half the sum of squared errors, used as the loss function
    def loss_mse(self,x,y):
        return 1/2*numpy.sum((x-y)*(x-y))

    # forward propagation
    def forward(self, data):
        # store the input and output of every layer
        self.hidden1_input = numpy.dot(data, self.weight1)
        self.hidden1_output = self.sigmod(self.hidden1_input)

        self.hidden2_input = numpy.dot(self.hidden1_output,self.weight2)
        self.hidden2_output = self.sigmod(self.hidden2_input)

        self.output_input = numpy.dot(self.hidden2_output,self.weight3)
        self.output_output = self.sigmod(self.output_input)
        return self.output_output

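    # backpropagation
    def backward(self, data, label, learning_ration):
        # error at the output layer: prediction minus label
        output_error = self.output_output - label
        # output-layer error term (combines the error and the activation derivative)
        output_delta = output_error * self.sigmod_derivative(self.output_output)
        # propagate the output-layer error term back to hidden layer 2
        hidden2_error = numpy.dot(output_delta,self.weight3.T) * self.sigmod_derivative(self.hidden2_output)
        # propagate the hidden layer 2 error term back to hidden layer 1
        hidden1_error = numpy.dot(hidden2_error,self.weight2.T) * self.sigmod_derivative(self.hidden1_output)

        # all three error terms are known, so the weights can be updated
        self.weight1 -= numpy.dot(data.T,hidden1_error) * learning_ration
        self.weight2 -= numpy.dot(self.hidden1_output.T, hidden2_error) * learning_ration
        # note: this update uses output_error rather than output_delta, so it leaves out the
        # sigmoid-derivative factor that the exact squared-error gradient contains
        self.weight3 -= numpy.dot(self.hidden2_output.T, output_error) * learning_ration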

    # train on the dataset
    def train(self,data,label,learning_ration,epoch):
        for i in range(epoch):
            output = self.forward(data)
            self.backward(data, label, learning_ration)

            loss = self.loss_mse(label, output)
            # loss = self.loss_cross_entropy(label, output)
            self.loss.append(loss)
            accuary = self.caculate_accuracy_primal(output, label)
            self.accuracy.append(accuary)
            # print("accuary:",accuary)
            # self.show_weights()

    # predict with the trained model and return one-hot encoded results
    def predict_ont_hot(self,data:list):
        data = self.predict(data)
        result = []
        for i in range(len(data)):
            # convert the numpy.ndarray row to a plain list
            temp = self.one_hot_encoding(data[i].tolist())
            result.append(temp)
        return result

    def predict(self,data):
        return self.forward(data)

    def show_weights(self):
        print(f"{self.weight1}\n{self.weight2}\n{self.weight3}")

    def show_loss(self):
        # print(self.loss)
        pyplot.title("LOSS")
        pyplot.xlabel("epoch")
        pyplot.ylabel("loss")
        pyplot.plot(self.loss)
        pyplot.show()

    def show_accuracy(self):
        # print(self.accuracy)
        pyplot.title("Accuracy")
        pyplot.xlabel("epoch")
        pyplot.ylabel("ratio")
        pyplot.plot(self.accuracy)
        pyplot.show()

def load_data():
    df = pandas.read_excel("data2.xlsx")
    data_temp  = df[["A","B","C","D","E"]]
    label_temp = df["评级分"]  # the "评级分" column holds the level as an integer
    data = []
    label = []
    DIMENSION = len(data_temp.columns)
    for i in range(df.shape[0]):
        data.append(data_temp.iloc[i].to_list())
        # one-hot encode the label of each record
        temp = []
        for j in range(DIMENSION):
            temp.append(0)
        index = label_temp[i]-1
        temp[index] = 1
        # temp.append(label_temp[i])
        label.append(temp)
    data = numpy.array(data)
    label = numpy.array(label)
    return data,label

def load_data_primal():
    df = pandas.read_excel("data2.xlsx")
    data_temp  = df[["A","B","C","D","E"]]
    label_temp = df["最终得分"]  # the "最终得分" column holds the final (composite) score
    data = []
    label = []
    for i in range(df.shape[0]):
        data.append(data_temp.iloc[i].to_list())
        temp = []
        temp.append(label_temp[i])
        label.append(temp)
    data = numpy.array(data)
    label = numpy.array(label)
    return data,label

def test():
    # create the training dataset
    X_train = numpy.array([[0, 0],
                           [0, 1],
                           [1, 0],
                           [1, 1]])
    y_train = numpy.array([[0],
                           [1],
                           [1],
                           [0]])

    # create the test dataset
    X_test = numpy.array([[0, 0],
                          [0, 1],
                          [1, 0],
                          [1, 1]])
    y_test = numpy.array([[0],
                          [1],
                          [1],
                          [0]])
    learning_ration = 0.01
    network = BPNet_one_output(2, 10, 1)
    network.train(X_train, y_train, learning_ration, 50000)
    print(network.predict(X_test))
    network.show_loss()

def mytest():
    data, label = load_data_primal()
    # print(data[0:10],"\n",label[0:10])
    # split into a training set and a test set
    # note: the slices that end in -1 leave out the last record
    data_train = data[0:int(len(data) * 3 / 4)]
    data_train_train = data_train[0:int(len(data_train) * 3 / 4)]
    data_train_test = data_train[int(len(data_train) * 1 / 4) * (-1):-1]
    data_test = data[int(len(data) * 1 / 4) * (-1):-1]

    label_train = label[0:int(len(label) * 3 / 4)]
    label_train_train = label_train[0:int(len(data_train) * 3 / 4)]
    label_train_test = label_train[int(len(data_train) * 1 / 4) * (-1):-1]
    label_test = label[int(len(label) * 1 / 4) * (-1):-1]

    # create a BP neural network with two hidden layers
    network = BPNet_one_output(5, 10, 1)
    # train the model
    network.train(data_train, label_train, 0.01, 10000)
    network.show_loss()
    network.show_accuracy()
    # predict on the test set
    result = network.predict(data_test)
    print(result)
    print(label_test)
    acc = network.caculate_accuracy_primal(result,label_test)
    print("Accuracy: {:.2f}%".format(acc*100))

if __name__ == '__main__':
    mytest()

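5. Other Issues

During training, either of the two situations below may occur, in which the loss curve does not fall as training proceeds the way we expect. There are two causes:

① An unsuitable loss function (the cross-entropy function was used).
② The steps in the train method were in the wrong order.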
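
A small check suggests why the cross-entropy formula from the complete code can give a misleading curve when the labels are continuous scores: its value is lower for predictions saturated at 1 than for predictions that equal the labels, so it does not have to fall as the fit improves. This is an illustration with made-up labels, not an analysis taken from the post.

```python
import numpy

def loss_mse(y, p):
    # half the sum of squared errors, as in loss_mse
    return 1 / 2 * numpy.sum((y - p) * (y - p))

def loss_cross_entropy(y, p):
    # same formula as the loss_cross_entropy method in the complete code
    return -numpy.mean(y * numpy.log(p + 1e-60))

y = numpy.array([[0.2], [0.5]])       # made-up continuous labels
perfect = y.copy()                    # predictions equal to the labels
all_ones = numpy.ones_like(y)         # predictions saturated at 1

print(loss_mse(y, perfect), loss_mse(y, all_ones))
print(loss_cross_entropy(y, perfect), loss_cross_entropy(y, all_ones))
```

The squared error is zero exactly when the predictions match the labels, which is the behavior a curve for this regression-style output needs.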

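5.1 The loss curve is a horizontal line

[Figure: loss curve that stays flat]

5.2 The loss curve rises instead of falling

[Figure: loss curve that increases]

5.3 Fix

① Reorder the code in the train method;
② Change the loss function in the train method to the squared error.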

    # train on the dataset
    def train(self,data,label,learning_ration,epoch):
        for i in range(epoch):
            output = self.forward(data)
            self.backward(data, label, learning_ration)

            loss = self.loss_mse(label, output)
            # loss = self.loss_cross_entropy(label, output)
            self.loss.append(loss)
            accuary = self.caculate_accuracy_primal(output, label)
            self.accuracy.append(accuary)
            # print("accuary:",accuary)
            # self.show_weights()
