Principles and Python Implementation of the Extreme Learning Machine (ELM)

This article introduces the basic principles of the Extreme Learning Machine (ELM), a single-hidden-layer neural network whose hidden weights and biases are generated randomly and never adjusted afterwards. The author provides a Python implementation of ELM, including code and test cases for both regression and classification. The code covers activation-function selection and data preprocessing, and shows how to compute RMSE and classification accuracy.

I was recently reading Professor Guang-Bin Huang's original ELM paper, but the code link in the paper no longer works. After searching through various channels I finally found the code Professor Huang wrote at the time; however, it is a Matlab version, which was inconvenient for my purposes, so I wrote a Python version myself.

Basic Principles

The Extreme Learning Machine (ELM) is a single-hidden-layer neural network. Its defining feature is that the weights and biases of the hidden-layer nodes are generated randomly and are never adjusted afterwards; the only parameters that need to be determined are the output-layer weights $\boldsymbol\beta$.

Let the hidden-layer output be $\boldsymbol H$ and the final network output be $\boldsymbol T$. The output-layer weights $\boldsymbol\beta$ then satisfy $\boldsymbol H \boldsymbol\beta = \boldsymbol T$, which gives $\boldsymbol\beta = \boldsymbol H^\dagger \boldsymbol T$, where $\boldsymbol H^\dagger$ is the Moore-Penrose generalized inverse of $\boldsymbol H$.
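To make the solve step concrete, here is a minimal NumPy sketch of recovering $\boldsymbol\beta$ from $\boldsymbol H$ and $\boldsymbol T$ with the pseudoinverse. It is separate from the full implementation below, and the shapes are made up for illustration:

import numpy as np

# Illustrative shapes: 100 samples, 20 hidden nodes, 3 output dimensions
H = np.random.rand(100, 20)       # hidden-layer output matrix
T = np.random.rand(100, 3)        # target matrix
beta = np.linalg.pinv(H) @ T      # least-squares solution of H @ beta = T
print(beta.shape)                 # (20, 3)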

The ELM network structure is very simple, as shown below.
(Figure: ELM network structure)
Here $\boldsymbol w$ and $\boldsymbol b$ are the weight matrix and bias vector of the hidden-layer nodes, and $\boldsymbol\beta$ is the output-layer weight matrix.
Suppose we have a set of samples $(\boldsymbol X, \boldsymbol T)$. The data matrix $\boldsymbol X$ has dimension $n \times d$, where $n$ is the number of samples and $d$ the number of features; the label matrix $\boldsymbol T$ has dimension $n \times t$. If the hidden layer has $m$ nodes, then the weight matrix $\boldsymbol w$ has dimension $d \times m$ and the bias vector $\boldsymbol b$ has dimension $1 \times m$.

Denote the hidden-layer node output as $\boldsymbol H = g(\boldsymbol X \boldsymbol w + \boldsymbol b)$, where $g(\cdot)$ is the activation function. Readers who check the algebra carefully may notice that these dimensions cannot actually be added: it is a matrix plus a vector. True! But NumPy allows it: its broadcasting mechanism adds a vector to every row of a matrix with a matching number of columns. This is a small shortcut on my part, and it also keeps the derivation easy to follow.
(Figure: hidden-layer output matrix)
From this, the output matrix is $\boldsymbol T = \boldsymbol H \boldsymbol\beta$.
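A tiny demonstration of this broadcasting trick (the shapes here are illustrative only):

import numpy as np

n, d, m = 5, 3, 4                            # samples, features, hidden nodes
X = np.random.rand(n, d)
w = np.random.uniform(-1.0, 1.0, (d, m))
b = np.random.uniform(-0.4, 0.4, (1, m))     # a (1, m) row vector
H = 1.0 / (1 + np.exp(-(X @ w + b)))         # b is broadcast across all n rows
print(H.shape)                               # (5, 4)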

Code Implementation

The implementation differs from the derivation above in one small way: the hidden-layer bias $\boldsymbol b$ is a vector in the derivation, but in the code, for convenience, it is generated as a $1 \times m$ two-dimensional array and broadcast across the $n$ samples.
To use the code, save it as a file named elm.py.

# -*- coding:utf-8 -*-
# keep learning, build yourself.
# author: Mobius
# 2023/4/14 9:00
import numpy as np
from numpy.linalg import pinv
from sklearn.preprocessing import OneHotEncoder

class ELM:
    def __init__(self, hiddenNodeNum, activationFunc="sigmoid", type_="CLASSIFIER"):
        # output weight matrix beta
        self.beta = None
        # bias matrix
        self.b = None
        # input weight matrix
        self.W = None
        # number of hidden-layer nodes
        self.hiddenNodeNum = hiddenNodeNum
        # activation function
        self.activationFunc = self.chooseActivationFunc(activationFunc)
        # ELM type: CLASSIFIER -> classification, REGRESSION -> regression
        self.type_ = type_

    def fit(self, X, T):
        if self.type_ == "REGRESSION":
            try:
                if T.shape[1] > 1:
                    raise ValueError("The output dimension for regression must be 1")
            except IndexError:
                # If T is a 1-D array, convert it to a column vector
                T = np.array(T).reshape(-1, 1)
        if self.type_ == "CLASSIFIER":
            # One-hot encoder
            encoder = OneHotEncoder()
            # Convert the labels T to one-hot form
            T = encoder.fit_transform(np.array(T).reshape(-1, 1)).toarray()
        # n: number of samples, d: number of input features
        n, d = X.shape
        # Input weight matrix, d x hiddenNodeNum
        self.W = np.random.uniform(-1.0, 1.0, size=(d, self.hiddenNodeNum))
        # Bias matrix, 1 x hiddenNodeNum, broadcast over the n samples
        self.b = np.random.uniform(-0.4, 0.4, size=(1, self.hiddenNodeNum))
        # Hidden-layer output matrix, n x hiddenNodeNum
        H = self.activationFunc(np.dot(X, self.W) + self.b)
        # Output weights, hiddenNodeNum x t; beta = ((H.T @ H)^-1) @ H.T @ T
        self.beta = np.dot(np.dot(pinv(np.dot(H.T, H)), H.T), T)

    def chooseActivationFunc(self, activationFunc):
        """Select the activation function; returns the function itself."""
        if activationFunc == "sigmoid":
            return self._sigmoid
        elif activationFunc == "sin":
            return self._sine
        elif activationFunc == "cos":
            return self._cos
        else:
            raise ValueError("Unknown activation function: %s" % activationFunc)

    def predict(self, x):
        h = self.activationFunc(np.dot(x, self.W) + self.b)
        res = np.dot(h, self.beta)
        if self.type_ == "REGRESSION":  # regression prediction
            return res
        elif self.type_ == "CLASSIFIER":  # classification prediction
            # Return the index of the row-wise maximum; with one-hot targets
            # (categories sorted by OneHotEncoder) this index equals the class label
            return np.argmax(res, axis=1)

    @staticmethod
    def score(y_true, y_pred):
        # Accuracy: the fraction of predictions equal to the true labels
        if len(y_pred) != len(y_true):
            raise ValueError("Lengths of y_true and y_pred do not match")
        totalNum = len(y_pred)
        rightNum = np.sum([1 if p == t else 0 for p, t in zip(y_pred, y_true)])
        return rightNum / totalNum

    @staticmethod
    def RMSE(y_pred, y_true):
        # Root Mean Square Error
        # Formula reference: https://blog.csdn.net/yql_617540298/article/details/104212354
        try:
            if y_pred.shape[1] == 1:
                # Flatten an (n, 1) column vector to shape (n,)
                y_pred = y_pred.reshape(-1)
        except IndexError:
            pass

        return np.sqrt(np.sum(np.square(y_pred - y_true)) / len(y_pred))

    @staticmethod
    def _sigmoid(x):
        return 1.0 / (1 + np.exp(-x))

    @staticmethod
    def _sine(x):
        return np.sin(x)

    @staticmethod
    def _cos(x):
        return np.cos(x)

Test Cases

The hand-written ELM is tested below, on a regression task and a classification task.

Regression

# Regression test using the Boston housing dataset
import numpy as np
from sklearn.model_selection import train_test_split
import pandas as pd
from elm import ELM
from sklearn.preprocessing import StandardScaler


data = pd.read_csv("./housing.csv", delim_whitespace=True, header=None)
# Standardize the features
scaler = StandardScaler()
X = data.iloc[:, :-1]
X = scaler.fit_transform(X)
# Leave the target values untouched
y = data.iloc[:, -1]
# Randomly split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

elm = ELM(hiddenNodeNum=20, activationFunc="sigmoid", type_="REGRESSION")
elm.fit(X_train, y_train)
y_pred = elm.predict(X_test)
rmse = elm.RMSE(y_pred, y_test)
print("平均RMSE为:", rmse)
平均RMSE为: 5.704739831187914

Classification

import numpy as np
from sklearn.model_selection import train_test_split
from elm import ELM
from sklearn import datasets

# Classification test using the iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target
# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

elm = ELM(hiddenNodeNum=20, activationFunc="sigmoid", type_="CLASSIFIER")
elm.fit(X_train, y_train)
y_pred = elm.predict(X_test)
print("训练得分:", elm.score(y_test, y_pred))

Output

Test accuracy: 0.9210526315789473

Since the weight and bias matrices are randomly generated on every run, the accuracy varies from run to run.
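For a more stable estimate, one option (a sketch built on the classification test above, not part of the original code) is to average the accuracy over several random initializations:

# Hypothetical averaging loop; reuses X_train, X_test, y_train, y_test from above
scores = []
for _ in range(10):
    elm = ELM(hiddenNodeNum=20, activationFunc="sigmoid", type_="CLASSIFIER")
    elm.fit(X_train, y_train)
    scores.append(elm.score(y_test, elm.predict(X_test)))
print("Mean accuracy over 10 runs:", np.mean(scores))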

The Boston housing dataset used above was downloaded from the Kaggle website.

Appendix
Professor Huang's original Matlab code:

REGRESSION=0;
CLASSIFIER=1;

%%%%%%%%%%% Load training dataset
train_data=load(TrainingData_File);
T=train_data(:,1)';
P=train_data(:,2:size(train_data,2))';
clear train_data;                                   %   Release raw training data array

%%%%%%%%%%% Load testing dataset
test_data=load(TestingData_File);
TV.T=test_data(:,1)';
TV.P=test_data(:,2:size(test_data,2))';
clear test_data;                                    %   Release raw testing data array

NumberofTrainingData=size(P,2);
NumberofTestingData=size(TV.P,2);
NumberofInputNeurons=size(P,1);

if Elm_Type~=REGRESSION
    %%%%%%%%%%%% Preprocessing the data of classification
    sorted_target=sort(cat(2,T,TV.T),2);
    label=zeros(1,1);                               %   Find and save in 'label' class label from training and testing data sets
    label(1,1)=sorted_target(1,1);
    j=1;
    for i = 2:(NumberofTrainingData+NumberofTestingData)
        if sorted_target(1,i) ~= label(1,j)
            j=j+1;
            label(1,j) = sorted_target(1,i);
        end
    end
    number_class=j;
    NumberofOutputNeurons=number_class;
       
    %%%%%%%%%% Processing the targets of training
    temp_T=zeros(NumberofOutputNeurons, NumberofTrainingData);
    for i = 1:NumberofTrainingData
        for j = 1:number_class
            if label(1,j) == T(1,i)
                break; 
            end
        end
        temp_T(j,i)=1;
    end
    T=temp_T*2-1;

    %%%%%%%%%% Processing the targets of testing
    temp_TV_T=zeros(NumberofOutputNeurons, NumberofTestingData);
    for i = 1:NumberofTestingData
        for j = 1:number_class
            if label(1,j) == TV.T(1,i)
                break; 
            end
        end
        temp_TV_T(j,i)=1;
    end
    TV.T=temp_TV_T*2-1;

end                                                

start_time_train=cputime;

%%%%%%%%%%% Random generate input weights InputWeight (w_i) and biases BiasofHiddenNeurons (b_i) of hidden neurons
InputWeight=rand(NumberofHiddenNeurons,NumberofInputNeurons)*2-1;
BiasofHiddenNeurons=rand(NumberofHiddenNeurons,1);
tempH=InputWeight*P;
clear P;                                            %   Release input of training data 
ind=ones(1,NumberofTrainingData);
BiasMatrix=BiasofHiddenNeurons(:,ind);              %   Extend the bias matrix BiasofHiddenNeurons to match the dimension of H
tempH=tempH+BiasMatrix;

%%%%%%%%%%% Calculate hidden neuron output matrix H
switch lower(ActivationFunction)
    case {'sig','sigmoid'}
        %%%%%%%% Sigmoid 
        H = 1 ./ (1 + exp(-tempH));
    case {'sin','sine'}
        %%%%%%%% Sine
        H = sin(tempH);    
    case {'hardlim'}
        %%%%%%%% Hard Limit
        H = double(hardlim(tempH));
    case {'tribas'}
        %%%%%%%% Triangular basis function
        H = tribas(tempH);
    case {'radbas'}
        %%%%%%%% Radial basis function
        H = radbas(tempH);
        %%%%%%%% More activation functions can be added here                
end
clear tempH;                                        %   Release the temporary array for calculation of hidden neuron output matrix H

%%%%%%%%%%% Calculate output weights OutputWeight (beta_i)
OutputWeight=pinv(H') * T';                        % implementation without regularization factor //refer to 2006 Neurocomputing paper

end_time_train=cputime;
TrainingTime=end_time_train-start_time_train;        %   Calculate CPU time (seconds) spent for training ELM

%%%%%%%%%%% Calculate the training accuracy
Y=(H' * OutputWeight)';                             %   Y: the actual output of the training data
if Elm_Type == REGRESSION
    TrainingAccuracy=sqrt(mse(T - Y))              %   Calculate training accuracy (RMSE) for regression case
end
clear H;

%%%%%%%%%%% Calculate the output of testing input
start_time_test=cputime;
tempH_test=InputWeight*TV.P;
clear TV.P;             %   Release input of testing data             
ind=ones(1,NumberofTestingData);
BiasMatrix=BiasofHiddenNeurons(:,ind);              
tempH_test=tempH_test + BiasMatrix;
switch lower(ActivationFunction)
    case {'sig','sigmoid'}
        %%%%%%%% Sigmoid 
        H_test = 1 ./ (1 + exp(-tempH_test));
    case {'sin','sine'}
        %%%%%%%% Sine
        H_test = sin(tempH_test);        
    case {'hardlim'}
        %%%%%%%% Hard Limit
        H_test = hardlim(tempH_test);        
    case {'tribas'}
        %%%%%%%% Triangular basis function
        H_test = tribas(tempH_test);        
    case {'radbas'}
        %%%%%%%% Radial basis function
        H_test = radbas(tempH_test);        
        %%%%%%%% More activation functions can be added here        
end
TY=(H_test' * OutputWeight)';                       %   TY: the actual output of the testing data
end_time_test=cputime;
TestingTime=end_time_test-start_time_test;           

if Elm_Type == REGRESSION
    TestingAccuracy=sqrt(mse(TV.T - TY))           %   Calculate testing accuracy (RMSE) for regression case
end

if Elm_Type == CLASSIFIER
%%%%%%%%%% Calculate training & testing classification accuracy
    MissClassificationRate_Training=0;
    MissClassificationRate_Testing=0;

    for i = 1 : size(T, 2)
        [x, label_index_expected]=max(T(:,i));
        [x, label_index_actual]=max(Y(:,i));
        if label_index_actual~=label_index_expected
            MissClassificationRate_Training=MissClassificationRate_Training+1;
        end
    end
    TrainingAccuracy=1-MissClassificationRate_Training/size(T,2);
    for i = 1 : size(TV.T, 2)
        [x, label_index_expected]=max(TV.T(:,i));
        [x, label_index_actual]=max(TY(:,i));
        if label_index_actual~=label_index_expected
            MissClassificationRate_Testing=MissClassificationRate_Testing+1;
        end
    end
    TestingAccuracy=1-MissClassificationRate_Testing/size(TV.T,2);  
end