Machine Learning: Analyzing the Extreme Learning Machine (ELM) MATLAB Code

Hello everyone, I'm Xiaopeng. Today I downloaded the basic ELM code, together with a training set and a test set, from https://www.ntu.edu.sg/home/egbhuang/elm_random_hidden_nodes.html and read through it. The code turns out to be fairly simple: with a bit of time and patience, and by running it in MATLAB, anyone should be able to follow it.

Long story short, below is the code with my annotations added. I hope they are of some small help in understanding the ELM MATLAB code. If any of my explanations are wrong, corrections are welcome.

1. main.m for classification (suitable for binary and multi-class problems)

clear all;
close all;
clc;

[TrainingTime, TestingTime, TrainingAccuracy, TestingAccuracy] = ELM('diabetes_train', 'diabetes_test', 1, 20, 'sig')

2. main.m for regression

clear all;
close all;
clc;

[TrainingTime, TestingTime, TrainingAccuracy, TestingAccuracy] = ELM('sinc_train', 'sinc_test', 0, 20, 'sig')
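Both demo scripts assume the data files are plain-text matrices whose first column is the target and whose remaining columns are the attributes (that is exactly how ELM.m loads them below). If you don't have the downloaded sinc files, a short sketch like the following (my own addition, writing files with the names used in the call above) can generate equivalent ones:

x = linspace(-10, 10, 1000)';           % 1000 one-dimensional inputs
y = sin(x) ./ x; y(x == 0) = 1;         % sinc targets, defining sinc(0) = 1
data = [y, x];                          % first column = target, second column = attribute
idx = randperm(1000);                   % random train/test split
train = data(idx(1:800), :);
test = data(idx(801:end), :);
save('sinc_train', 'train', '-ascii');  % plain-text files readable by load()
save('sinc_test', 'test', '-ascii');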

3. The core code: ELM.m

function [TrainingTime, TestingTime, TrainingAccuracy, TestingAccuracy] = ELM(TrainingData_File, TestingData_File, Elm_Type, NumberofHiddenNeurons, ActivationFunction)

% Usage: elm(TrainingData_File, TestingData_File, Elm_Type, NumberofHiddenNeurons, ActivationFunction)
% OR: [TrainingTime, TestingTime, TrainingAccuracy, TestingAccuracy] = elm(TrainingData_File, TestingData_File, Elm_Type, NumberofHiddenNeurons, ActivationFunction)
%
% Input:
% TrainingData_File     - Filename of training data set
% TestingData_File      - Filename of testing data set
% Elm_Type              - 0 for regression; 1 for (both binary and multi-class) classification
% NumberofHiddenNeurons - Number of hidden neurons assigned to the ELM
% ActivationFunction    - Type of activation function:
%                         'sig' for Sigmoidal function
%                         'sin' for Sine function
%                         'hardlim' for Hardlim function
%                         'tribas' for Triangular basis function
%                         'radbas' for Radial basis function (for additive type of SLFNs instead of RBF type of SLFNs)
%
% Output:
% TrainingTime          - Time (seconds) spent on training ELM
% TestingTime           - Time (seconds) spent on predicting ALL testing data
% TrainingAccuracy      - Training accuracy:
%                         RMSE for regression or correct classification rate for classification
% TestingAccuracy       - Testing accuracy:
%                         RMSE for regression or correct classification rate for classification
%
% MULTI-CLASS CLASSIFICATION: NUMBER OF OUTPUT NEURONS WILL BE AUTOMATICALLY SET EQUAL TO NUMBER OF CLASSES
% FOR EXAMPLE, if there are 7 classes in all, there will be 7 output
% neurons; neuron 5 having the highest output means the input belongs to the 5th class
%
% Sample1 regression: [TrainingTime, TestingTime, TrainingAccuracy, TestingAccuracy] = elm('sinc_train', 'sinc_test', 0, 20, 'sig')
% Sample2 classification: elm('diabetes_train', 'diabetes_test', 1, 20, 'sig')
%
%%%% Authors: MR QIN-YU ZHU AND DR GUANG-BIN HUANG
%%%% NANYANG TECHNOLOGICAL UNIVERSITY, SINGAPORE
%%%% EMAIL: EGBHUANG@NTU.EDU.SG; GBHUANG@IEEE.ORG
%%%% WEBSITE: http://www.ntu.edu.sg/eee/icis/cv/egbhuang.htm
%%%% DATE: APRIL 2004

%%%%%%%%%%% Macro definitions
REGRESSION=0;
CLASSIFIER=1;

%%%%%%%%%%% Load training dataset
train_data=load(TrainingData_File);
T=train_data(:,1)'; % First column holds the expected output (target) for both regression and classification; transposed, so its size is 1 x number-of-samples
P=train_data(:,2:size(train_data,2))'; % Remaining columns hold the attributes; also transposed
clear train_data; % Release raw training data array

%%%%%%%%%%% Load testing dataset
test_data=load(TestingData_File);
TV.T=test_data(:,1)';
TV.P=test_data(:,2:size(test_data,2))';
clear test_data; % Release raw testing data array

NumberofTrainingData=size(P,2); % Number of training samples
NumberofTestingData=size(TV.P,2); % Number of testing samples
NumberofInputNeurons=size(P,1); % Number of input neurons, i.e. the number of attributes

% If this is not a regression problem, i.e. it is classification
if Elm_Type~=REGRESSION
    %%%%%%%%%%%% Preprocessing the data of classification
    sorted_target=sort(cat(2,T,TV.T),2); % Concatenate the training and testing labels into one row vector, sorted in ascending order
    label=zeros(1,1); % Find and save in 'label' the class labels from training and testing data sets
    label(1,1)=sorted_target(1,1);
    j=1;
    for i = 2:(NumberofTrainingData+NumberofTestingData) % Collect the distinct labels: the 1st class goes into (1,1), the 2nd into (1,2), and so on; sorted_target is already in ascending order
        if sorted_target(1,i) ~= label(1,j)
            j=j+1;
            label(1,j) = sorted_target(1,i);
        end
    end
    number_class=j; % Number of classes
    NumberofOutputNeurons=number_class; % The number of classes becomes the number of output neurons
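    % Aside (my note, not in the original code): since sorted_target is sorted,
    % the loop above simply collects the distinct labels; in modern MATLAB the
    % same result comes from label = unique([T, TV.T]); number_class = numel(label);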

    %%%%%%%%%% Processing the targets of training
    temp_T=zeros(NumberofOutputNeurons, NumberofTrainingData); % Each column temporarily stores the output-neuron coding of one training sample
    for i = 1:NumberofTrainingData % Encode each training label into temp_T; e.g. with 5 classes in total, a first sample belonging to class 2 gives the column [0;1;0;0;0]
        for j = 1:number_class
            if label(1,j) == T(1,i)
                break;
            end
        end
        temp_T(j,i)=1;
    end
    T=temp_T*2-1; % Rescale every entry of temp_T, so e.g. for binary classification the values become -1 or 1; T is now number-of-classes x number-of-training-samples

    %%%%%%%%%% Processing the targets of testing, same method as for the training labels
    temp_TV_T=zeros(NumberofOutputNeurons, NumberofTestingData);
    for i = 1:NumberofTestingData
        for j = 1:number_class
            if label(1,j) == TV.T(1,i)
                break;
            end
        end
        temp_TV_T(j,i)=1;
    end
    TV.T=temp_TV_T*2-1;
end % end if of Elm_Type
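% Example of the encoding above (my illustration, not in the original code):
% with labels {1,2,3}, a sample of class 2 gets the column [0;1;0]*2-1 = [-1;1;-1],
% i.e. each column of T is a +/-1 one-hot code of that sample's class.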

%%%%%%%%%%% Calculate weights & biases
start_time_train=cputime; % Record the moment training starts

%%%%%%%%%%% Randomly generate input weights InputWeight (w_i) and biases BiasofHiddenNeurons (b_i) of hidden neurons
InputWeight=rand(NumberofHiddenNeurons,NumberofInputNeurons)*2-1; % The input weights form a hidden-neurons x input-neurons matrix; InputWeight(l,n) is the weight between input n and hidden neuron l
% NumberofHiddenNeurons is set by the caller; NumberofInputNeurons (the number of attributes) was computed above
BiasofHiddenNeurons=rand(NumberofHiddenNeurons,1); % A column vector with as many rows as hidden neurons
tempH=InputWeight*P; % tempH is a hidden-neurons x training-samples matrix
clear P; % Release input of training data
ind=ones(1,NumberofTrainingData); % Row vector of ones
BiasMatrix=BiasofHiddenNeurons(:,ind); % Extend the bias vector BiasofHiddenNeurons to match the dimension of H:
% from hidden-neurons x 1 to hidden-neurons x training-samples (same size as tempH), replicating the single column
tempH=tempH+BiasMatrix; % Applying an activation function to tempH yields the hidden-layer output
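% Aside (my note, assuming MATLAB R2016b or newer): implicit expansion makes
% the BiasMatrix construction unnecessary; tempH = tempH + BiasofHiddenNeurons;
% would broadcast the bias column across all training samples by itself.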

%%%%%%%%%%% Calculate the hidden-neuron output matrix H
switch lower(ActivationFunction) % ActivationFunction is chosen by the caller
    case {'sig','sigmoid'}
        %%%%%%%% Sigmoid
        H = 1 ./ (1 + exp(-tempH)); % H is the hidden-layer output, a hidden-neurons x training-samples matrix
    case {'sin','sine'}
        %%%%%%%% Sine
        H = sin(tempH);
    case {'hardlim'}
        %%%%%%%% Hard Limit
        H = double(hardlim(tempH));
    case {'tribas'}
        %%%%%%%% Triangular basis function
        H = tribas(tempH);
    case {'radbas'}
        %%%%%%%% Radial basis function
        H = radbas(tempH);
        %%%%%%%% More activation functions can be added here
end
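% Summary (my note): whichever activation g is chosen, H = g(InputWeight*P + bias)
% is an L x N matrix with L = NumberofHiddenNeurons and N = NumberofTrainingData;
% column i is the hidden-layer response to training sample i.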

clear tempH; % Release the temporary array used to compute the hidden-neuron output matrix H

%%%%%%%%%%% Calculate output weights OutputWeight (beta_i)
%%%%%%%%%%% The output-weight matrix beta has size hidden-neurons x number-of-classes; beta(l,m) is the weight between hidden neuron l and output neuron m
OutputWeight=pinv(H') * T'; % implementation without regularization factor //refer to 2006 Neurocomputing paper
% H is a hidden-neurons x training-samples matrix; pinv(H') is the generalized (Moore-Penrose) inverse of H', of size hidden-neurons x training-samples
% The label matrix T became number-of-classes x training-samples during the label preprocessing above, so OutputWeight (the beta in my notes) has size hidden-neurons x number-of-classes
%OutputWeight=inv(eye(size(H,1))/C+H * H') * H * T'; % faster method 1 //refer to 2012 IEEE TSMC-B paper
%implementation; one can set the regularization factor C properly in classification applications
%OutputWeight=(eye(size(H,1))/C+H * H') \ H * T'; % faster method 2 //refer to 2012 IEEE TSMC-B paper
%implementation; one can set the regularization factor C properly in classification applications
%If you use faster methods or kernel method, PLEASE CITE in your paper properly:
%Guang-Bin Huang, Hongming Zhou, Xiaojian Ding, and Rui Zhang, "Extreme Learning Machine for Regression and Multi-Class Classification," submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence, October 2010.
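% Math behind OutputWeight (my summary): with H' of size N x L and T' of size
% N x m, beta = pinv(H')*T' is the minimum-norm least-squares solution of
% H'*beta = T', i.e. the beta minimizing ||H'*beta - T'||_F; this single
% closed-form solve replaces iterative gradient descent in ELM training.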

end_time_train=cputime; % Training finished; record the end time
TrainingTime=end_time_train-start_time_train; % Calculate CPU time (seconds) spent training the ELM

%%%%%%%%%%% Calculate the training accuracy
Y=(H' * OutputWeight)'; % Y: the actual output on the training data. H is hidden-neurons x training-samples;
% OutputWeight is hidden-neurons x number-of-classes, so Y is number-of-classes x training-samples, the same size as the true label matrix
if Elm_Type == REGRESSION
    TrainingAccuracy=sqrt(mse(T - Y)); % Calculate training accuracy (RMSE) for the regression case
end
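% Note (my addition): mse() ships with the Neural Network Toolbox; without that
% toolbox the same RMSE can be computed as sqrt(mean((T(:) - Y(:)).^2)). The
% same remark applies to the testing RMSE further below.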

clear H;

%%%%%%%%%%% Calculate the output for the testing input
start_time_test=cputime; % Record the moment testing starts
tempH_test=InputWeight*TV.P;
clear TV.P; % Release input of testing data (note: clear cannot remove a struct field, so TV.P is not actually freed here)
ind=ones(1,NumberofTestingData);
BiasMatrix=BiasofHiddenNeurons(:,ind); % Extend the bias vector BiasofHiddenNeurons to match the dimension of H
tempH_test=tempH_test + BiasMatrix;
switch lower(ActivationFunction)
    case {'sig','sigmoid'}
        %%%%%%%% Sigmoid
        H_test = 1 ./ (1 + exp(-tempH_test));
    case {'sin','sine'}
        %%%%%%%% Sine
        H_test = sin(tempH_test);
    case {'hardlim'}
        %%%%%%%% Hard Limit
        H_test = double(hardlim(tempH_test)); % double() added here for consistency with the training branch
    case {'tribas'}
        %%%%%%%% Triangular basis function
        H_test = tribas(tempH_test);
    case {'radbas'}
        %%%%%%%% Radial basis function
        H_test = radbas(tempH_test);
        %%%%%%%% More activation functions can be added here
end
TY=(H_test' * OutputWeight)'; % TY: the actual output on the testing data; OutputWeight was trained on the training set
end_time_test=cputime; % Record the moment testing ends
TestingTime=end_time_test-start_time_test; % Calculate CPU time (seconds) spent by the ELM predicting the whole testing set

if Elm_Type == REGRESSION
    TestingAccuracy=sqrt(mse(TV.T - TY)); % Calculate testing accuracy (RMSE) for the regression case
end

if Elm_Type == CLASSIFIER % Classification problem
    %%%%%%%%%% Calculate training & testing classification accuracy
    MissClassificationRate_Training=0; % Number of misclassified training samples
    MissClassificationRate_Testing=0; % Number of misclassified testing samples

    % Compute TrainingAccuracy
    for i = 1 : size(T, 2) % T became number-of-classes x training-samples during the label preprocessing above, so this loops over every training sample
        [x, label_index_expected]=max(T(:,i)); % T is the true label matrix; x holds the maximum, label_index_expected its position (the sample's class); e.g. with 7 classes there are 7 output neurons, and neuron 5 firing highest means class 5
        [x, label_index_actual]=max(Y(:,i)); % Y holds the predictions for the training set; TY holds those for the testing set
        if label_index_actual~=label_index_expected
            MissClassificationRate_Training=MissClassificationRate_Training+1; % Count one more misclassification
        end
    end
    TrainingAccuracy=1-(MissClassificationRate_Training/size(T,2)); % Training accuracy for classification; size(T,2) is the number of training samples

    % Compute TestingAccuracy
    for i = 1 : size(TV.T, 2)
        [x, label_index_expected]=max(TV.T(:,i));
        [x, label_index_actual]=max(TY(:,i));
        if label_index_actual~=label_index_expected
            MissClassificationRate_Testing=MissClassificationRate_Testing+1;
        end
    end
    TestingAccuracy=1-MissClassificationRate_Testing/size(TV.T,2);
end
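Finally, to see the whole pipeline in one place, here is a minimal self-contained sketch of my own (a condensation, not the downloaded code) that trains an ELM regressor on a toy sinc problem with no data files; it relies on implicit expansion, so it assumes MATLAB R2016b or newer:

rng(1); % reproducible random weights
x = linspace(-10, 10, 400); % 1 x N inputs
t = sin(x) ./ x; t(x == 0) = 1; % 1 x N sinc targets
L = 20; % number of hidden neurons

W = rand(L, 1) * 2 - 1; % random input weights in [-1, 1]
b = rand(L, 1); % random biases
H = 1 ./ (1 + exp(-(W * x + b))); % L x N sigmoid hidden outputs (bias broadcast by implicit expansion)
beta = pinv(H') * t'; % L x 1 output weights: one least-squares solve
y = (H' * beta)'; % predictions on the training inputs
fprintf('training RMSE: %.4f\n', sqrt(mean((t - y).^2)));

Note that "training" here is just the random draw plus the single pinv solve; no iterative weight updates are involved, which is the whole point of ELM.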
