Study Notes on the Companion Simulation Code for 《视觉机器学习20讲》 (20 Lectures on Visual Machine Learning) --- AdaBoost (Part 2)

tree_node_w.m


%   This file is part of GML Matlab Toolbox
%   For conditions of distribution and use, see the accompanying License.txt file.
%
%   tree_node_w Implements the constructor for the tree_node_w class, which
%   implements a classification tree
%~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
%
%    tree_node = tree_node_w(max_split)
%    ---------------------------------------------------------------------------------
%    Arguments:
%           max_split - maximum number of splits in the tree
%    Return:
%           tree_node - object of tree_node_w class



function tree_node = tree_node_w(max_split)


tree_node.left_constrain  = [];

In MATLAB, this line (together with the right_constrain line below) initializes the left and right constraints of the tree node to empty matrices. An assignment such as a = [] makes a an empty matrix that contains no elements at all.
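For example:

a = [];          % a is now an empty 0-by-0 matrix
isempty(a)       % returns logical 1
size(a)          % returns [0 0]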


tree_node.right_constrain = [];
tree_node.dim             = [];
tree_node.max_split       = max_split;  % store the value of max_split that was passed in as the argument
tree_node.parent         = [];


tree_node = class(tree_node, 'tree_node_w');   This statement creates an object of the class named 'tree_node_w' from the structure tree_node; the resulting object has the same fields as the structure.

%CLASS  Return class name of object.
%   S = CLASS(OBJ) returns the name of the class of object OBJ.

%   Possibilities are:
%     double          -- Double precision floating point number array
%                        (this is the traditional MATLAB matrix or array)
%     single          -- Single precision floating point number array
%     logical         -- Logical array
%     char            -- Character array
%     cell            -- Cell array
%     struct          -- Structure array
%     function_handle -- Function Handle
%     int8            -- 8-bit signed integer array
%     uint8           -- 8-bit unsigned integer array
%     int16           -- 16-bit signed integer array
%     uint16          -- 16-bit unsigned integer array
%     int32           -- 32-bit signed integer array
%     uint32          -- 32-bit unsigned integer array
%     int64           -- 64-bit signed integer array
%     uint64          -- 64-bit unsigned integer array
%     <class_name>    -- MATLAB class name for MATLAB objects
%     <java_class>    -- Java class name for java objects
%
%   %Example 1: Obtain the name of the class of value PI
%   name = class(PI);
%
%   %Example 2: Obtain the full name of a package-based java class
%   import java.lang.*;
%   obj = String('mystring');
%   class(obj)
%
%   For classes created without a CLASSDEF statement (pre-MATLAB version
%   7.6 syntax), CLASS invoked within a constructor method creates an
%   object of type 'class_name'.  Constructor methods are functions saved
%   in a file named <class_name>.m and placed in a directory named
%   @<class_name>.  Note that 'class_name' must be the second argument to
%   CLASS.  Uses of CLASS for this purpose are shown below.
%
%   O = CLASS(S,'class_name') creates an object of class 'class_name'
%   from the structure S.
%
%   O = CLASS(S,'class_name',PARENT1,PARENT2,...) also inherits the
%   methods and fields of the parent objects PARENT1, PARENT2, ...
%
%   O = CLASS(struct([]),'class_name',PARENT1,PARENT2,...), specifying
%   an empty structure S, creates an object that inherits the methods and
%   fields from one or more parent classes, but does not have any 
%   additional fields beyond those inherited from the parents.
%
%   See also ISA, SUPERIORTO, INFERIORTO, CLASSDEF, STRUCT.


%   Copyright 1984-2008 The MathWorks, Inc. 
%   $Revision: 1.17.4.2 $  $Date: 2008/03/24 18:08:32 $
%   Built-in function.
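As a minimal sketch of this pre-classdef constructor pattern (the class name point_w and its fields are made up for illustration and are not part of the toolbox), the constructor would live in a file @point_w/point_w.m:

function p = point_w(x, y)
% point_w  Old-style constructor: fill an ordinary structure, then convert it into an object.
p.x = x;
p.y = y;
p = class(p, 'point_w');   % 'point_w' must match the @point_w directory name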






train.m    http://blog.csdn.net/windtalkersm/article/details/8749454

%   This file is part of GML Matlab Toolbox
%   For conditions of distribution and use, see the accompanying License.txt file.
%
%   train Implements training of a classification tree
%~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
%
%    nodes = train(node, dataset, labels, weights)
%    ---------------------------------------------------------------------------------
%    Arguments:
%           node      - object of tree_node_w class (initialized properly)
%           dataset   - training data
%           labels    - training labels
%           weights   - weights of training data
%    Return:
%           nodes     - tree is represented as a cell array of its nodes







function nodes = train(node, dataset, labels, weights)   % arguments: node, training data, training labels, sample weights


max_split = node.max_split;    % maximum number of splits allowed for this tree


[left right spit_error] = do_learn_nu(node, dataset, labels, weights);    % do_learn_nu performs the actual split; it is described below


nodes = {left, right};


left_pos  = sum((calc_output(left , dataset) == labels) .* weights);     % calc_output classifies the data with a tree node; it is described below
left_neg  = sum((calc_output(left , dataset) == -labels) .* weights);
right_pos = sum((calc_output(right, dataset) == labels) .* weights);
right_neg = sum((calc_output(right, dataset) == -labels) .* weights);
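% left_pos / left_neg: total weight of the positive / negative training samples
% that fall inside the left child's region (where calc_output is 1); right_pos
% and right_neg are the analogous quantities for the right child.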


errors = [min(left_pos, left_neg), min(right_pos, right_neg)];


if(right_pos == 0 && right_neg == 0)
  return;
end


if(left_pos == 0 && left_neg == 0)
  return;
end


[errors, IDX] = sort(errors);


sort(A): if A is a vector (row or column), it is sorted in ascending order by default; sort(A, 'descend') sorts in descending order.
If A is a matrix, sort(A) sorts each column in ascending order by default.
sort(A, dim):
dim = 1 is equivalent to sort(A);
dim = 2 sorts the elements within each row of A in ascending order.
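For example:

A = [3 1 2];
sort(A)              % ascending:  [1 2 3]
sort(A, 'descend')   % descending: [3 2 1]
B = [3 1; 2 4];
sort(B, 2)           % sort within each row: [1 3; 2 4]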


errors = flipdim(errors,2);
IDX    = flipdim(IDX,2);
nodes  = nodes(IDX);

%FLIPDIM Flip matrix along specified dimension.
%   FLIPDIM(X,DIM) returns X with dimension DIM flipped.  
%   For example, FLIPDIM(X,1) where
%
%       X = 1 4  produces  3 6
%           2 5            2 5
%           3 6            1 4
%
%
%   Class support for input X:
%      float: double, single
%
%   See also FLIPLR, FLIPUD, ROT90, PERMUTE.


%   Copyright 1984-2004 The MathWorks, Inc.
%   $Revision: 1.17.4.3 $  $Date: 2010/08/23 23:07:58 $
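Here flipdim simply reverses the ascending output of sort, so that errors and IDX end up in descending order. For example:

flipdim([1 2 3], 2)   % returns [3 2 1]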


% Argument parsing




splits = [];
split_errors = [];
deltas = [];




for i = 2 : max_split
    for j = 1 : length(errors)
        
        if(length(deltas) >= j)
            continue;
        end
        
        max_node = nodes{j};
        max_node_out = calc_output(max_node, dataset);
       
        mask = find(max_node_out == 1);  

find(A) returns the indices of the nonzero elements of A.
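For example:

find([0 3 0 5])   % returns [2 4]

Here mask therefore holds the indices of the samples that max_node classifies as 1, i.e. the samples lying inside that node's region; only those samples are passed on to do_learn_nu for further splitting.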

       
        [left right spit_error] = do_learn_nu(node, dataset(:,mask), labels(mask), weights(mask), max_node);
              
        
        left_pos  = sum((calc_output(left , dataset) == labels) .* weights);
        left_neg  = sum((calc_output(left , dataset) == -labels) .* weights);
        right_pos = sum((calc_output(right, dataset) == labels) .* weights);
        right_neg = sum((calc_output(right, dataset) == -labels) .* weights);
        
        splits{end+1} = left;
        splits{end+1} = right;  
        
        if( (right_pos + right_neg) == 0 || (left_pos + left_neg) == 0)
          deltas(end+1) = 0;
        else
          deltas(end+1) = errors(j) - spit_error;
        end
        
        split_errors(end+1) = min(left_pos, left_neg);
        split_errors(end+1) = min(right_pos, right_neg);
    end  
    
    if(max(deltas) == 0)
        return;
    end
    best_split = find(deltas == max(deltas));
    best_split = best_split(1);
    
    cut_vec = [1 : (best_split-1)  (best_split + 1) : length(errors)];
    nodes   = nodes(cut_vec);
    errors  = errors(cut_vec);
    deltas  = deltas(cut_vec);
    
    nodes{end+1} = splits{2 * best_split - 1};
    nodes{end+1} = splits{2 * best_split};
    
    errors(end+1) = split_errors(2 * best_split - 1);
    errors(end+1) = split_errors(2 * best_split);
    
    cut_vec = [1 : 2 * (best_split-1)  2 * (best_split)+1 : length(split_errors)];
    split_errors = split_errors(cut_vec);    
    splits       = splits(cut_vec);


end
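Putting tree_node_w and train together, a typical call looks roughly like the sketch below (the variable names are illustrative; dataset is assumed to be a D-by-N matrix of training samples, labels a 1-by-N vector of +/-1 labels and weights a 1-by-N vector of sample weights, as in the header comment):

weak_learner = tree_node_w(3);                            % a classification tree with at most 3 splits
nodes = train(weak_learner, dataset, labels, weights);    % the trained tree as a cell array of its nodes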




get_dim_and_tr.m


%   This file is part of GML Matlab Toolbox
%   For conditions of distribution and use, see the accompanying License.txt file.
%
%   get_dim_and_tr is the function that returns the dimension and threshold of
%   a tree node
%~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
%
%    output = get_dim_and_tr(tree_node, output)
%    ---------------------------------------------------------------------------------
%    Arguments:
%           tree_node - a node of classification tree
%           output    - vector of dimensions and thresholds. The result for the
%                       current node will be concatenated to it
%    Return:
%           output    - a vector of thresholds and dimensions. It has the
%                       following format:
%                       [dimension threshold left/right ...]
%                       left/right is a [-1, +1] number, which signifies whether the
%                       current threshold is a left or a right constraint



function output = get_dim_and_tr(tree_node, output)


if(nargin < 2)
  output = [];
end


if(length(tree_node.parent) > 0)
  output = get_dim_and_tr(tree_node.parent, output);
end


output(end+1) = tree_node.dim;


if( length(tree_node.right_constrain) > 0)
  output(end+1) = tree_node.right_constrain;
  output(end+1) = -1;
elseif( length(tree_node.left_constrain) > 0)
  output(end+1) = tree_node.left_constrain;    % these lines matter: in the splitting tree, -1 marks a right constraint and +1 a left constraint
  output(end+1) = +1;
end
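As a hypothetical illustration of this format: for a node whose parent splits dimension 2 with a right constraint of 0.5, and which itself carries a left constraint of 1.3 on dimension 4, the recursion above returns

output = [2  0.5  -1  4  1.3  +1]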



do_learn_nu.m    http://blog.csdn.net/windtalkersm/article/details/8763206


%   This file is part of GML Matlab Toolbox
%   For conditions of distribution and use, see the accompanying License.txt file.
%
%   do_learn_nu Implements splitting of tree node
%~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
%
%    [tree_node_left, tree_node_right, split_error] =
%    do_learn_nu(tree_node, dataset, labels, weights, papa)
%    ---------------------------------------------------------------------------------
%    Arguments:
%           tree_node - object of tree_node_w class
%           dataset   - training data
%           labels    - training labels
%           weights   - weights of training data
%           papa      - parent node (the one being split)
%    Return:
%           tree_node_left  - left node (result of splitting)
%           tree_node_right - right node (result of splitting)
%           split_error     - error of splitting



function [tree_node_left, tree_node_right, split_error] = do_learn_nu(tree_node, dataset, labels, weights, papa)


tree_node_left = tree_node;
tree_node_right = tree_node;


if(nargin > 4)                               % more than four input arguments were supplied
  tree_node_left.parent  = papa;             % papa is the parent node, i.e. the node being split
  tree_node_right.parent = papa;
end


Distr = weights;


%[trainpat, traintarg] = get_train( dataset);   % this line is not used anywhere in the project and is left commented out
trainpat = dataset;
traintarg = labels;


tr_size = size(trainpat, 2);


T_MIN = zeros(3,size(trainpat,1));
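% T_MIN collects, for every dimension d of the data:
%   T_MIN(1,d) - the smallest weighted classification error achievable by thresholding dimension d
%   T_MIN(2,d) - the index of that best threshold among the distinct sorted values of dimension d
%   T_MIN(3,d) - which of the two labelling directions (Error vs. InvError below) attained the minimum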
d_min = 1;
d_max = size(trainpat,1);


for d = d_min : d_max;


  [DS, IX] = sort(trainpat(d,:));


  TS = traintarg(IX);
  DiS = Distr(IX);
    
  lDS = length(DS);
  
  vPos = 0 * TS;
  vNeg = vPos;
  
  i = 1;
  j = 1;
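  % group the sorted values of dimension d: for every distinct value, accumulate
  % the total weight of the positive (vPos) and negative (vNeg) samples that
  % take exactly that value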
  
  while i <= lDS
    k = 0;
    while i + k <= lDS && DS(i) == DS(i+k)
      if(TS(i+k) > 0)
        vPos(j) = vPos(j) + DiS(i+k);
      else
        vNeg(j) = vNeg(j) + DiS(i+k);
      end
      k = k + 1;
    end
    i = i + k;
    j = j + 1;
  end
  
  vNeg = vNeg(1:j-1);
  vPos = vPos(1:j-1);
  
  Error = zeros(1, j - 1);


  InvError = Error;
  
  IPos = vPos;
  INeg = vNeg;
  
  for i = 2 : length(IPos)
    IPos(i) = IPos(i-1) + vPos(i);
    INeg(i) = INeg(i-1) + vNeg(i);
  end
  
  Ntot = INeg(end);
  Ptot = IPos(end);
  
  for i = 1 : j - 1
    Error(i) = IPos(i) + Ntot - INeg(i);
    InvError(i) = INeg(i) + Ptot - IPos(i);
  end
  
  idx_of_err_min = find(Error == min(Error));
  if(length(idx_of_err_min) < 1)
      idx_of_err_min = 1;  
  end
  
  if(length(idx_of_err_min) <1)
    idx_of_err_min = idx_of_err_min;
  end
  idx_of_err_min = idx_of_err_min(1);
  
  idx_of_inv_err_min = find(InvError == min(InvError));
  
  if(length(idx_of_inv_err_min) < 1)
      idx_of_inv_err_min = 1;  
  end
  
  idx_of_inv_err_min = idx_of_inv_err_min(1);
  
  if(Error(idx_of_err_min) < InvError(idx_of_inv_err_min))
    T_MIN(1,d) = Error(idx_of_err_min);
    T_MIN(2,d) = idx_of_err_min;
    T_MIN(3,d) = -1;
  else
    T_MIN(1,d) = InvError(idx_of_inv_err_min);
    T_MIN(2,d) = idx_of_inv_err_min;
    T_MIN(3,d) = 1;
  end
  
end


dim = [];


best_dim = find(T_MIN(1,:) == min(T_MIN(1,:)));


dim = best_dim(1);  


tree_node_left.dim = dim;
tree_node_right.dim = dim;


TDS = sort(trainpat(dim,:));


lDS = length(TDS);


DS = TDS * 0;


i = 1;
j = 1;


while i <= lDS
  k = 0;
  while i + k <= lDS && TDS(i) == TDS(i+k) 
    DS(j) = TDS(i);
    k = k + 1;
  end
  i = i + k;
  j = j + 1;
end


DS = DS(1:j-1);
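% the threshold is placed halfway between the best distinct value and the next
% distinct value of the chosen dimension (or at the last value when the best
% value is already the last one)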


split = (DS(T_MIN(2,dim)) + DS(min(T_MIN(2,dim) + 1, length(DS)))) / 2;


split_error = T_MIN(1,dim);


tree_node_left.right_constrain = split;
tree_node_right.left_constrain = split;


function [i,t] = weakLearner(distribution,train,label)
%disp('run weakLearner');
    for tt = unique(train)%1:(16*256-1)    
        error(tt)=(label .* distribution) * ((train(:,floor(tt/16)+1)>=16*(mod(tt,16)+1)));         
    end
    [val,tt]=max(abs(error-0.5));
    
    i=floor(tt/16)+1;
    t=16*(mod(tt,16)+1);
    return;



calc_output.m


%   This file is part of GML Matlab Toolbox
%   For conditions of distribution and use, see the accompanying License.txt file.
%
%   calc_output Implements classification of input by a classification tree node
%~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
%
%    y = calc_output(tree_node, XData)
%    ---------------------------------------------------------------------------------
%    Arguments:
%           tree_node - classification tree node
%           XData     - data, that will be classified
%    Return:
%           y         - +1, if XData belongs to tree node, -1 otherwise (y is a vector)





function y = calc_output(tree_node, XData)
y = XData(tree_node.dim, :) * 0 + 1;
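% start with 1 for every sample; the multiplications below zero out the entries
% of samples that violate any constraint of this node or of its ancestors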




for i = 1 : length(tree_node.parent)
  y = y .* calc_output(tree_node.parent, XData);
end


if( length(tree_node.right_constrain) > 0)
  y = y .* ((XData(tree_node.dim, :) < tree_node.right_constrain));
end
if( length(tree_node.left_constrain) > 0)
  y = y .* ((XData(tree_node.dim, :) > tree_node.left_constrain));
end





