ID3 classification decision tree algorithm implemented in MATLAB
function D = ID3(train_features, train_targets, params, region)
% Classify using Quinlan's ID3 algorithm
% Inputs:
%   train_features - Train features
%   train_targets  - Train targets
%   params         - [Number of bins for the data,
%                     percentage of incorrectly assigned samples at a node]
%   region         - Decision region vector: [-x x -y y number_of_points]
%
% Outputs:
%   D              - Decision surface

% Ni is the feature dimension, M the number of training samples
[Ni, M] = size(train_features);

% Get parameters
[Nbins, inc_node] = process_params(params);
inc_node = inc_node*M/100;

% Build the grid for the decision region;
% linspace(a, b, n) returns n evenly spaced points from a to b
N      = region(5);
mx     = ones(N,1) * linspace(region(1), region(2), N);
my     = linspace(region(3), region(4), N)' * ones(1,N);
flatxy = [mx(:), my(:)]';

% Preprocessing: decorrelate the features and the grid points with PCA
[f, t, UW, m]  = PCA(train_features, train_targets, Ni, region);
train_features = UW * (train_features - m*ones(1,M));
flatxy         = UW * (flatxy - m*ones(1,N^2));

% First, bin the data and the decision region data
[H, binned_features] = high_histogram(train_features, Nbins, region);
[H, binned_xy]       = high_histogram(flatxy, Nbins, region);

% Build the tree recursively
disp('Building tree')
tree = make_tree(binned_features, train_targets, inc_node, Nbins);

% Make the decision region according to the tree
disp('Building decision surface using the tree')
targets = use_tree(binned_xy, 1:N^2, tree, Nbins, unique(train_targets));

D = reshape(targets, N, N);
% END ID3
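The driver above leans on three helpers (process_params, PCA, high_histogram) that appear to come from the Classification Toolbox accompanying Duda, Hart and Stork's "Pattern Classification", so it is not runnable on its own. A minimal call might look like the sketch below; the toy data, bin count and region values are illustrative assumptions, not part of the listing.

% Hypothetical usage sketch -- assumes the Classification Toolbox helpers
% (process_params, PCA, high_histogram) are on the MATLAB path.
train_features = [randn(2,50), randn(2,50)+2];  % two 2-D Gaussian clouds
train_targets  = [zeros(1,50), ones(1,50)];     % binary class labels
params         = [10, 5];          % 10 bins, stop a node below 5% of samples
region         = [-5 5 -5 5 100];  % [-x x -y y number_of_points]
D = ID3(train_features, train_targets, params, region);
imagesc(linspace(-5,5,100), linspace(-5,5,100), D); axis xy  % view the surface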
function targets = use_tree(features, indices, tree, Nbins, Uc)
% Classify recursively using a tree

% size(features,2) is the number of samples to classify
targets = zeros(1, size(features,2));

if (size(features,1) == 1)
    % Only one dimension left, so work on it
    for i = 1:Nbins
        in = indices(find(features(indices) == i));
        if ~isempty(in)
            if isfinite(tree.child(i))
                targets(in) = tree.child(i);
            else
                % No data was found in the training set for this bin,
                % so choose a label randomly
                n           = 1 + floor(rand(1)*length(Uc));
                targets(in) = Uc(n);
            end
        end
    end
    return
end

% This is not the last level of the tree, so:
% First, find the dimension we are to work on
dim  = tree.split_dim;
dims = find(~ismember(1:size(features,1), dim));

% And classify according to it
for i = 1:Nbins
    in      = indices(find(features(dim, indices) == i));
    targets = targets + use_tree(features(dims, :), in, tree.child(i), Nbins, Uc);
end
% END use_tree
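To see the recursion's base case in isolation, one can hand-build a leaf-level tree and classify a single binned feature row with it. The struct values below are contrived for illustration only.

% Contrived sketch of the use_tree base case: one feature row, leaf-level tree.
% tree.child(i) holds the label for bin i; inf marks a bin never seen in
% training, for which use_tree draws a label at random.
tree.split_dim = 0;
tree.child     = [0 0 1 inf];   % labels for bins 1..4
features       = [1 3 4 2];     % one binned feature row, four samples
targets        = use_tree(features, 1:4, tree, 4, [0 1]);
% targets is [0 1 r 0], where r is a random draw from {0,1}
% for the unseen bin 4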
function tree = make_tree(features, targets, inc_node, Nbins)
% Build a tree recursively

[Ni, L] = size(features);
Uc      = unique(targets);

% When to stop: if only one dimension is left
% or the number of examples is small
if ((Ni == 1) | (inc_node > L))
    % Compute the children non-recursively
    tree.split_dim = 0;
    for i = 1:Nbins
        indices = find(features == i);
        if ~isempty(indices)
            if (length(unique(targets(indices))) == 1)
                % All samples in this bin carry the same label
                tree.child(i) = targets(indices(1));
            else
                % Otherwise assign the bin's majority label
                H      = hist(targets(indices), Uc);
                [m, T] = max(H);
                tree.child(i) = Uc(T);
            end
        else
            % Empty bin: mark it; use_tree assigns a random label later
            tree.child(i) = inf;
        end
    end
    return
end
% Compute the node's entropy I
for i = 1:length(Uc)
    Pnode(i) = length(find(targets == Uc(i))) / L;
end
Inode = -sum(Pnode.*log(Pnode)/log(2));

% For each dimension, compute the gain ratio impurity
delta_Ib = zeros(1, Ni);
P        = zeros(length(Uc), Nbins);
for i = 1:Ni
    for j = 1:length(Uc)
        for k = 1:Nbins
            indices = find((targets == Uc(j)) & (features(i,:) == k));
            P(j,k)  = length(indices);
        end
    end
    Pk   = sum(P);
    P    = P/L;
    Pk   = Pk/sum(Pk);
    info = sum(-P.*log(eps+P)/log(2));
    delta_Ib(i) = (Inode - sum(Pk.*info)) / (-sum(Pk.*log(eps+Pk)/log(2)));
end

% Find the dimension maximizing the gain ratio
[m, dim] = max(delta_Ib);

% Split along the 'dim' dimension, removing it from the feature set
tree.split_dim = dim;
dims = find(~ismember(1:Ni, dim));
for i = 1:Nbins
    indices = find(features(dim, :) == i);
    tree.child(i) = make_tree(features(dims, indices), targets(indices), inc_node, Nbins);
end
% END make_tree
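The quantity delta_Ib computed in make_tree is Quinlan's gain ratio: the information gain at the node divided by the split information of the candidate dimension. The self-contained check below replays the listing's formula on made-up counts (two classes, two bins, eight samples); the numbers are assumptions chosen only to make the arithmetic visible.

% Standalone check of the gain-ratio formula from make_tree (toy counts).
L     = 8;
P     = [3 1; 1 3] / L;       % P(j,k): fraction of class j landing in bin k
Pk    = sum(P);               % fraction of samples in each bin
Pnode = sum(P, 2)';           % class priors at the node
Inode = -sum(Pnode.*log(Pnode)/log(2));   % node entropy: 1 bit here
info  = sum(-P.*log(eps+P)/log(2));       % per-bin impurity, as in the listing
delta_Ib = (Inode - sum(Pk.*info)) / (-sum(Pk.*log(eps+Pk)/log(2)));
% delta_Ib is roughly 0.094: a weak but positive gain ratio for this split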