matlab k fold,求教:m-fold cross validation 在matlab下如何实现？

最新推荐文章于 2024-04-11 00:30:00 发布

喧闹罒蛰伏

最新推荐文章于 2024-04-11 00:30:00 发布

阅读量158

点赞数

文章标签： matlab k fold

该楼层疑似违规已被系统折叠隐藏此楼查看此楼

crossvalind - Generate cross-validation indices

Syntax

Indices = crossvalind('Kfold', N, K)

[Train, Test] = crossvalind('HoldOut', N, P)

[Train, Test] = crossvalind('LeaveMOut', N, M)

[Train, Test] = crossvalind('Resubstitution', N, [P,Q])

[...] = crossvalind(Method, Group, ...)

[...] = crossvalind(Method, Group, ..., 'Classes', C)

[...] = crossvalind(Method, Group, ..., 'Min', MinValue)

Description

Indices = crossvalind('Kfold', N, K) returns randomly generated indices for a K-fold cross-validation of N observations. Indices contains equal (or approximately equal) proportions of the integers 1 through K that define a partition of the N observations into K disjoint subsets. Repeated calls return different randomly generated partitions. K defaults to 5 when omitted. In K-fold cross-validation, K-1 folds are used for training and the last fold is used for evaluation. This process is repeated K times, leaving one different fold for evaluation each time.

[Train, Test] = crossvalind('HoldOut', N, P) returns logical index vectors for cross-validation of N observations by randomly selecting P*N (approximately) observations to hold out for the evaluation set. P must be a scalar between 0 and 1. P defaults to 0.5 when omitted, corresponding to holding 50% out. Using holdout cross-validation within a loop is similar to K-fold cross-validation one time outside the loop, except that non-disjointed subsets are assigned to each evaluation.

[Train, Test] = crossvalind('LeaveMOut', N, M), where M is an integer, returns logical index vectors for cross-validation of N observations by randomly selecting M of the observations to hold out for the evaluation set. M defaults to 1 when omitted. Using LeaveMOut cross-validation within a loop does not guarantee disjointed evaluation sets. Use K-fold instead.

[Train, Test] = crossvalind('Resubstitution', N, [P,Q]) returns logical index vectors of indices for cross-validation of N observations by randomly selecting P*N observations for the evaluation set and Q*N observations for training. Sets are selected in order to minimize the number of observations that are used in both sets. P and Q are scalars between 0 and 1. Q=1-P corresponds to holding out (100*P)%, while P=Q=1 corresponds to full resubstitution. [P,Q] defaults to [1,1] when omitted.

[...] = crossvalind(Method, Group, ...) takes the group structure of the data into account. Group is a grouping vector that defines the class for each observation. Group can be a numeric vector, a string array, or a cell array of strings. The partition of the groups depends on the type of cross-validation: For K-fold, each group is divided into K subsets, approximately equal in size. For all others, approximately equal numbers of observations from each group are selected for the evaluation set. In both cases the training set contains at least one observation from each group.

[...] = crossvalind(Method, Group, ..., 'Classes', C) restricts the observations to only those values specified in C. C can be a numeric vector, a string array, or a cell array of strings, but it is of the same form as Group. If one output argument is specified, it contains the value 0 for observations belonging to excluded classes. If two output arguments are specified, both will contain the logical value false for observations belonging to excluded classes.

[...] = crossvalind(Method, Group, ..., 'Min', MinValue) sets the minimum number of observations that each group has in the training set. Min defaults to 1. Setting a large value for Min can help to balance the training groups, but adds partial resubstitution when there are not enough observations. You cannot set Min when using K-fold cross-validation.

Examples

Create a 10-fold cross-validation to compute classification error.

load fisheriris

indices = crossvalind('Kfold',species,10);

cp = classperf(species);

for i = 1:10

test = (indices == i); train = ~test;

class = classify(meas(test,:),meas(train,:),species(train,:));

classperf(cp,class,test)

end

cp.ErrorRate

Approximate a leave-one-out prediction error estimate.

load carbig

x = Displacement; y = Acceleration;

N = length(x);

sse = 0;

for i = 1:100

[train,test] = crossvalind('LeaveMOut',N,1);

yhat = polyval(polyfit(x(train),y(train),2),x(test));

sse = sse + sum((yhat - y(test)).^2);

end

CVerr = sse / 100

Divide cancer data 60/40 without using the 'Benign' observations. Assume groups are the true labels of the observations.

labels = {'Cancer','Benign','Control'};

groups = labels(ceil(rand(100,1)*3));

[train,test] = crossvalind('holdout',groups,0.6,'classes',...

{'Control','Cancer'});

sum(test) % Total groups allocated for testing

sum(train) % Total groups allocated for training