train, validation and test
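The data is usually split into train and test; a validation part is then split off from train, and this second step is repeated n times (n is usually 10).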
Generally splits are done like this:
a) Train
b) Test
Generally, the train data is then split into n parts: n−1 of them are used for training and the remaining one is used for validation. This process is repeated until each of the n parts has served as the validation set exactly once.
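The procedure above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not a reference implementation: the helper names `train_test_split_indices` and `k_fold_splits`, the 20% test fraction, and the fixed seed are all assumptions made for the example.

```python
import random

def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and hold out a test set (hypothetical helper)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(n_samples * test_fraction)
    # The first n_test shuffled indices become test, the rest become train.
    return indices[n_test:], indices[:n_test]

def k_fold_splits(train_indices, n_folds=10):
    """Yield (fit, validation) index lists; each part is validation once."""
    # Part i takes every n_folds-th index starting at i,
    # so part sizes differ by at most one.
    folds = [train_indices[i::n_folds] for i in range(n_folds)]
    for i, validation in enumerate(folds):
        # The other n_folds - 1 parts are merged and used for fitting.
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, validation

train_idx, test_idx = train_test_split_indices(100, test_fraction=0.2)
splits = list(k_fold_splits(train_idx, n_folds=10))
```

The test indices are set aside before the folds are built, so they never appear in any fit or validation list.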
out of sample and in sample
split | out of sample data | in sample data |
---|---|---|
train | no | yes |
validation | no | yes |
test | yes | no |
in sample testing <-> purpose: high train accuracy