We have already seen that, in the maximum likelihood approach, performance on the training set is not a good indicator of predictive performance on unseen data, because of the problem of over-fitting. If data is plentiful, one approach is simply to use some of the available data to train a range of models, or a given model with a range of values for its complexity parameters, and then to compare them on independent data, sometimes called a validation set, and select the one with the best predictive performance.
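The hold-out procedure described above can be sketched in a few lines. This is only an illustrative sketch under assumptions that are not in the original text: the models are polynomials whose complexity parameter is the degree, the data are synthetic noisy samples of a sine curve, the comparison metric is root-mean-square error, and the function name `select_degree` is hypothetical.

```python
import numpy as np

def select_degree(x_train, t_train, x_val, t_val, degrees):
    """Fit one polynomial per degree on the training set and
    return (best_degree, {degree: validation RMSE})."""
    errors = {}
    for m in degrees:
        # Least-squares fit using the training data only
        coeffs = np.polyfit(x_train, t_train, deg=m)
        # Evaluate on the independent validation data
        residuals = np.polyval(coeffs, x_val) - t_val
        errors[m] = float(np.sqrt(np.mean(residuals ** 2)))
    # Select the degree with the lowest validation error
    best = min(errors, key=errors.get)
    return best, errors

# Synthetic data (an assumption for illustration): noisy sin(2*pi*x)
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=200)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=200)

# Hold out half of the data as the validation set
x_train, t_train = x[:100], t[:100]
x_val, t_val = x[100:], t[100:]

best_degree, val_errors = select_degree(
    x_train, t_train, x_val, t_val, degrees=range(0, 10)
)
print(best_degree, val_errors[best_degree])
```

Note that the validation error of the selected model is itself an optimistic estimate once it has been used for selection, so a final performance figure would normally come from a further held-out set.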
model selection