OOF in Machine Learning

The answer in [1] reads:

OOF simply stands for "Out-of-fold" and refers to a step in the learning process when using k-fold validation in which the predictions from each set of folds are grouped together into one group of 1000 predictions. These predictions are now "out-of-the-folds" and thus error can be calculated on these to get a good measure of how good your model is.

In terms of learning more about it, there's really not a ton more to it than that, and it certainly isn't its own technique to learning or anything. If you have a follow up question that is small, please leave a comment and I will try and update my answer to include this.

EDIT: While ambling around the inter-webs I stumbled upon this[2] relatively similar question from Cross-Validated (with a slightly more detailed answer), perhaps it will add some intuition if you are still confused.

The answer in [2] reads:

When training on each fold (90%) of the data, you will then predict on the remaining 10%. With this 10% you will compute an error metric (RMSE, for example). This leaves you with: 10 values for RMSE, and 10 sets of corresponding predictions. There are 2 things to do with these results:

  1. Inspect the mean and standard deviation of your 10 RMSE values. k-fold takes random partitions of your data, and the error on each fold should not vary too greatly. If it does, your model (and its features, hyper-parameters etc.) cannot be expected to yield stable predictions on a test set.

  2. Aggregate your 10 sets of predictions into 1 set of predictions. For example, if your training set contains 1,000 data points, you will have 10 sets of 100 predictions (10*100 = 1000). When you stack these into 1 vector, you are now left with 1000 predictions: 1 for every observation in your original training set. These are called out-of-folds predictions. With these, you can compute the RMSE for your whole training set in one go, as rmse = compute_rmse(oof_predictions, y_train). This is likely the cleanest way to evaluate the final predictor.
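The two steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the data is synthetic, the model is a one-variable least-squares line standing in for any real model, and `fit_line` is a hypothetical helper. Only `compute_rmse`, `oof_predictions`, and `y_train` are names taken from the quoted answer.

```python
import math
import random

def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (hypothetical stand-in for any model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def compute_rmse(pred, true):
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))

# Synthetic training set (an assumption): 1,000 points, y = 3x + 1 plus noise.
rng = random.Random(0)
n, k = 1000, 10
x_train = [rng.uniform(0, 10) for _ in range(n)]
y_train = [3.0 * x + 1.0 + rng.gauss(0, 1) for x in x_train]

# Random partition into 10 disjoint folds of 100 rows each.
indices = list(range(n))
rng.shuffle(indices)
folds = [indices[f::k] for f in range(k)]

oof_predictions = [None] * n   # one slot per training row
fold_rmse = []
for held_out in folds:
    held = set(held_out)
    train = [i for i in indices if i not in held]          # the other 900 rows
    a, b = fit_line([x_train[i] for i in train], [y_train[i] for i in train])
    preds = [a * x_train[i] + b for i in held_out]         # predict the unseen 100
    fold_rmse.append(compute_rmse(preds, [y_train[i] for i in held_out]))
    for i, p in zip(held_out, preds):
        oof_predictions[i] = p                             # stack into one vector

# Step 1: spread of the 10 per-fold errors.
mean_rmse = sum(fold_rmse) / k
std_rmse = math.sqrt(sum((r - mean_rmse) ** 2 for r in fold_rmse) / k)

# Step 2: a single error over all 1,000 out-of-fold predictions.
oof_rmse = compute_rmse(oof_predictions, y_train)

print(f"per-fold RMSE: mean={mean_rmse:.3f}, std={std_rmse:.3f}")
print(f"OOF RMSE: {oof_rmse:.3f}")
```

Every entry of `oof_predictions` comes from a model that never saw that row during training, which is why the resulting RMSE is a fair estimate rather than a training-set fit.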

 

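In short, suppose you run 10-fold cross-validation on a training set of 1,000 rows:

10-fold CV gives 10 models. Each model is trained on 900 rows and predicts the remaining 100. The predictions that the 10 models make on their own held-out 100 rows are, taken together, the OOF predictions.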
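The 900/100 bookkeeping in this summary can be checked with a few lines of Python. This is a minimal sketch: `kfold_indices` is a hypothetical helper written for this illustration, not a library function.

```python
import random

def kfold_indices(n, k, seed=0):
    """Shuffle row indices 0..n-1 and split them into k disjoint folds (hypothetical helper)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[f::k] for f in range(k)]

n, k = 1000, 10
folds = kfold_indices(n, k)

for f, held_out in enumerate(folds):
    n_train = n - len(held_out)
    print(f"model {f}: trained on {n_train} rows, predicts {len(held_out)} held-out rows")

# Each row is held out by exactly one of the 10 models,
# so the OOF predictions cover the training set exactly once.
covered = sorted(i for fold in folds for i in fold)
print(covered == list(range(n)))
```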

[1]https://stackoverflow.com/questions/52396191/what-is-oof-approach-in-machine-learning

[2]https://stats.stackexchange.com/questions/161491/how-to-evaluate-the-final-model-after-k-fold-cross-validation
