03-Data Resampling

1. Bootstrap

Draw a "bootstrap sample" by sampling n times with replacement from the original sample of size n.

The bootstrap approximates the sampling variability of a statistic by recomputing it on many bootstrap samples, which makes it well suited to estimating confidence intervals.

A confidence interval provides a range of values which is likely to contain the population parameter of interest.

Example: I am 95% confident that the population mean lies in the interval (x1, x2).
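
As a minimal sketch of how a bootstrap confidence interval could be computed, the snippet below uses the percentile method with only the Python standard library. The function name `bootstrap_ci`, its parameters, and the `data` values are hypothetical and chosen for illustration; the percentile method is one of several ways to turn bootstrap estimates into an interval.

```python
import random
import statistics


def bootstrap_ci(sample, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(sample) (illustrative helper)."""
    rng = random.Random(seed)
    n = len(sample)
    # Each bootstrap sample draws n values WITH replacement from the original sample.
    estimates = sorted(stat(rng.choices(sample, k=n)) for _ in range(n_resamples))
    # Use the alpha/2 and 1 - alpha/2 empirical quantiles of the resampled statistic.
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Made-up measurements, for illustration only.
data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7, 3.9, 3.1]
x1, x2 = bootstrap_ci(data)
print(f"95% bootstrap CI for the mean: ({x1:.2f}, {x2:.2f})")
```

Fixing `seed` makes the interval reproducible; with a different seed the endpoints shift slightly, which reflects the Monte Carlo error of the resampling itself.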


2. Permutation

Concatenate two datasets A and B, shuffle the pooled observations, then split them into a new A and a new B of the original sizes. Unlike the bootstrap, this resamples without replacement.

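Permutation tests test a specific null hypothesis of exchangeability: if the group labels do not matter, reassigning them at random should not change the distribution of the test statistic.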
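
The procedure can be sketched as a two-sample permutation test on the difference of means. This is only an illustration under assumptions: the name `permutation_test`, the choice of test statistic, and the example groups are hypothetical, and the "+1" correction in the p-value is one common convention rather than the only one.

```python
import random
import statistics


def permutation_test(a, b, n_permutations=5000, seed=0):
    """Two-sided permutation p-value for a difference in means (illustrative helper)."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)  # concatenate A and B
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign observations, without replacement
        new_a, new_b = pooled[:len(a)], pooled[len(a):]
        if abs(statistics.mean(new_a) - statistics.mean(new_b)) >= observed:
            at_least_as_extreme += 1
    # Count the observed labelling itself so the p-value is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Made-up groups, for illustration only.
group_a = [1, 2, 3, 4, 5]
group_b = [11, 12, 13, 14, 15]
p_value = permutation_test(group_a, group_b)
print(f"p-value: {p_value:.4f}")
```

A small p-value means that few random relabellings produce a difference as large as the observed one, which is evidence against exchangeability.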


3. Cross validation

In its leave-one-out form, cross-validation removes one point at a time, fits the model to the remaining points, then checks how well the removed point is predicted.

Cross-validation is primarily a way of measuring the predictive performance of a statistical model.

Cross-validation assesses the predictive performance of a model by judging how it performs outside the sample, on a new data set also known as test data.

The motivation is that when we fit a model, we fit it to a training data set. Without cross-validation, we only know how the model performs on in-sample data. Ideally, we would like to know how accurate its predictions are on new data. In science, theories are judged by their predictive performance.

Two common types of cross-validation are leave-one-out and k-fold.
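
Both variants can be sketched with one routine, since leave-one-out is k-fold with k equal to the number of points. The example below is a rough illustration, not a reference implementation: the model (a least-squares straight line), the helper names `fit_line`, `k_fold_mse` and `leave_one_out_mse`, and the data are all assumptions made for the example.

```python
import random


def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x (illustrative model)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def k_fold_mse(xs, ys, k, seed=0):
    """Mean squared prediction error on held-out points over k folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint held-out sets
    squared_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        # Fit only on the training points, then predict the held-out points.
        intercept, slope = fit_line([xs[i] for i in train], [ys[i] for i in train])
        squared_errors.extend((ys[i] - (intercept + slope * xs[i])) ** 2 for i in held_out)
    return sum(squared_errors) / len(squared_errors)


def leave_one_out_mse(xs, ys):
    """Leave-one-out is k-fold with one point per fold."""
    return k_fold_mse(xs, ys, k=len(xs))


# Made-up data roughly following y = 2x + 1, for illustration only.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1.2, 2.9, 5.1, 7.0, 8.8, 11.3, 12.9, 15.2, 16.8, 19.1]
print("5-fold MSE:       ", k_fold_mse(xs, ys, k=5))
print("leave-one-out MSE:", leave_one_out_mse(xs, ys))
```

Leave-one-out uses almost all the data for every fit but needs one fit per point; k-fold with a small k is cheaper at the cost of training on less data in each round.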
