Andrew Ng's vivid explanation of dev / test sets

The dev set is also called the development set, or sometimes the hold-out cross-validation set. The workflow in machine learning is that you try lots of ideas, training different models on the training set, and then use the dev set to evaluate the different ideas and pick one. You keep iterating to improve dev set performance until you finally have one classifier that you're happy with, which you then evaluate on your test set.
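The workflow above can be sketched in a few lines of Python. This is only a toy illustration, not code from the lecture: the data, the three "ideas" (`fit_mean`, `fit_midpoint`, `fit_boundary`) and the `accuracy` helper are all hypothetical names made up for this example.

```python
import random

def accuracy(threshold, examples):
    """Fraction of (x, label) pairs where `x > threshold` matches the label."""
    return sum((x > threshold) == y for x, y in examples) / len(examples)

# Three hypothetical "ideas": each one fits a threshold on the training set.
def fit_mean(train):
    return sum(x for x, _ in train) / len(train)

def fit_midpoint(train):
    pos = [x for x, y in train if y]
    neg = [x for x, y in train if not y]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def fit_boundary(train):
    largest_neg = max(x for x, y in train if not y)
    smallest_pos = min(x for x, y in train if y)
    return (largest_neg + smallest_pos) / 2

# Toy data (assumption): the true label is simply x > 0.6.
rng = random.Random(0)
data = [(x, x > 0.6) for x in (rng.random() for _ in range(300))]
train, dev, test = data[:200], data[200:250], data[250:]

# Train every idea on the training set, then score each one on the dev set.
ideas = {"mean": fit_mean, "midpoint": fit_midpoint, "boundary": fit_boundary}
models = {name: fit(train) for name, fit in ideas.items()}
dev_scores = {name: accuracy(t, dev) for name, t in models.items()}

# Pick the idea that does best on the dev set ...
best_name = max(dev_scores, key=dev_scores.get)

# ... and only then evaluate that single choice on the test set.
test_score = accuracy(models[best_name], test)
print(best_name, dev_scores, test_score)
```

The point of the sketch is the order of operations: the dev set drives the choice between ideas, and the test set is touched once, at the end.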

Now, let's say, by way of example, that you are building a cat classifier and you operate in several regions. If the dev set comes from some of those regions and the test set from the others, as you see in the picture, it's a very bad idea, because your dev and test sets come from different distributions.


The dev set plus your single real-number evaluation metric is like placing a target and telling your team where you think the bullseye is that you want to aim at. Once you have established the dev set and the metric, the team can iterate very quickly: try different ideas, run experiments, and very quickly use the dev set and the metric to evaluate the classifiers and pick the best one. Machine learning teams are good at shooting different arrows at the target and iterating to get closer and closer to the bullseye, that is, to doing well on your dev set and metric.

The problem with setting up the dev and test sets as in the example above is that you may waste months of work optimizing for the dev set, only to find that the result does not perform well on the test set. Having dev and test sets from different distributions is like setting a target, having your team spend months trying to get closer and closer to the bullseye, only to realize after months of work that you are going to move the target somewhere else.

So, to avoid this, you should take all the data and randomly shuffle it into the dev set and the test set, so that the dev set and the test set really come from the same distribution.
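A minimal sketch of that shuffle-and-split step, using only the Python standard library. The region names, the example identifiers, and the `shuffle_split` helper with its `dev_fraction` parameter are hypothetical, chosen just to make the idea concrete.

```python
import random
from collections import Counter

def shuffle_split(examples, dev_fraction=0.5, seed=0):
    """Shuffle a copy of `examples`, then cut it into (dev, test)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * dev_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical pooled data: 250 examples from each of four regions.
regions = ["region_A", "region_B", "region_C", "region_D"]
examples = [(f"img_{r}_{i}", r) for r in regions for i in range(250)]

dev, test = shuffle_split(examples)

# After shuffling the pooled data, both sets contain a mix of every region.
dev_regions = Counter(region for _, region in dev)
test_regions = Counter(region for _, region in test)
print("dev: ", dict(dev_regions))
print("test:", dict(test_regions))
```

Because the split happens after pooling and shuffling, each region shows up in both sets in roughly the same proportion, rather than some regions appearing only in dev and others only in test.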
