Class3-Week1 ML Strategy1

Orthogonalization

Orthogonalization means designing your system so that adjusting one "knob" (parameter) affects only the specific aspect of your model you want to optimize.

Chain of assumptions in ML

  1. Fit training set well on cost function
    • Large network
    • Optimization algorithm
  2. Fit dev set well on cost function
    • Regularization
    • Bigger training set
  3. Fit test set well on cost function
    • Bigger dev set
  4. Performs well in real world
    • Cost function

Setting up your Goal

Precision and Recall

| GT \ Predict | True | False |
| --- | --- | --- |
| True | True Positive (TP) | False Negative (FN) |
| False | False Positive (FP) | True Negative (TN) |

$$Precision = \frac{TP}{TP+FP}$$

$$Recall = \frac{TP}{TP+FN}$$

For example: precision answers "what fraction of the watermelons I picked out are good melons?", while recall answers "what fraction of all good melons did I pick out?". Precision and recall are typically a pair of conflicting measures: when precision is high, recall tends to be low, and when recall is high, precision is often low.
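The two formulas above can be sketched directly from the confusion-matrix counts (the watermelon numbers below are hypothetical, just for illustration):

```python
def precision(tp, fp):
    """Fraction of positive predictions that are actually positive: TP / (TP + FP)."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Fraction of actual positives that were found: TP / (TP + FN)."""
    return tp / (tp + fn)

# Hypothetical melon picker: 8 good melons picked correctly (TP),
# 2 bad melons picked by mistake (FP), 4 good melons missed (FN).
p = precision(8, 2)   # 0.8
r = recall(8, 4)      # 8/12 ≈ 0.667
```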

Single Number Evaluation Metric

Whether we are tuning hyperparameters, trying out different learning algorithms, or exploring different options for building a machine learning system, progress is much faster if we have a single real-number evaluation metric that quickly tells us whether the latest idea works better or worse than the previous one. In this case, we can use the F1-score as our evaluation metric:

$$F_{1} = \frac{2}{\frac{1}{P}+\frac{1}{R}}$$
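A minimal sketch of the formula: the F1-score is the harmonic mean of precision and recall, which penalizes an imbalanced pair more than the arithmetic mean would (the 0.8 / 2/3 values below are hypothetical):

```python
def f1_score(p, r):
    """Harmonic mean of precision P and recall R: 2 / (1/P + 1/R)."""
    return 2 / (1 / p + 1 / r)

# With precision 0.8 and recall 2/3, F1 = 8/11 ≈ 0.727,
# slightly below the arithmetic mean (≈ 0.733).
f1 = f1_score(0.8, 2 / 3)
```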

Satisficing and Optimizing Metric

If you have N metrics that you care about, it is often reasonable to pick one of them as the optimizing metric — the one you want to do as well as possible on — and treat the remaining N − 1 as satisficing metrics. A satisficing metric just has to reach some threshold, such as a running time under 100 milliseconds; beyond that threshold, you don't care how much better it gets, but it must reach the threshold.
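The selection rule above can be sketched in a few lines (the model names, accuracies, and running times are hypothetical):

```python
# Hypothetical classifiers: (name, accuracy, running time in ms).
models = [
    ("A", 0.90, 80),
    ("B", 0.92, 95),
    ("C", 0.95, 1500),  # most accurate, but too slow
]

# Satisficing metric: running time must be under 100 ms.
feasible = [m for m in models if m[2] < 100]

# Optimizing metric: among the feasible models, maximize accuracy.
best = max(feasible, key=lambda m: m[1])
```

Model C has the best accuracy but fails the satisficing threshold, so B is chosen.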


Train/Dev/Test Set Distributions

By setting the dev set and the test set to the same distribution, you are aiming at whatever target you hope your machine learning model will hit; the way you choose your training set then affects how fast you can actually hit that target.
What I recommend for setting up a dev set and test set: choose them to reflect the data you expect to get in the future and consider important to do well on. In particular, the dev set and the test set should come from the same distribution.
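One simple way to guarantee the same distribution, sketched here as an assumption rather than a prescription, is to pool all the "future-like" data first, shuffle it, and only then split it into dev and test:

```python
import random

def make_dev_test(examples, dev_frac=0.5, seed=0):
    """Shuffle the pooled data, then split it, so that the dev and test
    sets are random samples from the same distribution."""
    data = list(examples)
    random.Random(seed).shuffle(data)
    cut = int(len(data) * dev_frac)
    return data[:cut], data[cut:]

dev, test = make_dev_test(range(100))  # 50 dev examples, 50 test examples
```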


Improving your Model performance

There are two fundamental assumptions of supervised learning:

  • You can fit the training set pretty well.
  • The training set performance generalizes pretty well to the dev/test set.
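These two assumptions suggest a simple diagnostic, sketched below with hypothetical error numbers (human-level error is used as a proxy for Bayes error): the gap between human-level and training error measures how badly assumption 1 fails (avoidable bias), and the gap between training and dev error measures how badly assumption 2 fails (variance).

```python
# Hypothetical error rates for illustration.
human_error, train_error, dev_error = 0.01, 0.08, 0.10

avoidable_bias = train_error - human_error  # assumption 1: fit the training set
variance = dev_error - train_error          # assumption 2: generalize to dev/test

if avoidable_bias > variance:
    advice = "reduce bias: bigger network, train longer, better optimizer"
else:
    advice = "reduce variance: more data, regularization"
```

Here the bias gap (0.07) dominates the variance gap (0.02), so effort should go into fitting the training set better.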
