Orthogonalization
Orthogonalization means you should ensure that adjusting “one parameter” affects only the specific aspect of your model that you want to optimize.
Chain of assumptions in ML
- Fit training set well on cost function
  - Larger network
  - Better optimization algorithm
- Fit dev set well on cost function
  - Regularization
  - Bigger training set
- Fit test set well on cost function
  - Bigger dev set
- Performs well in real world
  - Change dev set or cost function
Setting up your Goal
Precision and Recall
| GT \ Predict | True | False |
| --- | --- | --- |
| True | True Positive (TP) | False Negative (FN) |
| False | False Positive (FP) | True Negative (TN) |
$$Precision = \frac{TP}{TP+FP}$$

$$Recall = \frac{TP}{TP+FN}$$
An example: precision answers “what percentage of the watermelons we picked out are good melons”, while recall answers “what percentage of all the good melons did we pick out”. Recall and precision are a pair of conflicting measures: generally, when precision is high, recall tends to be low, and when recall is high, precision is often low.
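As a quick sanity check, here is a minimal sketch that computes precision and recall directly from labels and predictions; the `y_true`/`y_pred` arrays are made up for illustration (1 = good melon, 0 = bad melon).

```python
# Minimal sketch: precision and recall from raw labels and predictions.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 1, 0, 1, 0, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

precision = tp / (tp + fp)  # of the melons we picked, how many are good
recall = tp / (tp + fn)     # of all good melons, how many did we pick

print(f"precision={precision:.2f}, recall={recall:.2f}")  # 0.75, 0.60
```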
Single Number Evaluation Metric
Whether we are tuning hyperparameters, trying out different ideas for learning algorithms, or simply trying different options for building a machine learning system, progress is much faster if we have a single real-number evaluation metric that quickly tells us whether the new thing we just tried works better or worse than the last idea. In this case, we can use the F1-score as our evaluation metric:
$$F_{1} = \frac{2}{\frac{1}{P}+\frac{1}{R}}$$
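A short sketch of how a single number simplifies the comparison: the precision/recall values for the two classifiers below are hypothetical, invented only to show the ranking step.

```python
# Hypothetical precision/recall for two classifiers, invented for illustration.
classifiers = {
    "A": {"precision": 0.95, "recall": 0.90},
    "B": {"precision": 0.98, "recall": 0.85},
}

def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 / (1 / precision + 1 / recall)

# One real number per classifier makes the comparison immediate.
scores = {name: f1(**m) for name, m in classifiers.items()}
print(scores)                       # {'A': 0.924..., 'B': 0.910...}
print(max(scores, key=scores.get))  # 'A' wins on F1
```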
Satisficing and Optimizing Metric
If there are N metrics you care about, it is sometimes reasonable to pick one of them to be the optimizing metric, which you want to do as well on as possible, and treat the remaining N - 1 as satisficing metrics: as long as each of them reaches some threshold, such as a running time faster than 100 milliseconds, you don't care how far past that threshold it goes, but it does have to reach the threshold.
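A minimal sketch of this selection rule, assuming accuracy is the optimizing metric and running time the satisficing one; the candidate models and their numbers are hypothetical.

```python
# Model selection with one optimizing metric (accuracy) and one
# satisficing metric (running time). All numbers are hypothetical.
candidates = [
    {"name": "model_A", "accuracy": 0.90, "runtime_ms": 80},
    {"name": "model_B", "accuracy": 0.92, "runtime_ms": 95},
    {"name": "model_C", "accuracy": 0.95, "runtime_ms": 1500},
]

RUNTIME_THRESHOLD_MS = 100  # satisficing: must finish within 100 ms

# Keep only the models that satisfy the threshold, then optimize accuracy.
feasible = [m for m in candidates if m["runtime_ms"] < RUNTIME_THRESHOLD_MS]
best = max(feasible, key=lambda m: m["accuracy"])
print(best["name"])  # model_B: most accurate among models meeting the threshold
```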
Train/Dev/Test Set Distributions
By setting the dev set and the test set to the same distribution, you are really aiming at whatever target you hope your machine learning model will hit. The way you choose your training set will then affect how fast you can actually hit that target.
What I recommend for setting up a dev set and a test set is to choose them to reflect the data you expect to get in the future and consider important to do well on. In particular, the dev set and the test set should come from the same distribution.
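One simple way to get dev and test sets from the same distribution is to shuffle the pooled "future-like" data before splitting it. A minimal sketch, with a hypothetical `examples` list standing in for the real data:

```python
import random

# Hypothetical pool of examples that reflects the data we expect in the future.
examples = [f"example_{i}" for i in range(10_000)]

random.seed(0)
random.shuffle(examples)      # mix everything before splitting ...

dev_set = examples[:5_000]    # ... so dev and test are two halves of the same
test_set = examples[5_000:]   # shuffled pool, i.e. the same distribution
```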
Improving your Model Performance
There are two fundamental assumptions of supervised learning:
- You can fit the training set pretty well.
- The training set performance generalizes pretty well to the dev/test set.
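A rough sketch of checking these two assumptions by comparing the training error with the train/dev gap; the error rates below are made up, and the suggested fixes simply point back to the orthogonalization knobs listed earlier.

```python
# Rough check of the two assumptions, with made-up error rates.
train_error = 0.02   # assumption 1: fit the training set well
dev_error = 0.10     # assumption 2: that performance generalizes to the dev set

generalization_gap = dev_error - train_error

if generalization_gap > train_error:
    # Generalization is the bigger problem.
    print("Work on fitting the dev set (regularization, bigger training set).")
else:
    # Fitting the training set is the bigger problem.
    print("Work on fitting the training set (larger network, better optimizer).")
```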