Contents
Week1
1 Introduction to ML strategy
1.1 Why ML Strategy
- There are many ways to improve a model; this lecture is about how to analyze the situation and choose which direction to optimize in
1.2 Orthogonalization
- Tune a different knob (hyperparameter) at each step, so that the controls stay orthogonal
- Early stopping is usually left out, because it affects two steps at once (fit on the training set and performance on the dev set), making it a less orthogonal knob
2 Setting up your Goal
2.1 Single Number evaluation metric
- Using a single number evaluation metric makes comparing models faster and more effective
2.2 Satisficing and Optimizing Metric
- Optimize one metric and treat the rest as constraints to satisfy (satisficing metrics)
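A minimal sketch of this rule: pick the model with the best optimizing metric among those that meet the satisficing constraint. The model names, accuracy numbers, and 100 ms runtime limit below are all made up for illustration.

```python
# Satisficing + optimizing metric: maximize accuracy subject to a
# runtime constraint. All numbers here are illustrative.
models = [
    {"name": "A", "accuracy": 0.90, "runtime_ms": 80},
    {"name": "B", "accuracy": 0.92, "runtime_ms": 95},
    {"name": "C", "accuracy": 0.95, "runtime_ms": 1500},  # best accuracy, but too slow
]

def pick_model(models, max_runtime_ms=100):
    # Keep only models that satisfy the constraint, then optimize accuracy.
    feasible = [m for m in models if m["runtime_ms"] <= max_runtime_ms]
    return max(feasible, key=lambda m: m["accuracy"]) if feasible else None

best = pick_model(models)
print(best["name"])  # B: highest accuracy among models meeting the constraint
```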
2.3 Train/Dev/Test Distributions
- Make the dev and test sets come from the same distribution
- Choose a dev set and test set to reflect data you expect to get in the future and consider important to do well on
2.4 Size of the Dev and Test Sets
- Set your dev set to be big enough to evaluate different hyper-parameters and pick the best one.
- Set your test set to be big enough to give high confidence in the overall performance of your system
2.5 When to Change Dev/Test Sets and Metrics?
- Even if you cannot find a perfect metric and dev/test set, just set something up quickly and use it to drive how fast your team iterates. If you later find it wasn't a good one and you have a better idea, change it then; that's perfectly okay.
- Avoid running for a long time without any evaluation metric or dev set, because that slows down how efficiently your team can iterate and improve the algorithm.
3 Comparing to human-level performance
3.1 Why Human-level Performance?
- Once the model surpasses human-level performance, these tactics (getting labeled data from humans, gaining insight from manual error analysis, and better analysis of bias/variance) become harder to apply.
3.2 Avoidable Bias
- Human-level error as a proxy for Bayes error
- The difference between Bayes error or approximation of Bayes error and the training error is the avoidable bias.
- You can't actually do better than Bayes error unless you're overfitting
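A minimal sketch of this diagnosis, using human-level error as the proxy for Bayes error (the error numbers passed in are illustrative):

```python
# Decide whether to focus on bias-reduction or variance-reduction
# techniques by comparing the two error gaps.
def diagnose(human_error, train_error, dev_error):
    avoidable_bias = train_error - human_error   # gap to (proxy) Bayes error
    variance = dev_error - train_error           # gap between train and dev
    return "bias" if avoidable_bias > variance else "variance"

print(diagnose(0.01, 0.08, 0.10))  # avoidable bias 0.07 > variance 0.02 -> bias
print(diagnose(0.01, 0.02, 0.10))  # variance 0.08 dominates -> variance
```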
3.3 Understanding Human-level Performance
3.4 Surpassing Human-level Performance
- Problems that are not natural perception tasks (humans are good at natural perception tasks)
3.5 Improving your Model performance
- Fewer layers could result in lower accuracy that is not offset by the lower training time.
Week2
1 Error Analysis
1.1 Carrying Out Error Analysis
- Analyze the misclassified examples and sort the different error types into a table like this, so you know which direction to work on and what the ceiling on each improvement is
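The spreadsheet-style tally can be sketched as follows; the error categories and tagged examples are hypothetical. Each category's fraction of the misclassified set is the ceiling on how much fixing that category could help.

```python
# Error analysis: tag each misclassified dev example with one or more
# error categories, then count what fraction each category accounts for.
from collections import Counter

mislabeled = [
    {"id": 1, "tags": ["dog"]},
    {"id": 2, "tags": ["blurry"]},
    {"id": 3, "tags": ["dog", "blurry"]},   # an example can have several tags
    {"id": 4, "tags": ["great_cat"]},
]

counts = Counter(tag for ex in mislabeled for tag in ex["tags"])
for tag, n in counts.most_common():
    print(f"{tag}: {n / len(mislabeled):.0%} of errors")
```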
1.2 Cleaning Up Incorrectly Labeled Data
- DL algorithms are quite robust to random errors in the training set, but not to systematic errors
- The goal of the dev set is to help you select between two classifiers A & B; in the example above, incorrectly labeled examples clearly affect which classifier you end up choosing
1.3 Build your First System Quickly, then Iterate
2 Mismatched Training and Dev/Test Set
2.1 Training and Testing on Different Distributions
2.2 Bias and Variance with Mismatched Data Distributions
- The right-hand case can happen, e.g. when we deliberately make the training data much harder than the dev and test sets
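In this part of the course the gaps are read off from four numbers: human-level, training, training-dev (held-out data drawn from the training distribution), and dev error. A minimal sketch with illustrative numbers:

```python
# Decompose the error gaps when training and dev/test distributions differ.
def decompose(human, train, train_dev, dev):
    return {
        "avoidable_bias": train - human,      # train vs. (proxy) Bayes error
        "variance": train_dev - train,        # same distribution, unseen data
        "data_mismatch": dev - train_dev,     # distribution shift itself
    }

gaps = decompose(human=0.01, train=0.02, train_dev=0.03, dev=0.10)
print(gaps)  # here data mismatch (~0.07) is the dominant problem
```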
2.3 Addressing Data Mismatch
- If the model has a data mismatch problem, first do error analysis on the training set and dev set to gain insight into how these two distributions of data might differ. Then look for ways to make the training set look more like the dev set; one such method is artificial data synthesis
- If we have only one hour of car noise but 10,000 hours of speech, the synthesized data may make the model overfit to that one hour of noise
- The caveat with artificial data synthesis is that we may be sampling only a tiny subset of the space of all possible examples
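A toy sketch of the synthesis step for the speech example; random arrays stand in for real audio, and the 0.3 noise level is an arbitrary assumption. Because the same short noise clip is tiled into every synthesized example, it is easy to see where overfitting to that clip would come from.

```python
# Artificial data synthesis: mix clean speech with a (single, short)
# car-noise clip. Arrays here are random stand-ins for real audio.
import numpy as np

rng = np.random.default_rng(0)
clean_speech = rng.standard_normal(16000)   # 1 s of "speech" at 16 kHz
car_noise = rng.standard_normal(4000)       # the one short noise clip we have

def synthesize(speech, noise, noise_level=0.3):
    # Tile/crop the noise to the speech length, then mix it in.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]
    return speech + noise_level * noise

noisy = synthesize(clean_speech, car_noise)
```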
3 Learning from Multiple Tasks
3.1 Transfer Learning
- Training the model on other data first is called pre-training (pre-initializing the weights of the neural network); continuing to train that model on a new dataset as needed is called fine-tuning
- Rule of thumb: with a small dataset, retrain only the last one or two layers of the network; with lots of data, you can retrain the whole network
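The rule of thumb can be written down as a simple policy; the layer names and the 10,000-example threshold below are hypothetical choices, not from the course.

```python
# Fine-tuning rule of thumb: small dataset -> retrain only the last
# layer or two; large dataset -> retrain every layer.
def layers_to_retrain(layer_names, n_examples, small_threshold=10_000):
    if n_examples < small_threshold:
        return layer_names[-2:]       # small data: only the final layers
    return list(layer_names)          # lots of data: the whole network

layers = ["conv1", "conv2", "conv3", "fc1", "fc2"]
print(layers_to_retrain(layers, 500))        # only the last two layers
print(layers_to_retrain(layers, 1_000_000))  # all layers
```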
3.2 Multi-task Learning
- Multi-task learning means one model predicts all four targets (as opposed to four separate models each predicting one target)
- If the network's early-layer features are shared across the different objects being recognized, training one neural network to do four things can perform better than training four completely independent networks
- The loss is still the logistic loss, summed over all targets (four in the example)
- Data with incomplete labels can still be used for training: entries marked '?' are simply skipped when computing the loss (as in the figure)
- Transfer learning is used more often in practice than multi-task learning
- In theory, the only situation where multi-task learning hurts performance is when the neural network is not big enough
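A pure-Python sketch of the masked multi-task loss, with `None` standing in for the '?' labels; the predictions and labels below are made up.

```python
# Multi-task logistic loss summed over tasks, skipping unlabeled entries.
import math

def multitask_loss(y_hat, y):
    """y_hat, y: lists of per-task lists; y entries may be None ('?')."""
    total, count = 0.0, 0
    for preds, labels in zip(y_hat, y):      # one row per example
        for p, t in zip(preds, labels):      # one column per task
            if t is None:
                continue                     # '?' labels are ignored
            total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
            count += 1
    return total / count

y_hat = [[0.9, 0.2, 0.8, 0.1]]
y = [[1, 0, None, 0]]                        # third task unlabeled
print(round(multitask_loss(y_hat, y), 4))    # 0.1446
```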
4 End-to-end Deep Learning
4.1 What is End-to-end Deep Learning?
4.2 Whether to use End-to-end Deep Learning