Notes on Week 3 of Andrew Ng's Machine Learning course on Coursera

Evaluating a Learning Algorithm

Evaluating a Hypothesis

When our predictions have unacceptably large errors, we can troubleshoot with the following options:

  • Getting more training examples: fixes high variance
  • Trying smaller sets of features: fixes high variance
  • Trying additional features: fixes high bias
  • Trying polynomial features: fixes high bias
  • Increasing λ: fixes high variance
  • Decreasing λ: fixes high bias

A hypothesis may already achieve very low error on the training set yet still be inaccurate, because it is overfitting. To evaluate a hypothesis, we therefore split the data into two sets: a training set (70%) and a test set (30%).
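The 70/30 split above can be sketched as follows. This is a minimal NumPy sketch; the function name `train_test_split` and the toy data are illustrative, not from the course:

```python
import numpy as np

def train_test_split(X, y, test_ratio=0.3, seed=0):
    """Randomly split the data into a training set (70%) and a test set (30%)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))          # shuffle so the split is random
    n_test = int(len(X) * test_ratio)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]

# toy data: 10 examples with 2 features each
X = np.arange(20).reshape(10, 2)
y = np.arange(10)
X_train, y_train, X_test, y_test = train_test_split(X, y)
print(len(X_train), len(X_test))  # 7 3
```

Shuffling before splitting matters: if the data is ordered (e.g. by class), a plain slice would give unrepresentative sets.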


Model Selection and Train/Validation/Test Sets

One way to break down our dataset into the three sets is:

  • Training set: 60%
  • Cross validation set: 20%
  • Test set: 20%

We can now calculate three separate error values for the three different sets using the following method:

  1. Optimize the parameters in Θ using the training set for each polynomial degree.
  2. Find the polynomial degree d with the least error using the cross validation set.
  3. Estimate the generalization error using the test set with J_test(Θ^(d)), where d is the degree of the polynomial with the lowest cross validation error.

This way, the degree of the polynomial d has not been trained using the test set.
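The three steps above can be sketched with NumPy's `polyfit` on synthetic quadratic data (all variable names and the data here are illustrative assumptions, not from the course):

```python
import numpy as np

# synthetic data: y = 1 + 2x + 3x^2 plus a little noise
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 100)
y = 1 + 2 * x + 3 * x**2 + rng.normal(0, 0.1, 100)

# 60/20/20 split into training / cross validation / test sets
x_tr, y_tr = x[:60], y[:60]
x_cv, y_cv = x[60:80], y[60:80]
x_te, y_te = x[80:], y[80:]

def mse(coef, x, y):
    """Mean squared error of a polynomial fit."""
    return np.mean((np.polyval(coef, x) - y) ** 2)

# 1. optimize the parameters on the training set for each degree d
fits = {d: np.polyfit(x_tr, y_tr, d) for d in range(1, 9)}
# 2. pick the degree d with the lowest cross validation error
best_d = min(fits, key=lambda d: mse(fits[d], x_cv, y_cv))
# 3. estimate the generalization error on the test set
print(best_d, mse(fits[best_d], x_te, y_te))
```

Because the test set was never used to choose d, the final test error is an unbiased estimate of generalization error.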

Bias vs Variance

Diagnosing Bias vs. Variance

  • High bias (underfitting): J_train(Θ) is high and J_cv(Θ) ≈ J_train(Θ).
  • High variance (overfitting): J_train(Θ) is low and J_cv(Θ) ≫ J_train(Θ).

Regularization and Bias/Variance

A large λ causes high bias (underfitting); a small λ causes high variance (overfitting). Choose λ by training with a range of values and picking the one with the lowest cross validation error.
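Selecting λ by cross validation can be sketched with a regularized (ridge) linear model; a minimal NumPy sketch, where the synthetic data and the closed-form regularized normal equation (XᵀX + λI)⁻¹Xᵀy are my assumptions:

```python
import numpy as np

# synthetic linear data: only the first 3 of 10 features matter
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 10))
theta_true = np.zeros(10)
theta_true[:3] = [2.0, -1.0, 0.5]
y = X @ theta_true + rng.normal(0, 0.5, 40)
X_tr, y_tr, X_cv, y_cv = X[:25], y[:25], X[25:], y[25:]

def ridge_fit(X, y, lam):
    """Closed-form regularized normal equation: (X'X + lam*I)^-1 X'y."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

# train with each lambda, then keep the one with the lowest CV error
lambdas = [0.01, 0.03, 0.1, 0.3, 1, 3, 10]
errs = {lam: np.mean((X_cv @ ridge_fit(X_tr, y_tr, lam) - y_cv) ** 2)
        for lam in lambdas}
best_lam = min(errs, key=errs.get)
print(best_lam, errs[best_lam])
```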

Learning Curve

  • High bias: as the training set size m grows, J_train(Θ) and J_cv(Θ) both converge to a high error; getting more training data will not help much by itself.
  • High variance: J_train(Θ) stays low while J_cv(Θ) remains much higher, leaving a gap between the two curves; getting more training data is likely to help.
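A learning curve can be computed by training on increasing subsets of the training data and evaluating both errors at each size; a minimal NumPy sketch with synthetic linear data (all names and data here are illustrative):

```python
import numpy as np

# synthetic data: y = 2x plus noise; the linear model matches the true model
rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, 120)
y = 2 * x + rng.normal(0, 0.2, 120)
x_tr, y_tr, x_cv, y_cv = x[:100], y[:100], x[100:], y[100:]

def mse(coef, x, y):
    return np.mean((np.polyval(coef, x) - y) ** 2)

# train on the first m examples, evaluate on those m and on the full CV set
results = {}
for m in (5, 20, 50, 100):
    coef = np.polyfit(x_tr[:m], y_tr[:m], 1)
    results[m] = (mse(coef, x_tr[:m], y_tr[:m]), mse(coef, x_cv, y_cv))
    print(m, results[m])
```

Plotting J_train and J_cv against m (e.g. with matplotlib) gives the learning curves discussed above.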

Diagnosing Neural Networks

  • A neural network with fewer parameters is prone to underfitting. It is also computationally cheaper.
  • A large neural network with more parameters is prone to overfitting. It is also computationally expensive. In this case you can use regularization (increase λ) to address the overfitting.

Building a Spam Classifier

Prioritizing What to Work On

  • Collect lots of data (for example the “honeypot” project, though more data does not always help)
  • Develop sophisticated features (for example: using email header data in spam emails)
  • Develop algorithms to process your input in different ways (recognizing misspellings in spam).


Error Analysis

  • Start with a simple algorithm, implement it quickly, and test it early on your cross validation data.
  • Plot learning curves to decide if more data, more features, etc. are likely to help.
  • Manually examine the errors on examples in the cross validation set and try to spot a trend where most of the errors were made.

Handling Skewed Data


F1 Score: F1 = 2PR / (P + R), where precision P = true positives / (true positives + false positives) and recall R = true positives / (true positives + false negatives). On skewed classes, precision and recall are more informative than accuracy alone.
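The F1 score can be computed directly from the confusion-matrix counts; a minimal plain-Python sketch (the helper name `f1_score` and the toy labels are illustrative):

```python
def f1_score(y_true, y_pred):
    """Precision, recall, and F1 for a binary classifier (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# skewed data: a classifier that predicts all zeros would score high
# accuracy here, but its F1 would be 0
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
print(f1_score(y_true, y_pred))  # (0.5, 0.5, 0.5)
```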
