Coursera Machine Learning Week 6: System Design Quiz

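This post works through the quiz for Week 6 (machine learning system design) of the Coursera Machine Learning course, covering spam classification, model performance evaluation, threshold adjustment, and handling skewed datasets. Working through the questions reinforces concepts such as recall, precision, and the effect of the decision threshold.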

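I took this quiz several times without ever getting every question right, so I am writing down my understanding of the questions here to help it stick.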


1.

You are working on a spam classification system using regularized logistic regression. "Spam" is a positive class (y = 1) and "not spam" is the negative class (y = 0). You have trained your classifier and there are m = 1000 examples in the cross-validation set. The chart of predicted class vs. actual class is:

                     Actual Class: 1   Actual Class: 0
Predicted Class: 1   85                890
Predicted Class: 0   15                10

For reference:

  • Accuracy = (true positives + true negatives) / (total examples)
  • Precision = (true positives) / (true positives + false positives)
  • Recall = (true positives) / (true positives + false negatives)
  • F1 score = (2 * precision * recall) / (precision + recall)

What is the classifier's recall (as a value from 0 to 1)?

This question asks for the recall of the predictions.

By the formula, recall = True Positive / (True Positive + False Negative) = 85 / (85 + 15) = 0.85

If the question asked for precision instead: precision = True Positive / (True Positive + False Positive) = 85 / (85 + 890) ≈ 0.09
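The arithmetic for question 1 can be checked with a short Python sketch. It applies the four reference formulas from the question to the counts in the chart (true positives = 85, false positives = 890, false negatives = 15, true negatives = 10). The function name `classification_metrics` and the dictionary keys are illustrative choices, not part of the course material.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    Hypothetical helper for illustration; assumes tp + fp > 0 and tp + fn > 0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)   # of everything predicted spam, how much is spam
    recall = tp / (tp + fn)      # of all actual spam, how much was caught
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Counts read from the chart in question 1:
# row "Predicted Class: 1" -> 85 (actual 1), 890 (actual 0)
# row "Predicted Class: 0" -> 15 (actual 1), 10 (actual 0)
metrics = classification_metrics(tp=85, fp=890, fn=15, tn=10)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Recall comes out at 0.85 even though precision and accuracy are both below 0.1, which shows why a single metric can be misleading for a classifier like this one.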


2.

Suppose a massive dataset is available for training a learning algorithm. Training on a lot of data is likely to give good performance when two of the following conditions hold true.

Which are the two?

  • We train a learning algorithm with a large number of parameters (that is able to learn/represent fairly complex functions).
  • We train a learning algorithm with a small number of parameters (that is thus unlikely to overfit).
