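Andrew Ng Machine Learning: Week 6

Preface

NetEase Cloud Classroom (bilingual subtitles, no buffering): https://study.163.com/course/courseMain.htm?courseId=1004570029
Coursera: https://www.coursera.org/learn/machine-learning
I'm a beginner. I watch the lectures on NetEase Cloud Classroom first, then do the assignments on Coursera, and keep this blog as a record. All images in this post are screenshots from the course.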

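Advice for Applying Machine Learning

1. Deciding What to Try Next

[Image]
Tips: Randomly split the data set into two parts. Use one part (the training set) to minimize J(θ), then plug the resulting θ into the same cost formula on the other part (the test set) to get Jtest(θ).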
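To make the split concrete, here is a minimal sketch in plain Python. The toy data, the 70/30 ratio, and the helper names `fit_line` and `cost` are my own assumptions for illustration, not something taken from the course code.

```python
import random

def fit_line(xs, ys):
    """Least-squares fit of h(x) = theta0 + theta1 * x (closed form)."""
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    theta1 = sxy / sxx
    theta0 = mean_y - theta1 * mean_x
    return theta0, theta1

def cost(theta, xs, ys):
    """Squared-error cost J = 1/(2m) * sum((h(x) - y)^2), no regularization."""
    theta0, theta1 = theta
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

# Hypothetical toy data: y = 2x + 1 plus a little noise.
rng = random.Random(0)
data = [(x, 2 * x + 1 + rng.gauss(0, 0.5)) for x in range(20)]

# Shuffle first so the split is random, then cut 70% / 30% (assumed ratio).
rng.shuffle(data)
split = len(data) * 7 // 10
train, test = data[:split], data[split:]

train_x, train_y = zip(*train)
test_x, test_y = zip(*test)

theta = fit_line(train_x, train_y)       # minimize J(theta) on the training part
j_train = cost(theta, train_x, train_y)
j_test = cost(theta, test_x, test_y)     # same formula, evaluated on held-out data
print(f"J_train = {j_train:.3f}, J_test = {j_test:.3f}")
```

The shuffle matters: if the data happens to be sorted, an unshuffled split would train and test on different regions of x.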

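2. Model Selection and Train/Validation/Test Sets

[Image]
Tips: Decide the polynomial degree of hθ(x).
[Image]
Tips: Randomly split the data into three parts: a training set, a cross-validation set, and a test set. For each candidate degree, minimize J(θ) on the training set, then plug the resulting θ into the cross-validation set (the second part in the figure) and compute its error. Pick the degree with the lowest cross-validation error, and finally measure generalization on the test set.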
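The three-way procedure can be sketched as follows. This is only an illustration under my own assumptions: synthetic quadratic data, a 60/20/20 split, candidate degrees 1 to 5, and a small hand-written normal-equation solver (`solve`, `fit_poly`) standing in for whatever optimizer is actually used.

```python
import random

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def fit_poly(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations (X^T X) theta = X^T y."""
    n = degree + 1
    xtx = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    xty = [sum((x ** i) * y for x, y in zip(xs, ys)) for i in range(n)]
    return solve(xtx, xty)

def cost(theta, xs, ys):
    """J = 1/(2m) * sum((h(x) - y)^2) with h(x) = sum(theta[i] * x^i)."""
    total = 0.0
    for x, y in zip(xs, ys):
        h = sum(t * x ** i for i, t in enumerate(theta))
        total += (h - y) ** 2
    return total / (2 * len(xs))

# Hypothetical data: a noisy quadratic.
rng = random.Random(1)
data = []
for _ in range(60):
    x = rng.uniform(-3, 3)
    data.append((x, x * x - x + rng.gauss(0, 0.3)))

# Random 60% / 20% / 20% split (assumed ratios).
rng.shuffle(data)
n = len(data)
train = data[: n * 6 // 10]
cv = data[n * 6 // 10 : n * 8 // 10]
test = data[n * 8 // 10 :]
train_x, train_y = zip(*train)
cv_x, cv_y = zip(*cv)
test_x, test_y = zip(*test)

# Fit every candidate degree on the training set, score it on the CV set.
thetas, cv_errors = {}, {}
for d in range(1, 6):
    thetas[d] = fit_poly(train_x, train_y, d)
    cv_errors[d] = cost(thetas[d], cv_x, cv_y)

best_d = min(cv_errors, key=cv_errors.get)
# The test set is touched only once, after the degree has been chosen.
j_test = cost(thetas[best_d], test_x, test_y)
print(f"chosen degree = {best_d}, J_test = {j_test:.3f}")
```

The degree is picked on the cross-validation set rather than the test set so that the test error stays an honest estimate of generalization.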

3. Diagnosing Bias vs. Variance

[Image]
Tips: Put simply, underfitting (or having too few parameters) means high bias, and overfitting means high variance.
[Image]
[Image]

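4. Regularization and Bias/Variance

[Image]
[Image]
Tips: This is the model selection procedure from before with a regularization term added; the same comparison on the cross-validation set is now used to choose λ.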
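For reference, the quantities involved can be written out as below. This is my own summary, assuming the regularized linear regression setting: the regularization term appears only in the objective that is minimized, while the errors that get compared are computed without it.

```latex
% Objective minimized on the training set for each candidate \lambda:
J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2
          + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2

% Errors used for comparison (no regularization term):
J_{train}(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2

J_{cv}(\theta) = \frac{1}{2m_{cv}}\sum_{i=1}^{m_{cv}}\left(h_\theta(x_{cv}^{(i)}) - y_{cv}^{(i)}\right)^2

J_{test}(\theta) = \frac{1}{2m_{test}}\sum_{i=1}^{m_{test}}\left(h_\theta(x_{test}^{(i)}) - y_{test}^{(i)}\right)^2
```

For each candidate λ, minimize J(θ), evaluate Jcv(θ), keep the λ with the smallest Jcv(θ), and report Jtest(θ) for that choice.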

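5. Learning Curves

[Image]
Tips: When the training set is small, the hypothesis can fit it easily, so the training cost is low. As the training set grows, the training cost rises and then levels off, because hθ(x) can no longer fit the data any better.
[Image]
Tips: When the training set is small, the hypothesis generalizes poorly. As the training set grows, hθ(x) again reaches the limit of what it can fit. Both costs end up high, and adding more training data barely changes them (the high-bias case).
[Image]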
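A learning curve is just the training and cross-validation cost recorded while the training set grows. The sketch below reproduces the high-bias shape by fitting a straight line to quadratic data; the data, the sizes, and the helper names are assumptions of mine chosen for illustration.

```python
import random

def fit_line(xs, ys):
    """Least-squares fit of h(x) = theta0 + theta1 * x (closed form)."""
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    theta1 = sxy / sxx
    return mean_y - theta1 * mean_x, theta1

def cost(theta, xs, ys):
    """J = 1/(2m) * sum((h(x) - y)^2), no regularization."""
    theta0, theta1 = theta
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * len(xs))

rng = random.Random(2)

def sample(n):
    """Hypothetical data: y = x^2 plus noise, which a straight line underfits."""
    points = []
    for _ in range(n):
        x = rng.uniform(-3, 3)
        points.append((x, x * x + rng.gauss(0, 0.2)))
    return points

train = sample(40)
cv = sample(20)
cv_x, cv_y = zip(*cv)

curve = []  # one (m, J_train, J_cv) triple per training-set size
for m in range(2, len(train) + 1):       # a line needs at least 2 points
    sub_x, sub_y = zip(*train[:m])
    theta = fit_line(sub_x, sub_y)       # train on the first m examples only
    curve.append((m, cost(theta, sub_x, sub_y), cost(theta, cv_x, cv_y)))

for m, j_train, j_cv in curve[::6]:
    print(f"m = {m:2d}  J_train = {j_train:.3f}  J_cv = {j_cv:.3f}")
```

With two examples the line fits the training data perfectly, so the training cost starts near zero; by the end both costs sit at a high plateau, which is the sign that more data alone will not help.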

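6. Deciding What to Do Next

[Image]
Tips: High bias can be understood as underfitting because λ is too large or there are too few features; in that case, decrease λ and add features. High variance can be understood as overfitting because λ is too small or there are too many features; in that case, increase λ, get more training data, and reduce the number of features.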

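7. Error Metrics for Skewed Classes

[Image]
Tips: Two new definitions. Precision is the fraction of examples predicted as 1 that really are 1: (true positives) / (total predicted as 1). Recall is the fraction of examples that really are 1 that were predicted as 1: (true positives) / (total actually 1).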
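A small sketch of the two definitions, using made-up labels. The function name `precision_recall` and the convention of returning 0.0 when a denominator is zero are assumptions of mine.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Assumed convention: report 0.0 when the denominator would be zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical skewed data: only 2 of the 10 examples are positive.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

p, r = precision_recall(y_true, y_pred)
print(p, r)  # 0.5 0.5

# Always predicting 0 still gets 80% accuracy here, but recall exposes it.
always_zero = [0] * len(y_true)
accuracy = sum(1 for t, q in zip(y_true, always_zero) if t == q) / len(y_true)
print(accuracy, precision_recall(y_true, always_zero))  # 0.8 (0.0, 0.0)
```

This is why accuracy alone is misleading on skewed classes: a classifier that never predicts the rare class can look good until precision and recall are checked.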

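8. Trading Off Precision and Recall

[Image]
[Image]
Tips: This changes how logistic regression decides between 1 and 0. Previously we predicted 1 whenever hθ(x) > 0.5; that 0.5 is the threshold. Raising the threshold means predicting 1 only when the model is more confident, which increases precision at the cost of recall. Likewise, lowering the threshold increases recall at the cost of precision.
[Image]
Tips: Choose the threshold by comparing the score from the formula above (the F1 score, 2PR/(P+R)) across thresholds; the larger, the better.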
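Here is a rough sketch of picking a threshold by F1 score. The predicted probabilities, labels, and candidate thresholds are invented for illustration, and I am assuming they come from a cross-validation set; `scores_at` is a hypothetical helper.

```python
def scores_at(threshold, probs, labels):
    """Return (precision, recall, f1) when predicting 1 if prob >= threshold."""
    preds = [1 if p >= threshold else 0 for p in probs]
    tp = sum(1 for y, q in zip(labels, preds) if y == 1 and q == 1)
    fp = sum(1 for y, q in zip(labels, preds) if y == 0 and q == 1)
    fn = sum(1 for y, q in zip(labels, preds) if y == 1 and q == 0)
    if tp == 0:
        return 0.0, 0.0, 0.0   # assumed convention when nothing is correctly flagged
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall, 2 * precision * recall / (precision + recall)

# Hypothetical h_theta(x) outputs and true labels.
probs = [0.95, 0.85, 0.70, 0.60, 0.40, 0.35, 0.20, 0.10]
labels = [1, 1, 1, 0, 0, 1, 0, 0]

thresholds = [0.3, 0.5, 0.7, 0.9]
f1_scores = {}
for t in thresholds:
    precision, recall, f1 = scores_at(t, probs, labels)
    f1_scores[t] = f1
    print(f"threshold = {t}: precision = {precision:.2f}, recall = {recall:.2f}, F1 = {f1:.2f}")

best = max(f1_scores, key=f1_scores.get)
print("best threshold:", best)  # 0.7 for this toy data
```

On this toy data precision rises from 0.67 to 1.00 as the threshold goes up while recall falls from 1.00 to 0.25, and the F1 score peaks in between.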

Quiz

1. Question 1

You train a learning algorithm, and find that it has unacceptably high error on the test set. You plot the learning curve, and obtain the figure below. Is the algorithm suffering from high bias, high variance, or neither?
[Image]
Answer: C

2. Question 2

Suppose you have implemented regularized logistic regression to classify what object is in an image (i.e., to do object recognition). However, when you test your hypothesis on a new set of images, you find that it makes unacceptably large errors with its predictions on the new images. However, your hypothesis performs well (has low error) on the training set. Which of the following are promising steps to take? Check all that apply.
[Image]
Answer: B

3. Question 3

Suppose you have implemented regularized logistic regression to predict what items customers will purchase on a web shopping site. However, when you test your hypothesis on a new set of customers, you find that it makes unacceptably large errors in its predictions. Furthermore, the hypothesis performs poorly on the training set. Which of the following might be promising steps to take? Check all that apply.
[Image]
Answer: AC

4. Question 4

Which of the following statements are true? Check all that apply.
[Image]
Answer: BC

5. Question 5

Which of the following statements are true? Check all that apply.
[Image]
Answer: ACD

6. Question 6

You are working on a spam classification system using regularized logistic regression. “Spam” is a positive class (y = 1) and “not spam” is the negative class (y = 0). You have trained your classifier and there are m = 1000 examples in the cross-validation set. The chart of predicted class vs. actual class is:
[Image]
For reference:
Accuracy = (true positives + true negatives) / (total examples)
Precision = (true positives) / (true positives + false positives)
Recall = (true positives) / (true positives + false negatives)
F1 score = (2 * precision * recall) / (precision + recall)
What is the classifier’s precision (as a value from 0 to 1)?
Enter your answer in the box below. If necessary, provide at least two values after the decimal point.

Answer: Precision = 85 / (85 + 890) = 0.087
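A quick check of the arithmetic, using only the two counts that appear in the answer (85 true positives and 890 false positives); the variable names are mine.

```python
true_positives = 85
false_positives = 890

# Precision = TP / (TP + FP)
precision = true_positives / (true_positives + false_positives)
print(round(precision, 3))  # 0.087
```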

7. Question 7

Suppose a massive dataset is available for training a learning algorithm. Training on a lot of data is likely to give good performance when two of the following conditions hold true.

Which are the two?
[Image]
Answer: AD
For a large dataset to help: 1. the model must be complex enough to represent complex functions; 2. the data must be informative, with a real pattern to learn.

8. Question 8

Suppose you have trained a logistic regression classifier which is outputting hθ(x). Currently, you predict 1 if hθ(x) ≥ threshold, and predict 0 if hθ(x) < threshold, where currently the threshold is set to 0.5.
Suppose you decrease the threshold to 0.3. Which of the following are true? Check all that apply.
[Image]
Answer: D

9. Question 9

Suppose you are working on a spam classifier, where spam emails are positive examples (y=1) and non-spam emails are negative examples (y=0). You have a training set of emails in which 99% of the emails are non-spam and the other 1% is spam. Which of the following statements are true? Check all that apply.
[Image]
Answer: ABD

10. Question 10

Which of the following statements are true? Check all that apply.
[Image]
Answer: BC
