Andrew Ng Machine Learning, Week 6
Preface
NetEase Cloud Classroom (bilingual subtitles, smooth streaming): https://study.163.com/course/courseMain.htm?courseId=1004570029
Coursera: https://www.coursera.org/learn/machine-learning
I am a beginner: I watch the lectures on NetEase Cloud Classroom first, then do the assignments on Coursera, and keep this blog as a record. All images referenced in these posts are screenshots from the course.
Advice for Applying Machine Learning
1.Deciding What to Try Next
Tips: Randomly split the data into two parts: minimize the cost J(θ) on the training portion to obtain θ, then use that θ to compute the test error Jtest(θ) on the other portion in the same way.
2.Model Selection and Train/Validation/Test Sets
Tips: choosing the degree of hθ(x)
Tips: Randomly split the data into three parts. First minimize J(θ) on the training set; then plug the resulting θ into the cross-validation set (the second part in the figure) to compute the validation error, which decides the polynomial degree; finally, measure generalization ability on the test set.
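The three-way split and degree selection can be sketched numerically. Everything below is hypothetical toy data (a noisy quadratic), not the course's assignment data:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 120)
y = x**2 + 0.1 * rng.standard_normal(120)   # true relation is quadratic

# Randomly shuffle, then split 60% / 20% / 20% into train / CV / test.
idx = rng.permutation(120)
tr, cv, te = idx[:72], idx[72:96], idx[96:]

def mse(theta, xs, ys):
    # Mean squared error of a polynomial hypothesis h_theta(x).
    return np.mean((np.polyval(theta, xs) - ys) ** 2)

# Fit each candidate degree on the training set, score it on the CV set.
cv_err = {d: mse(np.polyfit(x[tr], y[tr], d), x[cv], y[cv])
          for d in range(1, 6)}

best_d = min(cv_err, key=cv_err.get)        # degree chosen on the CV set
theta = np.polyfit(x[tr], y[tr], best_d)
test_err = mse(theta, x[te], y[te])         # generalization estimate
```

The key point is that the degree is chosen on the CV set, so the test error remains an unbiased estimate of generalization.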
3.Diagnosing Bias vs. Variance
Tips: Intuitively, underfitting (a model that is too simple or has too few features) means high bias, while overfitting means high variance.
4.Regularization and Bias/Variance
Tips: The same model-selection procedure as above, with a regularization term added to the cost.
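The effect of the regularization term can be sketched with the regularized normal equation. The data and λ values are hypothetical, and note one simplification: unlike the course convention, this sketch regularizes the bias term too:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(-2, 2, 60)
y = x**2 + 0.1 * rng.standard_normal(60)

# Degree-8 polynomial features: flexible enough to overfit if unregularized.
X = np.vander(x, 9)

def ridge_fit(X, y, lam):
    # Regularized normal equation: theta = (X'X + lam*I)^(-1) X'y.
    # (The course leaves theta_0 unregularized; here every coefficient
    # is regularized for brevity.)
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

theta_small_lam = ridge_fit(X, y, 0.01)    # weak regularization
theta_large_lam = ridge_fit(X, y, 100.0)   # strong regularization
# A larger lambda shrinks the parameters toward zero: higher bias, lower variance.
```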
5.Learning Curves
Tips: When the training set is small, the hypothesis fits it easily, so the training error is low; as the training set grows, the training error increases and then levels off, because hθ(x) can no longer fit every example well.
Tips: When the training set is small, the hypothesis generalizes poorly, so the cross-validation error is high; with more data it falls but, as above, levels off once hθ(x) can do no better. In the high-bias case both errors end up large, and adding more training data barely changes them.
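The high-bias learning curve described above can be reproduced numerically: fit a deliberately too-simple (linear) model to hypothetical quadratic data and track both errors as the training set grows. All data here is synthetic:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, 400)
y = x**2 + 0.1 * rng.standard_normal(400)
x_cv, y_cv = x[300:], y[300:]              # held-out cross-validation set

def mse(theta, xs, ys):
    return np.mean((np.polyval(theta, xs) - ys) ** 2)

train_err, cv_err = [], []
for m in [2, 20, 80, 300]:                 # growing training-set sizes
    theta = np.polyfit(x[:m], y[:m], 1)    # degree 1: too simple => high bias
    train_err.append(mse(theta, x[:m], y[:m]))
    cv_err.append(mse(theta, x_cv, y_cv))

# train_err starts near zero and rises; cv_err falls; both plateau at a
# large value, so adding more data would not help this model.
```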
6.Deciding What to Do Next (Revisited)
Tips: High bias can be understood as underfitting caused by λ being too large or too few features; the fixes are to decrease λ or add features. High variance can be understood as overfitting caused by λ being too small or too many features; the fixes are to increase λ, get more training examples, or remove features.
7.Error Metrics for Skewed Classes
Tips: Two new definitions. Precision is the fraction of all examples predicted 1 that are correct: (truly 1 and predicted 1) / (total predicted 1). Recall is the fraction of all truly-1 examples that are predicted correctly: (truly 1 and predicted 1) / (total truly 1).
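The two definitions can be written out directly; the prediction and label vectors below are made up for illustration:

```python
import numpy as np

def precision_recall(pred, actual):
    # pred, actual: arrays of 0/1 class labels.
    tp = np.sum((pred == 1) & (actual == 1))   # predicted 1 and truly 1
    fp = np.sum((pred == 1) & (actual == 0))   # predicted 1 but truly 0
    fn = np.sum((pred == 0) & (actual == 1))   # predicted 0 but truly 1
    precision = tp / (tp + fp)   # correct share of all predicted positives
    recall = tp / (tp + fn)      # detected share of all actual positives
    return precision, recall

pred   = np.array([1, 1, 1, 1, 0, 0, 0, 0])
actual = np.array([1, 1, 1, 0, 1, 0, 0, 0])
p, r = precision_recall(pred, actual)   # p = 3/4, r = 3/4
```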
8.Trading Off Precision and Recall
Tips: Change how logistic regression turns hθ(x) into a 0/1 prediction. Previously we predicted 1 whenever hθ(x) > 0.5; that 0.5 is the threshold. Raising the threshold makes positive predictions more likely to be correct, i.e., it increases precision (while lowering recall), and vice versa.
Tips: Compare the F1 score above to choose the threshold: the higher, the better.
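Choosing the threshold by comparing F1 scores can be sketched as a simple sweep; the probabilities and labels below are hypothetical:

```python
import numpy as np

# Hypothetical classifier outputs h_theta(x) and true labels.
probs  = np.array([0.95, 0.9, 0.8, 0.7, 0.6, 0.4, 0.35, 0.3, 0.2, 0.1])
labels = np.array([1,    1,   1,   0,   1,   1,   0,    0,   0,   0])

def f1_at(threshold):
    pred = (probs >= threshold).astype(int)
    tp = np.sum((pred == 1) & (labels == 1))
    fp = np.sum((pred == 1) & (labels == 0))
    fn = np.sum((pred == 0) & (labels == 1))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Sweep candidate thresholds and keep the one with the highest F1.
best = max([0.3, 0.5, 0.7], key=f1_at)
```

On this toy data the sweep prefers 0.5: the lower threshold sacrifices too much precision, the higher one too much recall.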
Quiz
1.Question 1
You train a learning algorithm, and find that it has unacceptably high error on the test set. You plot the learning curve, and obtain the figure below. Is the algorithm suffering from high bias, high variance, or neither?
Answer: C
2.Question 2
Suppose you have implemented regularized logistic regression to classify what object is in an image (i.e., to do object recognition). However, when you test your hypothesis on a new set of images, you find that it makes unacceptably large errors with its predictions on the new images. However, your hypothesis performs well (has low error) on the training set. Which of the following are promising steps to take? Check all that apply.
Answer: B
3.Question 3
Suppose you have implemented regularized logistic regression to predict what items customers will purchase on a web shopping site. However, when you test your hypothesis on a new set of customers, you find that it makes unacceptably large errors in its predictions. Furthermore, the hypothesis performs poorly on the training set. Which of the following might be promising steps to take? Check all that apply.
Answer: AC
4.Question 4
Which of the following statements are true? Check all that apply.
Answer: BC
5.Question 5
Which of the following statements are true? Check all that apply.
Answer: ACD
6.Question 6
You are working on a spam classification system using regularized logistic regression. “Spam” is a positive class (y = 1) and “not spam” is the negative class (y = 0). You have trained your classifier and there are m = 1000 examples in the cross-validation set. The chart of predicted class vs. actual class is:
For reference:
Accuracy = (true positives + true negatives) / (total examples)
Precision = (true positives) / (true positives + false positives)
Recall = (true positives) / (true positives + false negatives)
F1 score = (2 * precision * recall) / (precision + recall)
What is the classifier’s precision (as a value from 0 to 1)?
Enter your answer in the box below. If necessary, provide at least two values after the decimal point.
Answer: Precision = 85/(85 + 890) ≈ 0.087
7.Question 7
Suppose a massive dataset is available for training a learning algorithm. Training on a lot of data is likely to give good performance when two of the following conditions hold true.
Which are the two?
Answer: AD
To benefit from a massive dataset: (1) the model is complex enough to represent complicated functions; (2) the data is informative, with underlying regularities to learn.
8.Question 8
Suppose you have trained a logistic regression classifier which is outputting hθ(x). Currently, you predict 1 if hθ(x) ≥ threshold, and predict 0 if hθ(x) < threshold, where currently the threshold is set to 0.5.
Suppose you decrease the threshold to 0.3. Which of the following are true? Check all that apply.
Answer: D
9.Question 9
Suppose you are working on a spam classifier, where spam emails are positive examples (y=1) and non-spam emails are negative examples (y=0). You have a training set of emails in which 99% of the emails are non-spam and the other 1% is spam. Which of the following statements are true? Check all that apply.
Answer: ABD
10.Question 10
Which of the following statements are true? Check all that apply.
Answer: BC