【Machine Learning】【Andrew Ng】- Quiz1(Week 9)

1、For which of the following problems would anomaly detection be a suitable
algorithm?
A. In a computer chip fabrication plant, identify microchips that might be defective.
B. From a large set of hospital patient records, predict which patients have a particular disease (say, the flu).
C. Given data from credit card transactions, classify each transaction according to type of purchase (for example: food, transportation, clothing).
D. From a large set of primary care patient records, identify individuals who might have unusual health conditions.
Answer: AD. Anomaly detection is mainly suited to cases where negative examples (normal, y=0) far outnumber positive examples (anomalous, y=1).

2、Suppose you have trained an anomaly detection system that flags anomalies when p(x) is less than ε , and you find on the cross-validation set that it has too many false positives (flagging too many things as anomalies). What should you do?
A. Decrease ε
B. Increase ε
Answer: A. Too many normal examples are being flagged as anomalous, so the threshold ε is too large and should be decreased.
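The effect of ε can be checked with a small sketch. The p(x) values below are made up for illustration, and `flag_anomalies` is a hypothetical helper, not something from the course materials.

```python
def flag_anomalies(densities, epsilon):
    """Return the indices of examples whose density p(x) is below epsilon."""
    return [i for i, p in enumerate(densities) if p < epsilon]


# Made-up p(x) values for six cross-validation examples (illustrative only).
p_values = [0.30, 0.12, 0.05, 0.02, 0.008, 0.001]

flagged_large_eps = flag_anomalies(p_values, epsilon=0.1)
flagged_small_eps = flag_anomalies(p_values, epsilon=0.01)

# A smaller epsilon flags fewer examples, which is how false positives go down.
print(len(flagged_large_eps), len(flagged_small_eps))
```

Lowering ε can only shrink the flagged set, so it trades false positives for possible missed anomalies.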

3、Suppose you are developing an anomaly detection system to catch manufacturing defects in airplane engines. Your model uses
[Image: formula for p(x)]
You have two features x1 = vibration intensity, and x2 = heat generated.
Both x1 and x2 take on values between 0 and 1 (and are strictly greater than 0), and for most "normal" engines you expect that x1≈x2. One of the suspected anomalies is that a flawed engine may vibrate very intensely even without generating much heat (large x1, small x2), even though the particular values of x1 and x2 may not fall outside their typical ranges of values. What additional feature x3 should you create to capture these types of anomalies:
A. x3 = x1/x2
B. x3 = x1+x2
C. x3 = x1^2 × x2
D. x3 = x1 × x2
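Answer: A. x1 and x2 both lie between 0 and 1, and in these anomalies x1 is much larger than x2, so the ratio x1/x2 separates anomalous examples from normal ones well.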
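A quick numeric check of why the ratio works. The engine readings below are invented for illustration; they are not from the quiz.

```python
# Made-up (x1, x2) readings: vibration intensity and heat generated.
normal_engines = [(0.50, 0.52), (0.80, 0.78), (0.20, 0.21)]
flawed_engine = (0.90, 0.10)  # vibrates intensely, generates little heat

# Candidate feature A: x3 = x1 / x2 (x2 is strictly greater than 0).
normal_ratios = [x1 / x2 for x1, x2 in normal_engines]
flawed_ratio = flawed_engine[0] / flawed_engine[1]

# Candidate feature B: x3 = x1 + x2, shown for comparison.
normal_sums = [x1 + x2 for x1, x2 in normal_engines]
flawed_sum = flawed_engine[0] + flawed_engine[1]

print("ratios:", normal_ratios, flawed_ratio)
print("sums:  ", normal_sums, flawed_sum)
```

For the normal engines the ratio stays close to 1, while the flawed engine's ratio is far larger. Its sum, by contrast, falls inside the range of sums seen for normal engines, so x1 + x2 would not expose it.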

4、Which of the following are true? Check all that apply
A. If you are developing an anomaly detection system, there is no way to make use of labeled data to improve your system.
B. If you have a large labeled training set with many positive examples and many negative examples, the anomaly detection algorithm will likely perform just as well as a supervised learning algorithm such as an SVM.
C. If you do not have any labeled data (or if all your data has label y=0), then it is still possible to learn p(x), but it may be harder to evaluate the system or choose a good value of ϵ.
D. When choosing features for an anomaly detection system, it is a good idea to look for features that take on unusually large or small values for (mainly the) anomalous examples.
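Answer: CD
A: Incorrect. Anomaly detection systems do make use of labeled data (for example, to evaluate the system and choose ϵ on a cross-validation set), so labeled data can improve the system.
B: Incorrect. With many positive and many negative labeled examples, a supervised learning algorithm will usually perform better.
C: Correct. p(x) can still be learned without labels, but the result is harder to evaluate and may not be as good.
D: Correct.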
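As an illustration of how labeled data can help, here is a minimal sketch that picks ϵ by maximizing the F1 score on a labeled cross-validation set. This is one common approach, assumed here rather than taken from the post; the data and the helper names `f1_score` and `choose_epsilon` are hypothetical.

```python
def f1_score(labels, predictions):
    """F1 score where label 1 means anomalous and 0 means normal."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def choose_epsilon(p_cv, y_cv, candidates):
    """Return the candidate epsilon with the highest F1 on the CV set."""
    best_eps, best_f1 = None, -1.0
    for eps in candidates:
        predictions = [1 if p < eps else 0 for p in p_cv]
        score = f1_score(y_cv, predictions)
        if score > best_f1:
            best_eps, best_f1 = eps, score
    return best_eps, best_f1


# Made-up densities p(x) and labels for a tiny cross-validation set.
p_cv = [0.40, 0.25, 0.18, 0.09, 0.03, 0.004, 0.002]
y_cv = [0, 0, 0, 0, 0, 1, 1]

best_eps, best_f1 = choose_epsilon(p_cv, y_cv, [0.001, 0.01, 0.05, 0.1, 0.2])
print(best_eps, best_f1)
```

Without any y=1 examples in the cross-validation set, every candidate would score 0 here, which is the difficulty option C describes.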

5、You have a 1-D dataset {x(1),…,x(m)} and you want to detect outliers in the dataset. You first plot the dataset and it looks like this:
[Image: plot of the dataset]
Suppose you fit the Gaussian distribution parameters mu1 and sigma1^2 to this dataset. Which of the following values for mu1 and sigma1^2 might you get?
A. mu1 =-3, sigma1^2 = 4
B. mu1 =-6, sigma1^2 = 4
C. mu1 =-3, sigma1^2 = 2
D. mu1 =-6, sigma1^2 = 2
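Answer: A
Read mu off the axis of symmetry of the plot. Roughly 0.7 (about 68%) of the probability mass lies between mu - sigma and mu + sigma, so judging by eye, sigma = 2 (that is, sigma^2 = 4) fits best here.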
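The same reasoning can be checked numerically. The sketch below draws synthetic data from a Gaussian with mean -3 and standard deviation 2 (an assumption standing in for the plotted dataset, which is not available here), fits mu and sigma^2 with the usual 1/m estimates, and measures the fraction of points within one sigma of the mean. `fit_gaussian` is a hypothetical helper name.

```python
import random


def fit_gaussian(xs):
    """Estimate mu and sigma^2 with the 1/m maximum-likelihood formulas."""
    m = len(xs)
    mu = sum(xs) / m
    sigma_sq = sum((x - mu) ** 2 for x in xs) / m
    return mu, sigma_sq


# Synthetic stand-in for the plotted data: mean -3, standard deviation 2.
rng = random.Random(0)
data = [rng.gauss(-3.0, 2.0) for _ in range(10000)]

mu1, sigma1_sq = fit_gaussian(data)
sigma1 = sigma1_sq ** 0.5

# Fraction of points inside [mu - sigma, mu + sigma].
within_one_sigma = sum(
    1 for x in data if mu1 - sigma1 <= x <= mu1 + sigma1
) / len(data)

print(mu1, sigma1_sq, within_one_sigma)
```

The fitted values should land near mu1 = -3 and sigma1^2 = 4, with a bit over two thirds of the points inside one sigma of the mean.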
