ITEC3040 Data Analytics

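This assignment for the ITEC3040 data analytics course asks you to modify the basic decision tree algorithm so that it takes the count of each data tuple into account, and to build a decision tree with the modified algorithm. It also asks you to construct a Naïve Bayes classifier from the given data and use it for classification, and to classify a specific data point by the 3-nearest-neighbor rule under the Manhattan, Euclidean, and Supremum distance measures.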

ITEC3040 Assignment 2 York University
ITEC3040 Introduction to Data Analytics
Assignment 2
Due: August 2, 2023, 11:55 pm
Submission Instructions:
• This is an individual assignment.
• Use eClass to submit your work.
• At the top of each file, include your name and student number.
• You may use software (for example, R, SAS, MATLAB and Python). No Excel allowed.
1. Show ALL your work!!!
2. Submit ALL your program(s) along with your solutions (including comments, results and graphs).
• Evaluation is based on the work you submitted.
1. Textbook, page 387, Exercise 8.7
(a) How would you modify the basic decision tree algorithm to take into consideration the count
of each generalized data tuple (i.e., of each row entry)?
(b) Use your algorithm to construct a decision tree from the given data.
(c) Given a data tuple having the values “systems”, “26...30”, and “46–50K” for the attributes
department, age, and salary, respectively, what would the decision tree classification of the status
for the tuple be?
(d) Construct the Naïve Bayesian classifier and redo part (c).
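For part (a), one common reading is to treat each generalized tuple's count as a weight, so that class frequencies are sums of counts rather than numbers of rows. The Python sketch below shows count-weighted entropy and information gain under that assumption. The function names `weighted_entropy` and `weighted_gain` and the `TOY` rows are hypothetical and used only for illustration; they are not the textbook's table, which is not reproduced here.

```python
from collections import defaultdict
import math

def weighted_entropy(rows, target):
    """Entropy of `target`, weighting every row by its 'count' field."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[target]] += row["count"]
    n = sum(totals.values())
    return -sum((c / n) * math.log2(c / n) for c in totals.values() if c)

def weighted_gain(rows, attribute, target):
    """Information gain of splitting on `attribute`, using counts as weights."""
    n = sum(row["count"] for row in rows)
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[attribute]].append(row)
    remainder = sum(
        sum(r["count"] for r in part) / n * weighted_entropy(part, target)
        for part in partitions.values()
    )
    return weighted_entropy(rows, target) - remainder

# Hypothetical generalized tuples (NOT the textbook data).
TOY = [
    {"department": "sales", "status": "senior", "count": 30},
    {"department": "sales", "status": "junior", "count": 10},
    {"department": "systems", "status": "junior", "count": 40},
]

print(weighted_gain(TOY, "department", "status"))
```

The rest of the tree-building procedure (choose the attribute with the highest gain, partition, recurse) stays unchanged; only the frequency estimates use the counts. The same weighting idea carries over to the prior and conditional probability estimates in part (d).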
2. Suppose you are given the following data set, in which attributes A through C predict the
Class attribute.
A    B    C   Class
30   35   6   YES
22   50   4   NO
34   200  2   NO
59   170  7   YES
25   40   2   YES
63   150  3   NO
77   105  8   YES
34   200  2   NO
59   170  7   YES
12   207  9   YES
55   181  5   NO
Using Manhattan distance, Euclidean distance and Supremum distance, classify the data point
(A = 37, B = 95, C = 3) according to its 3-nearest neighbors.
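The three distance measures and the 3-nearest-neighbor vote can be sketched in Python as follows. This is a minimal illustration, not a prescribed solution: the names `nearest` and `classify` are hypothetical, the attributes are used unscaled exactly as given in the table, and ties in distance are broken by table order (an assumption that follows from Python's stable sort, not something the assignment specifies).

```python
from collections import Counter
import math

# Training rows (A, B, C, Class) copied from the table in Question 2.
ROWS = [
    (30, 35, 6, "YES"), (22, 50, 4, "NO"), (34, 200, 2, "NO"),
    (59, 170, 7, "YES"), (25, 40, 2, "YES"), (63, 150, 3, "NO"),
    (77, 105, 8, "YES"), (34, 200, 2, "NO"), (59, 170, 7, "YES"),
    (12, 207, 9, "YES"), (55, 181, 5, "NO"),
]

def manhattan(p, q):
    """Sum of absolute coordinate differences (L1)."""
    return sum(abs(a - b) for a, b in zip(p, q))

def euclidean(p, q):
    """Square root of the sum of squared differences (L2)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def supremum(p, q):
    """Largest absolute coordinate difference (L-infinity)."""
    return max(abs(a - b) for a, b in zip(p, q))

def nearest(query, rows, metric, k=3):
    """Return the k (distance, label) pairs closest to `query`."""
    scored = [(metric(query, row[:3]), row[3]) for row in rows]
    # sorted() is stable, so equal distances keep their table order.
    return sorted(scored, key=lambda pair: pair[0])[:k]

def classify(query, rows, metric, k=3):
    """Majority vote among the k nearest neighbors."""
    votes = Counter(label for _, label in nearest(query, rows, metric, k))
    return votes.most_common(1)[0][0]

QUERY = (37, 95, 3)
for metric in (manhattan, euclidean, supremum):
    print(metric.__name__, nearest(QUERY, ROWS, metric), classify(QUERY, ROWS, metric))
```

Under the Supremum distance, the third and fourth nearest rows are equally distant from the query and carry different class labels, so the result for that measure depends on the tie-breaking rule; a written solution should state which rule it uses. Because B has a much larger range than A and C, it also dominates all three distances unless the attributes are normalized first.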