COMP3032 – Machine Learning

Assignment One (20 marks)

Due date: 11:59 pm Monday, 16 September 2024

Main objective:

• This assignment is to apply supervised and unsupervised machine learning techniques.

• You will have an opportunity to employ various regression models for prediction and classification, to use cross-validation for model selection, and to perform PCA for dimensionality reduction.

Task 1: Systolic Pressure Prediction using different models (12 marks):

1. Dataset Description: The blood pressure dataset pressure.csv contains examples of systolic pressures along with various features from different individuals.

2. Polynomial Regression:

(a) Create polynomial regression models using the whole dataset to predict systolic pressure from the "WEIGHT" feature, for polynomial degrees ranging from 1 to 14.

(b) Perform 10-fold cross-validation.

(c) Compute and display the mean RMSEs of the 10-fold cross-validation for each of the 14 polynomial degrees.

(d) Produce a cross-validation error plot showing the mean RMSE for polynomial degrees from 1 to 14.
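
As a rough illustration of item 2, the sketch below runs 10-fold cross-validation for degrees 1 to 14 with scikit-learn. It is not a model answer: the data are synthetic stand-ins for pressure.csv, and the shuffled folds, the fixed random seed and the StandardScaler step (added only to keep high-degree powers numerically stable) are assumptions rather than requirements.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic stand-in for pressure.csv (assumption: the real file is not used here).
rng = np.random.default_rng(0)
weight = rng.uniform(50, 110, size=200)                    # hypothetical WEIGHT values
systolic = 90 + 0.5 * weight + rng.normal(0, 8, size=200)  # hypothetical target

X = weight.reshape(-1, 1)  # scikit-learn expects a 2-D feature array
y = systolic

# Assumption: shuffled folds with a fixed seed so the run is repeatable.
cv = KFold(n_splits=10, shuffle=True, random_state=0)

mean_rmse = {}
for degree in range(1, 15):
    model = make_pipeline(
        StandardScaler(),  # assumption: scaling for numerical stability only
        PolynomialFeatures(degree=degree, include_bias=False),
        LinearRegression(),
    )
    # The scorer returns negative RMSE, so the sign is flipped below.
    scores = cross_val_score(
        model, X, y, cv=cv, scoring="neg_root_mean_squared_error"
    )
    mean_rmse[degree] = -scores.mean()

for degree, rmse in mean_rmse.items():
    print(f"degree {degree:2d}: mean RMSE = {rmse:.3f}")
```

The mean_rmse dictionary maps each degree to its mean RMSE, so its keys and values can be passed to matplotlib's plt.plot to draw the error curve asked for in (d).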

3. Model selection:

(a) Select the best polynomial degree and briefly explain your choice.

(b) Print the intercept and coefficients of the selected model.
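
One possible way to approach item 3 is sketched below: pick the degree with the lowest mean cross-validated RMSE, refit it on the whole dataset, and read the intercept and coefficients from the fitted LinearRegression step. The data are again synthetic, make_poly_model is a hypothetical helper, and "lowest mean RMSE" is only one defensible selection rule.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler


def make_poly_model(degree):
    """Hypothetical helper: scale, expand to polynomial terms, fit least squares."""
    return make_pipeline(
        StandardScaler(),
        PolynomialFeatures(degree=degree, include_bias=False),
        LinearRegression(),
    )


# Synthetic stand-in for the WEIGHT column and the systolic pressure target.
rng = np.random.default_rng(1)
X = rng.uniform(50, 110, size=(200, 1))
y = 90 + 0.5 * X[:, 0] + rng.normal(0, 8, size=200)

cv = KFold(n_splits=10, shuffle=True, random_state=1)
mean_rmse = {
    d: -cross_val_score(
        make_poly_model(d), X, y, cv=cv, scoring="neg_root_mean_squared_error"
    ).mean()
    for d in range(1, 15)
}

# Selection rule used here (an assumption): smallest mean cross-validated RMSE.
best_degree = min(mean_rmse, key=mean_rmse.get)

# Refit the chosen degree on the whole dataset and inspect the linear step.
best_model = make_poly_model(best_degree).fit(X, y)
linreg = best_model.named_steps["linearregression"]
print("Selected degree:", best_degree)
print("Intercept:", linreg.intercept_)
print("Coefficients:", linreg.coef_)
```

Because of the scaling step, the printed coefficients refer to powers of the standardised weight, not the raw weight. When several degrees give nearly the same mean RMSE, preferring the lower degree is a common argument worth stating in the explanation.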

4. Multiple linear regression:

(a) Create a multiple linear regression model to predict systolic pressure using all the other relevant features in the dataset.

(b) Print the intercept and coefficients of the model.

(c) Perform 10-fold cross-validation.

(d) Compute and display the mean RMSE for the 10-fold cross-validation.
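
The steps in item 4 can be sketched as follows. This is a minimal example under stated assumptions: the feature matrix is synthetic, and the column names AGE and HEIGHT are hypothetical (only WEIGHT is named in this handout), so the real feature list must come from inspecting pressure.csv.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in data; AGE and HEIGHT are hypothetical column names.
rng = np.random.default_rng(2)
n = 200
feature_names = ["AGE", "WEIGHT", "HEIGHT"]
X = np.column_stack([
    rng.uniform(20, 70, n),    # AGE
    rng.uniform(50, 110, n),   # WEIGHT
    rng.uniform(150, 195, n),  # HEIGHT
])
y = 60 + 0.4 * X[:, 0] + 0.3 * X[:, 1] + 0.1 * X[:, 2] + rng.normal(0, 6, n)

# Fit on the whole dataset to report the intercept and coefficients.
model = LinearRegression().fit(X, y)
print("Intercept:", model.intercept_)
for name, coef in zip(feature_names, model.coef_):
    print(f"{name}: {coef:.4f}")

# 10-fold cross-validation; shuffling and the seed are assumptions.
cv = KFold(n_splits=10, shuffle=True, random_state=2)
scores = cross_val_score(
    LinearRegression(), X, y, cv=cv, scoring="neg_root_mean_squared_error"
)
mean_rmse = -scores.mean()
print(f"Mean 10-fold RMSE: {mean_rmse:.3f}")
```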

5. Ridge regression:

(a) Build a ridge regression model on the same features as the multiple linear regression model created in item 4, with regularization parameter α = 0.1.

(b) Print the intercept and coefficients of the model.

(c) Perform 10-fold cross-validation.

(d) Compute and display the mean RMSE for the 10-fold cross-validation.
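
For item 5, a sketch along the same lines uses scikit-learn's Ridge with alpha=0.1. The data and the AGE and HEIGHT column names are hypothetical, as before. An ordinary least-squares fit is included only to show how little the coefficients change at such a small α.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in data; AGE and HEIGHT are hypothetical column names.
rng = np.random.default_rng(3)
n = 200
feature_names = ["AGE", "WEIGHT", "HEIGHT"]
X = np.column_stack([
    rng.uniform(20, 70, n),
    rng.uniform(50, 110, n),
    rng.uniform(150, 195, n),
])
y = 60 + 0.4 * X[:, 0] + 0.3 * X[:, 1] + 0.1 * X[:, 2] + rng.normal(0, 6, n)

# Ridge adds an L2 penalty on the coefficients; alpha is the penalty strength.
ridge = Ridge(alpha=0.1).fit(X, y)
ols = LinearRegression().fit(X, y)  # for comparison only

print("Ridge intercept:", ridge.intercept_)
for name, r_coef, o_coef in zip(feature_names, ridge.coef_, ols.coef_):
    print(f"{name}: ridge = {r_coef:.6f}, least squares = {o_coef:.6f}")

# 10-fold cross-validation; shuffling and the seed are assumptions.
cv = KFold(n_splits=10, shuffle=True, random_state=3)
scores = cross_val_score(
    Ridge(alpha=0.1), X, y, cv=cv, scoring="neg_root_mean_squared_error"
)
mean_rmse = -scores.mean()
print(f"Mean 10-fold RMSE (ridge): {mean_rmse:.3f}")
```

The ridge penalty depends on the scale of each feature. Whether to standardise the features first is a design choice the handout does not specify, so this sketch leaves them unscaled.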

6. Model comparison:

(a) Select the best model among the three (polynomial regression, multiple linear regression, and ridge regression).

(b) Briefly explain the reasons behind your choice.
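
A fair comparison in item 6 scores all three candidates on identical folds, which the sketch below does by reusing one KFold object. Everything specific here is an assumption: the synthetic data, the hypothetical AGE and HEIGHT columns, the placeholder polynomial degree of 2, and the cv_rmse helper.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic stand-in data; columns are AGE, WEIGHT, HEIGHT (AGE/HEIGHT hypothetical).
rng = np.random.default_rng(4)
n = 200
X = np.column_stack([
    rng.uniform(20, 70, n),
    rng.uniform(50, 110, n),
    rng.uniform(150, 195, n),
])
y = 60 + 0.4 * X[:, 0] + 0.3 * X[:, 1] + 0.1 * X[:, 2] + rng.normal(0, 6, n)
X_weight = X[:, [1]]  # WEIGHT only, kept 2-D for scikit-learn

# One splitter with a fixed seed gives every model the same 10 folds.
cv = KFold(n_splits=10, shuffle=True, random_state=4)


def cv_rmse(model, features, target):
    """Hypothetical helper: mean RMSE over the shared 10 folds."""
    scores = cross_val_score(
        model, features, target, cv=cv, scoring="neg_root_mean_squared_error"
    )
    return -scores.mean()


# Degree 2 is a placeholder for whichever degree item 3 selects.
poly = make_pipeline(
    StandardScaler(),
    PolynomialFeatures(degree=2, include_bias=False),
    LinearRegression(),
)

results = {
    "polynomial (WEIGHT only)": cv_rmse(poly, X_weight, y),
    "multiple linear": cv_rmse(LinearRegression(), X, y),
    "ridge (alpha=0.1)": cv_rmse(Ridge(alpha=0.1), X, y),
}

for name, rmse in sorted(results.items(), key=lambda item: item[1]):
    print(f"{name}: mean RMSE = {rmse:.3f}")
best = min(results, key=results.get)
print("Lowest mean RMSE:", best)
```

The lowest mean RMSE is a starting point for the explanation in (b), not the whole argument. Model simplicity and how close the scores are also matter.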

Task 2: MNIST Digit Classification using PCA and Logistic Regression (8 marks):

1. Load the renowned MNIST ('mnist_784') dataset, which consists of a large collection of handwritten digit images. Your task is to reduce the number of features first, and then build a binary classification model to distinguish between the digit "7" and all other digits (not "7").

2. Perform Principal Component Analysis (PCA) on the feature data to reduce its dimensionality while retaining 90% of the overall explained variance.

3. Split the data into training and testing sets, using a common split ratio of 80% for training and 20% for testing.

4. Create a Logistic Regression model using the reduced feature dataset.

5. Use this model to predict the labels for both the training and testing datasets.

6. Print the number of principal components preserved. Print the prediction accuracy (proportion of correct predictions) of your model on the training set. Also, print the prediction accuracy, the confusion matrix, and the misclassified digits (i.e. wrong predictions) of your model on the testing set.

7. Evaluate the model: What do you think of the model generated (good, underfitting, overfitting)? Briefly explain your reasoning.
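
The workflow in items 1 to 6 can be sketched as below. To run without a download, this example substitutes scikit-learn's small bundled 8x8 digits set for MNIST, so the numbers it prints say nothing about the real task. For the assignment data, fetch_openml("mnist_784") would replace load_digits. The stratified split, the random seeds and max_iter=1000 are assumptions.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Stand-in data (assumption): 8x8 digits bundled with scikit-learn, not MNIST.
digits = load_digits()
X = digits.data / 16.0              # pixel values in this set range from 0 to 16
digit_labels = digits.target        # original labels 0..9
y = (digit_labels == 7).astype(int)  # 1 for "7", 0 for every other digit

# A float n_components keeps the fewest components reaching that variance share.
pca = PCA(n_components=0.90, svd_solver="full")
X_reduced = pca.fit_transform(X)
print("Principal components kept:", pca.n_components_)

# 80/20 split; the original digit labels are split alongside for later reporting.
X_train, X_test, y_train, y_test, _, digits_test = train_test_split(
    X_reduced, y, digit_labels, test_size=0.20, random_state=0, stratify=y
)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
train_pred = clf.predict(X_train)
test_pred = clf.predict(X_test)

train_acc = accuracy_score(y_train, train_pred)
test_acc = accuracy_score(y_test, test_pred)
cm = confusion_matrix(y_test, test_pred)
print(f"Training accuracy: {train_acc:.4f}")
print(f"Testing accuracy:  {test_acc:.4f}")
print("Confusion matrix (rows = true, columns = predicted):")
print(cm)

# Indices of wrong test predictions and the true digits behind them.
wrong = np.flatnonzero(test_pred != y_test)
print("True digits of misclassified test samples:", digits_test[wrong])
```

Fitting PCA on the full feature data before splitting follows the order of items 2 and 3. Fitting it on the training portion only is a common alternative that keeps test information out of the projection. Comparing the training and testing accuracies is the usual starting point for the judgement asked for in item 7.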

Documentation:

1. You should write a readme file which contains:

(a) your name and student ID

(b) instructions on how to run your code

(c) test runs and their outputs (You can include screenshots)

(d) descriptions of your findings in Task 1 (items 3 and 6) and Task 2 (item 7)

(e) any limitations or issues if your program does not output the expected results

2. Your code should include necessary comments to clearly explain what each part of the code does and how it works.

Submission:

ALL relevant files (including the readme file, Python program and data) should be zipped into a single file named StudentID.zip and submitted via vUWS. Be prepared to demonstrate your program if requested. Please note:

1. It is students’ responsibility to ensure that they can successfully upload their submissions before the deadline.

2. It is students’ responsibility to ensure that their programs are runnable on the school’s lab machines.

3. It is students’ responsibility to ensure that they keep a copy of their submission.
