DATA2001 Sem 2 2024 - Assignment 1 (Weight: 20%)

Due: 02/Sep/2024 3pm

The aim of this assignment is to gain practical experience in analysing structured data. You must complete this in Python using a Jupyter notebook. You will need to submit a single Jupyter notebook (.ipynb file) via Blackboard.

Dataset:

The dataset for this assignment is provided on Blackboard. It contains the results of a chemical analysis of different wines. These wines are grown in the same region of Italy but by 3 different cultivators. The analysis determined the quantities of 13 components found in each wine sample. The dataset has 178 samples and 14 attributes:

1. Wine (the 3 cultivators of wine are represented by the integers 1 to 3)

2. Alcohol

3. Malic acid

4. Ash

5. Alcalinity of ash

6. Magnesium

7. Total phenols

8. Flavanoids

9. Nonflavanoid phenols

10. Proanthocyanins

11. Color intensity

12. Hue

13. OD280/OD315 of diluted wines

14. Proline

More information on the dataset is available from: Wine - UCI Machine Learning Repository. Note: Other versions of this dataset that can be found online should not be used for this assignment.
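A quick way to check a loaded file against the description above (14 attributes, cultivator labels 1 to 3) is sketched below. This is only an illustration under assumptions: the file is assumed to be a comma-separated file without a header row, the file name `wine.csv` mentioned in the comment is hypothetical, and the two data rows are made up so that the example runs without the Blackboard file.

```python
import io
import pandas as pd

# Column names taken from the attribute list above.
COLUMNS = [
    "Wine", "Alcohol", "Malic acid", "Ash", "Alcalinity of ash", "Magnesium",
    "Total phenols", "Flavanoids", "Nonflavanoid phenols", "Proanthocyanins",
    "Color intensity", "Hue", "OD280/OD315 of diluted wines", "Proline",
]

# Made-up rows for illustration only; the second row has an empty Ash field.
sample_csv = io.StringIO(
    "1,14.2,1.7,2.4,15.6,127,2.8,3.1,0.28,2.3,5.6,1.04,3.9,1065\n"
    "2,12.4,1.1,,16.8,88,2.0,1.8,0.30,1.6,3.3,1.05,2.8,520\n"
)

# Assumption: the file has no header row. Replace sample_csv with the path
# to the file from Blackboard (e.g. "wine.csv", a hypothetical name).
df = pd.read_csv(sample_csv, header=None, names=COLUMNS)

missing_per_column = df.isna().sum()       # column-wise count of missing values
missing_per_row = df.isna().sum(axis=1)    # row-wise count of missing values

print(df.shape)
print(missing_per_column[missing_per_column > 0])
print(missing_per_row[missing_per_row > 0])
```

With the real file, the shape should match the 178 samples and 14 attributes stated above; any difference is worth investigating before the analysis starts.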

The submitted notebook should address 6 tasks (see marking grid for mark allocation):

1. Data Preparation: Read the dataset using the “pandas” library. Can you identify the missing data both row-wise and column-wise in the dataset? Handle any data quality issues you find in an appropriate way. Explain how you did it, along with the reasons for your choices.

2. Exploratory Data Analysis (EDA): Perform a detailed univariate and bivariate EDA on the columns in the dataset. Produce plots and report your observations for each plot clearly. If the dataset has too many attributes to cover in full, you can focus the EDA and the reporting on the most important attributes.

3. Find the mean and standard deviation for each type of component for each cultivator of wine and report your findings in a table. Comment on apparent differences between the cultivators of wine (i.e., vignerons).

4. Find the correlation among the numerical columns for each cultivator. Produce visualisations for the correlations and explain the observed results.

5. Perform k-means clustering on the data. Comment on the number of clusters chosen, on possible limitations, and on any form of uncertainty about the results. Are the results in agreement with what you observed in the EDA?

6. Perform principal component analysis on the data. Comment on the results and plot the percentage of variance explained by each principal component. Also plot the principal components that you think are of interest, and report your observations and limitations.
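Tasks 3 and 4 both come down to grouping the rows by cultivator. The sketch below shows one possible way to do this with pandas. It runs on synthetic stand-in data, not the assignment dataset, and the choice of three example columns is only for brevity.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data (NOT the assignment dataset): 3 groups, 3 numeric columns.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Wine": np.repeat([1, 2, 3], 20),
    "Alcohol": rng.normal(13.0, 0.8, 60),
    "Hue": rng.normal(1.0, 0.2, 60),
    "Proline": rng.normal(750.0, 300.0, 60),
})

# Task 3: mean and standard deviation of every component, per cultivator.
summary = df.groupby("Wine").agg(["mean", "std"])

# Task 4: one correlation matrix per cultivator (Pearson by default).
correlations = {
    wine: group.drop(columns="Wine").corr()
    for wine, group in df.groupby("Wine")
}

print(summary.round(2))
print(correlations[1].round(2))
```

Each matrix in `correlations` can then be passed to a plotting function such as `seaborn.heatmap` to produce the visualisation that task 4 asks for. Keeping one matrix per cultivator, rather than a single matrix for the whole dataset, makes it easier to see whether a relationship holds within each group or only appears when the groups are pooled.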

Note: The submitted Jupyter notebook should be commented properly and written in a way that makes it easy for the reader to understand. For marking purposes, your code may be rerun to verify the results.
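Because the code may be rerun, steps that involve randomness (such as the k-means initialisation in task 5) are easier to verify when the random seed is fixed. The sketch below illustrates this for tasks 5 and 6. It assumes scikit-learn, which the brief does not name, and it uses synthetic stand-in data rather than the assignment dataset. The value `n_clusters=3` is only an example, not a recommended answer.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in features (NOT the assignment dataset): 3 shifted blobs, 4 features.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(loc=c, scale=1.0, size=(30, 4)) for c in (0.0, 4.0, 8.0)])

# Standardise first: both k-means and PCA are sensitive to feature scale.
X_scaled = StandardScaler().fit_transform(X)

# Fixed random_state and explicit n_init so that a rerun gives the same labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_scaled)

# PCA with all components kept, to inspect the variance explained by each one.
pca = PCA()
scores = pca.fit_transform(X_scaled)
explained_pct = pca.explained_variance_ratio_ * 100  # percent of variance per component

print(np.bincount(labels))      # cluster sizes
print(explained_pct.round(1))   # values for the explained-variance plot
```

The array `explained_pct` holds the values needed for the explained-variance plot in task 6, and the first columns of `scores` are the projections onto the leading principal components.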
