1 Objectives
The goal of this assignment is to help you tie together all the concepts you have learnt in the first half of this course
in the lectures and assignments. To aid you in completing this assignment, you should review the major aspects of
the course that have been explored so far, such as:
• Data understanding, cleansing, and pre-processing,
• Machine learning concepts,
• CRISP-DM and pipelines in general,
• Feature manipulation, including feature selection, feature construction and imputation,
• Statistical design and analysis of results.
These topics are (to be) covered in Weeks 1–7. Research into online resources for AI is encouraged, where the
rabbit-hole will provide useful jumping-off points for further exploration.
2 Question Description
Many of us use a music streaming app to listen to music. These apps usually build personalized playlists to cater
to each user’s needs. But what is the logic behind a personalized playlist? One general approach is a Music Genre
Classification System: a machine learning model that classifies music samples into different genres using various
audio features.
The overall aim of this assignment is to develop the best possible machine learning system to predict the genres of
music. The task is to classify each music track into one of ten genres based on the provided audio features. The
data for each track includes textual features (e.g. artist and track names), numerical descriptors (e.g. duration),
and various audio features. The hope is that your model will identify the relationship between music genres and the
audio features.
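To make the task concrete, the following is a minimal baseline sketch, not a recommended solution. It assumes two hypothetical numerical features (duration in seconds and tempo in BPM), two hypothetical genre labels, and a handful of made-up rows; none of these come from the provided dataset. It fills missing values with the training-set median and then assigns each track to the genre whose mean feature vector (centroid) is nearest. Feature scaling is deliberately left out to keep the sketch short, so features with larger ranges dominate the distance.

```python
from statistics import median

# Hypothetical toy rows: (duration_s, tempo_bpm) -> genre.
# None marks a missing value. These values are invented for illustration.
train = [
    ((210.0, 120.0), "Pop"),
    ((200.0, None), "Pop"),
    ((340.0, 70.0), "Classical"),
    ((None, 65.0), "Classical"),
]


def column_medians(rows):
    """Median of each feature column, ignoring missing values."""
    n_cols = len(rows[0])
    return [median(r[i] for r in rows if r[i] is not None) for i in range(n_cols)]


def impute(row, medians):
    """Replace each missing value with the training median of its column."""
    return tuple(m if v is None else v for v, m in zip(row, medians))


def fit_centroids(samples, medians):
    """Average the imputed feature vectors of each genre."""
    sums, counts = {}, {}
    for features, genre in samples:
        x = impute(features, medians)
        acc = sums.setdefault(genre, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[genre] = counts.get(genre, 0) + 1
    return {g: tuple(s / counts[g] for s in acc) for g, acc in sums.items()}


def predict(features, centroids, medians):
    """Return the genre whose centroid is closest in squared Euclidean distance."""
    x = impute(features, medians)

    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))

    return min(centroids, key=lambda g: sq_dist(centroids[g]))


# Medians are computed on the training rows only, then reused at prediction time.
medians = column_medians([f for f, _ in train])
centroids = fit_centroids(train, medians)
print(predict((220.0, None), centroids, medians))
```

A baseline like this is mainly useful as a reference point: any pipeline you design should be compared against something this simple, with the comparison made on held-out data.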
We have set up a Kaggle InClass Competition to facilitate finding the best machine learning system for officials to
use. You are expected to analyse the provided data, design and improve your own machine learning pipeline, and
consider the consequences of applying your pipeline to this data.
Note that the data is real, so it may be possible to find the original dataset and build a look-up table. This is
not permitted, as it misses the point of the course. We want to see your analyses, rather than perfect results.