CITS1401 Computational Thinking with Python
Project 1, Semester 2, 2024
Page 1 of 9
Department of Computer Science and Software Engineering
The University of Western Australia
CITS1401
Computational Thinking with Python
Project 1, Semester 2, 2024
(Individual project)
Submission deadline: 23:59, 13 September 2024.
Total Marks: 30
Project Submission Guidelines:
You should construct a Python 3 program containing your solution to the given problem and
submit your program electronically on Moodle. The name of the file containing your code
should be your student ID e.g. 12345678.py. No other method of submission is allowed. Please
note that this is an individual project.
• Your program will be automatically run on Moodle for sample test cases provided in
the project sheet if you click the “check” link. However, this does not test all required
criteria and your submission will be thoroughly tested manually for grading purposes
after the due date. Remember you need to submit the program as a single file and copy-paste
the same program in the provided text box.
• You have only one attempt to submit, so don’t submit until you are satisfied with your
attempt.
• All open submissions at the time of the deadline will be automatically submitted. There
is no way in the system to open/modify/reverse your submission.
• You must submit your project before the deadline listed above. Following UWA policy,
a late penalty of 5% will be deducted for each day (or part day), i.e., each 24-hour period
or part thereof, that the assignment is submitted after the deadline.
• No submissions will be allowed after 7 days following the deadline except approved
special consideration cases.
You are expected to have read and understood the University's guidelines on academic conduct.
In accordance with this policy, you may discuss with other students the general principles
required to understand this project, but the work you submit must be the result of your own
effort. Plagiarism detection, and other systems for detecting potential malpractice, will
therefore be used. Besides, if what you submit is not your own work, then you will have learnt
little and will therefore likely fail the final exam.
Project Overview:
In the rapidly expanding world of e-commerce, platforms like Amazon provide vast amounts
of data that can offer valuable insights into various aspects of product performance. This project
aims to analyze Amazon data for different products within specific categories, utilizing key
parameters such as product ID, product name, category, discounted price, actual price, ratings,
rating count, etc. The data set includes a diverse range of categories, each with multiple
products, allowing us to identify trends and patterns specific to each category.
You are required to write a Python 3 program that will read two different files: a CSV file and
a TXT file. Your program will perform four different tasks outlined below. While the CSV file
is required to solve all the tasks (Tasks 1-4), the TXT file is only required for the last task (Task
4).
After reading the CSV file, your program is required to complete the following:
• Task 1: Identify Extreme Discount Prices
Find the product ID with the highest discounted price and the product ID with the
lowest discounted price for a specific category.
• Task 2: Summarize Price Distribution
Provide a summary of the ‘actual price’ distribution i.e., mean, median and mean
absolute deviation of products for a specific category, considering only the products
with a rating count higher than 1000.
• Task 3: Calculate Standard Deviation of Discounted Percentages
Calculate the standard deviation of the discounted percentages for products with a rating
in the range 3.3 ≤ rating ≤ 4.3, for each category.
• Task 4: Correlate Sales Data
Find the correlations between the sales of the products identified in Task 1 (products
with highest and lowest discounted prices for a specific category).
Steps:
o Read the TXT file which contains the sales data for several years, such as 1998-
2021. Each line lists product IDs and the units sold for that year. If a product ID
is not mentioned in a line, it means zero units sold for that year.
o Create two lists, one for the sales of the product with the highest discounted
price and another for the sales of the product with the lowest discounted price
identified in Task 1.
o Process each line of the TXT file to determine the number of units sold each
year.
o Each list should have one entry per year, with the total number of entries
matching the number of lines in the TXT file.
Finally, calculate the correlation coefficient between the two sales lists.
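The statistics named in Tasks 2-4 can be sketched with plain lists and loops, as below. This is only an illustrative sketch, not a model solution. All function names are hypothetical, and it makes several assumptions: the mean absolute deviation is taken about the mean, the standard deviation uses the population form (divide by n), and the correlation coefficient is Pearson's. If the project sheet gives its own formulas, those take precedence. The helper sales_series also assumes each line of the TXT file has already been turned into a dictionary of product ID to units sold, since the exact line format is not described here.

```python
# Hypothetical helpers; nothing is imported, in line with the project rules.

def mean(values):
    return sum(values) / len(values)


def median(values):
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


def mean_absolute_deviation(values):
    # Assumption: deviations are measured from the mean.
    centre = mean(values)
    return sum(abs(v - centre) for v in values) / len(values)


def std_dev(values):
    # Assumption: population form (divide by n, not n - 1).
    centre = mean(values)
    variance = sum((v - centre) ** 2 for v in values) / len(values)
    return variance ** 0.5  # square root without the math module


def sales_series(yearly_sales, product_id):
    # yearly_sales: one dict per line (year) of the TXT file, mapping
    # product ID -> units sold. A missing ID counts as zero units.
    return [year.get(product_id, 0) for year in yearly_sales]


def correlation(xs, ys):
    # Assumption: Pearson correlation coefficient of two equal-length lists.
    mean_x = mean(xs)
    mean_y = mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    denominator = (spread_x * spread_y) ** 0.5
    if denominator == 0:
        return 0  # one list is constant, so the coefficient is undefined
    return covariance / denominator
```

For example, with yearly_sales = [{"A": 5, "B": 2}, {"B": 4}, {"A": 1}], the call sales_series(yearly_sales, "A") gives one entry per line, with a zero for the year in which "A" is not mentioned.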
Requirements:
1) You are not allowed to import any external or internal module in Python. While use of
many of these modules, e.g., csv or math, is a perfectly sensible thing to do in a production
setting, it takes away much of the point of different aspects of the project, which is about getting
practice opening text files, processing text file data, and using basic Python structures, in this
case lists and loops.
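To illustrate what working without the csv module can look like, here is a minimal sketch that splits comma-separated lines by hand and then picks the product IDs with the highest and lowest discounted price in one category (as in Task 1). The column names product_id, category and discounted_price are hypothetical, as are the function names. The sketch also assumes that no field contains a comma inside quotes, which a real data file may violate.

```python
def parse_rows(lines):
    # First line is assumed to be the header; names are lower-cased so
    # lookups do not depend on the capitalisation used in the file.
    header = [name.strip().lower() for name in lines[0].strip().split(",")]
    rows = []
    for line in lines[1:]:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = [field.strip() for field in line.split(",")]
        rows.append(dict(zip(header, fields)))
    return rows


def extreme_discounted_ids(rows, category):
    # Returns (ID with highest discounted price, ID with lowest), or
    # (None, None) when no product belongs to the category.
    wanted = category.strip().lower()
    selected = [row for row in rows if row["category"].lower() == wanted]
    if not selected:
        return None, None
    highest = selected[0]
    lowest = selected[0]
    for row in selected[1:]:
        price = float(row["discounted_price"])
        if price > float(highest["discounted_price"]):
            highest = row
        if price < float(lowest["discounted_price"]):
            lowest = row
    return highest["product_id"], lowest["product_id"]
```

Only built-in operations (str.split, str.strip, float, dict, zip) are used, so nothing needs to be imported.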
2) Ensure your program does NOT call the input() function at any time. Calling the
input() function will cause your program to hang, waiting for input that the automated testing
system will not provide (in fact, what will happen is that if the marking program detects the
call(s), it will not test your code at all, which may result in a zero grade).
3) Your program should also not call the print() function at any time except for the case of
graceful termination (if needed). If your program encounters an error state and exits gracefully,
it should return a correlation/standard deviation/mean/median value of zero and print an
appropriate error message. At no point should you print the program’s outputs or provide a
printout of the program’s progress in calculating such outputs. Outputs should be returned by
the program instead.
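One possible shape for such graceful termination is sketched below. The function name mean_or_zero and the wording of the message are hypothetical. The point is only that the error path prints a message and returns zero, while the normal path returns the value without printing anything.

```python
def mean_or_zero(values):
    # Error state: nothing to average (for example, no product passed
    # the filter). Print a message and return zero instead of crashing.
    if len(values) == 0:
        print("Error: no values available to compute the mean.")
        return 0
    # Normal path: return the result silently.
    return sum(values) / len(values)
```

The same pattern could be applied to the median, standard deviation and correlation calculations.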
4) Do not assume that the input file names will end in .csv or .txt. File name suffixes such
as .csv and .txt are not mandatory in systems other than Microsoft Windows. Do not
enforce within your program that the file must end with a specific extension, nor should you
attempt to add an extension to the provided file name. Doing so can result in loss of marks.
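A minimal sketch of opening a file under exactly the name supplied, with no check or change of its extension, might look as follows. The function name read_lines and the choice to return an empty list on failure are assumptions made for illustration.

```python
def read_lines(file_name):
    # The name is used exactly as given: no extension is checked or added.
    try:
        with open(file_name, "r") as handle:
            return handle.read().splitlines()
    except OSError:
        # OSError is a built-in exception, so no import is needed.
        print("Error: could not open file:", file_name)
        return []
```

A call such as read_lines("sales_data") is treated the same as read_lines("sales_data.txt"); whichever name the caller passes is the one that is opened.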