Lab 4.4: Data Visualization with matplotlib
Problem description: complete the following 5 programming tasks.
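(1) Use NumPy and pandas to create data.csv, a file of simulated 2021 sales data for a store, with two columns (date, amount). Generate 365 random records, with date running from 2021-01-01 to 2021-12-31 and amount in the range [300, 600].
Figure 4-1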
(2) Use pandas to read the data in data.csv into a DataFrame, then drop all missing values.
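(3) Use matplotlib to draw a line chart of the store's sales for each day, and save the figure to the local file day_amount_plot.png. The expected result is shown in Figure 4-2.
Hint: for the code flow, see Example 1 on p. 17 of Lecture 9-4, or the textbook case study.
Line chart: plt.plot(df["date"],df["amount"],label="day->amount",color="red",linewidth=2)
Figure 4-2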
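(4) Use matplotlib to draw a bar chart of the store's sales for each month, and save the figure to the local file month_amount_bar.png. The expected result is shown in Figure 4-3. Then analyze the data: find the largest increase between two consecutive months, and write the month with the largest increase to the file maxMonth.txt.
Hint:
Monthly sales totals are needed:
Bar chart: plt.bar(df1["month"],df1["amount"],label="month->amount",color="blue")
Figure 4-3
Increase between consecutive months: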
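(5) Use matplotlib to draw a pie chart of how the store's sales are distributed across the 4 quarters, and save the figure to the local file season_amount_pie.png. The expected result is shown in Figure 4-4.
Hint: season1= df1[:3]['amount'].sum() # total for quarter 1 (January to March)
Pie chart: plt.pie([season1, season2, season3, season4],labels=["season1", "season2", "season3", "season4"])
Figure 4-4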
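Task (1) asks for NumPy and pandas specifically. The following is a minimal sketch of one way to build the table with those two libraries. The helper name make_sales_frame and its seed parameter are hypothetical, introduced only for illustration.

```python
import numpy as np
import pandas as pd


def make_sales_frame(seed=None):
    """Build 365 rows of simulated 2021 sales (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # One row per calendar day of 2021
    dates = pd.date_range("2021-01-01", "2021-12-31", freq="D")
    # endpoint=True makes the upper bound 600 inclusive
    amounts = rng.integers(300, 600, size=len(dates), endpoint=True)
    return pd.DataFrame({"date": dates, "amount": amounts})


sales = make_sales_frame(seed=0)
print(sales.head())
```

Calling sales.to_csv("data.csv", index=False) would then write the file with the header row date,amount.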
import csv, random, datetime
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Create the CSV file (key step)
with open('data.csv', 'w', newline='', encoding='utf-8') as fp1:
    wr = csv.writer(fp1, dialect='excel')
    wr.writerow(["date", "amount"])
    startDate = datetime.date(2021, 1, 1)
    for i in range(365):
        amount = random.randint(300, 600)  # both ends inclusive
        wr.writerow([startDate, amount])
        startDate = startDate + datetime.timedelta(days=1)

# Read the file and drop all rows with missing values
df = pd.read_csv('data.csv', parse_dates=["date"])  # parse date as datetime
df = df.dropna()

# Line chart
plt.figure(figsize=(12, 6))
plt.plot(df["date"], df["amount"], label="day->amount", color="red", linewidth=2)
plt.title("2021 Day Business Volume of Wpf Store")
plt.ylabel("amount")
plt.xlabel("day")
plt.xlim(df["date"].min(), df["date"].max())
plt.ylim(300, 600)
plt.legend()
plt.savefig("day_amount_plot.png")
plt.show()

# Bar chart
# Prepare the data
df1 = df.copy()  # work on a copy so df itself is not modified
df1["month"] = df1["date"].dt.strftime("%Y-%m")  # month key, e.g. "2021-01"
df1 = df1.groupby(by="month", as_index=False)["amount"].sum()  # total sales per month
# print(df1)  # uncomment to inspect df1
# Draw the chart
plt.figure(figsize=(12, 6))
plt.title("2021 Month Business Volume of Wpf Store")
plt.bar(df1["month"], df1["amount"], label="month->amount", color="blue")
plt.ylabel("amount")
plt.xlabel("month")
plt.legend()
plt.savefig("month_amount_bar.png")
plt.show()
# Write the result
df2 = df1['amount'].diff()  # change in amount relative to the previous month
m = df2.idxmax()  # index of the largest month-over-month increase (NaN is skipped)
with open("maxMonth.txt", 'w') as txtfile:
    txtfile.write(df1.loc[m, "month"])  # month that corresponds to index m
# Pie chart
# Prepare the data
season1 = df1[:3]['amount'].sum()
season2 = df1[3:6]['amount'].sum()
season3 = df1[6:9]['amount'].sum()
season4 = df1[9:12]['amount'].sum()
# Draw the chart
plt.figure(figsize=(12, 6))
plt.pie([season1, season2, season3, season4], labels=["season1", "season2", "season3", "season4"])
plt.title("2021 Season Business Volume of Wpf Store")
plt.savefig("season_amount_pie.png")
plt.show()
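Because the real data is random, the month written in task (4) changes on every run, which makes the logic hard to check by eye. The sketch below repeats the month-over-month and quarterly steps on a small table of hand-picked totals. The numbers and the names monthly, max_month and quarter_totals are made up for illustration and are not part of the assignment data.

```python
import pandas as pd

# Hypothetical monthly totals, chosen by hand so the result is easy to verify
monthly = pd.DataFrame({
    "month": [f"2021-{m:02d}" for m in range(1, 13)],
    "amount": [100, 120, 90, 150, 155, 140, 200, 180, 185, 170, 260, 240],
})

# diff() gives amount[i] - amount[i-1]; the first entry is NaN
increase = monthly["amount"].diff()
# idxmax() skips NaN and returns the row label of the largest increase
max_row = increase.idxmax()
max_month = monthly.loc[max_row, "month"]

# Quarter number per row: rows 0-2 -> 1, rows 3-5 -> 2, and so on
quarter = monthly.index.to_numpy() // 3 + 1
quarter_totals = monthly.groupby(quarter)["amount"].sum()

print(max_month)                # 2021-11 (260 - 170 = 90 is the largest rise)
print(quarter_totals.tolist())  # [310, 445, 565, 670]
```

Grouping by a computed quarter number gives the same totals as the four slices df1[:3], df1[3:6], df1[6:9] and df1[9:12], provided the table has exactly 12 rows in calendar order.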