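3. Data analysis and visualization
(1) Are there problems with the delivery service?
a By month
# Strip stray whitespace so the status labels match exactly
data['货品交货状况'] = data['货品交货状况'].str.strip()
# Count orders per month and delivery status; one column per status
data1 = data.groupby(['month','货品交货状况']).size().unstack(fill_value=0)
# On-time rate = on-time / (on-time + late)
data1['按时交货率'] = data1['按时交货']/(data1['按时交货']+data1['晚交货'])
print(data1['按时交货率'])
Output: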
b By sales region
data['货品交货状况'] = data['货品交货状况'].str.strip()
# Count orders per sales region and delivery status
data1 = data.groupby(['销售区域','货品交货状况']).size().unstack(fill_value=0)
data1['按时交货率'] = data1['按时交货']/(data1['按时交货']+data1['晚交货'])
# Best-performing regions first
print(data1.sort_values(by='按时交货率',ascending=False))
Output:
c By product
data['货品交货状况'] = data['货品交货状况'].str.strip()
# Count orders per product and delivery status
data1 = data.groupby(['货品','货品交货状况']).size().unstack(fill_value=0)
data1['按时交货率'] = data1['按时交货']/(data1['按时交货']+data1['晚交货'])
print(data1.sort_values(by='按时交货率',ascending=False))
Output:
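d By product and sales region combined
data['货品交货状况'] = data['货品交货状况'].str.strip()
# fill_value=0 keeps region/product pairs that have no late (or no on-time) orders
# from producing NaN counts and therefore NaN rates
data1 = data.groupby(['销售区域','货品','货品交货状况']).size().unstack(fill_value=0)
data1['按时交货率'] = data1['按时交货']/(data1['按时交货']+data1['晚交货'])
print(data1.sort_values(by='按时交货率',ascending=False))
Output: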
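The four breakdowns above differ only in their grouping keys, so the calculation can be wrapped in one function. The following is a minimal sketch, not the author's code: the helper name `on_time_rate` and the toy DataFrame are hypothetical, and it assumes the status column holds the labels 按时交货 and 晚交货 as in the snippets above.

```python
import pandas as pd

def on_time_rate(df, keys):
    """On-time delivery rate per group: on-time / (on-time + late)."""
    status = df['货品交货状况'].str.strip()
    # Keep only rows whose status is one of the two labels used in the formula
    known = status.isin(['按时交货', '晚交货'])
    df, status = df[known], status[known]
    # The mean of a boolean Series is the share of True values in each group
    rate = (status == '按时交货').groupby([df[k] for k in keys]).mean()
    return rate.rename('按时交货率')

# Hypothetical toy data, only to show the call shape
toy = pd.DataFrame({
    '销售区域': ['华东', '华东', '华北', '华北'],
    '货品': ['货品1', '货品1', '货品2', '货品2'],
    '货品交货状况': ['按时交货 ', '晚交货', '按时交货', '按时交货'],
})
rates = on_time_rate(toy, ['销售区域'])
print(rates.sort_values(ascending=False))
```

Passing `['销售区域', '货品']` as `keys` gives the combined breakdown. Groups with no late deliveries get a rate of 1.0 rather than NaN, because the rate is computed from the rows directly instead of from unstacked counts.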
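(2) Are there sales regions that still have untapped potential?
a By month
# Total quantity sold per month, one column per product
data1 = data.groupby(['month','货品'])['数量'].sum().unstack()
# One line per product, with month on the x-axis
data1.plot(kind='line')
plt.show()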
b By sales region
# Total quantity sold per sales region, one column per product
data1 = data.groupby(['销售区域','货品'])['数量'].sum().unstack()
print(data1)
Output:
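c By month and sales region
# Total quantity per (month, sales region), one column per product
data1 = data.groupby(['month','销售区域','货品'])['数量'].sum().unstack()
# Look at a single product, 货品2, across months and regions
print(data1['货品2'])
Output: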
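The result for 货品2 is a long Series indexed by (month, 销售区域), which makes it hard to compare regions side by side. One option is to unstack the region level as well, giving one row per month and one column per region. This is a sketch on hypothetical toy data; the region names and quantities are made up for illustration.

```python
import pandas as pd

# Hypothetical toy data with the same column names as the dataset
toy = pd.DataFrame({
    'month': [7, 7, 8, 8, 8],
    '销售区域': ['华东', '华北', '华东', '华北', '华北'],
    '货品': ['货品2', '货品2', '货品2', '货品2', '货品1'],
    '数量': [10, 5, 12, 9, 3],
})

# Same grouping as above: rows are (month, region), columns are products
by_month_region = toy.groupby(['month', '销售区域', '货品'])['数量'].sum().unstack()

# Move the region level into the columns: rows are months, columns are regions
wide = by_month_region['货品2'].unstack('销售区域')
print(wide)
```

In this layout each column is one region's monthly trend for the product, so `wide.plot(kind='line')` would draw one line per region.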
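(3) Are there product quality problems?
data['货品用户反馈'] = data['货品用户反馈'].str.strip() # strip leading and trailing whitespace
# Count each feedback type per (product, sales region); missing types count as 0
data1 = data.groupby(['货品','销售区域'])['货品用户反馈'].value_counts().unstack(fill_value=0)
# Compute the denominator once, before any rate column is added;
# otherwise the row sums used for later rates would include the earlier rates
total = data1.sum(axis=1)
data1['拒货率'] = data1['拒货']/total
data1['返修率'] = data1['返修']/total
data1['合格率'] = data1['质量合格']/total
print(data1.sort_values(['合格率','返修率','拒货率'],ascending=False))
Output: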
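All three rates are shares of one total: the number of feedback records in the (product, sales region) group. As a cross-check, the shares can also be obtained in one step with `pd.crosstab` and `normalize='index'`, which divides each row by its own sum. This is a minimal sketch on hypothetical toy data, and its columns carry the raw feedback labels (拒货, 返修, 质量合格) rather than the 率 names used above.

```python
import pandas as pd

# Hypothetical toy data with the same column names as the dataset
toy = pd.DataFrame({
    '货品': ['货品1'] * 4 + ['货品2'] * 2,
    '销售区域': ['华东'] * 4 + ['华北'] * 2,
    '货品用户反馈': ['质量合格', '质量合格 ', '返修', '拒货', '质量合格', '拒货'],
})

feedback = toy['货品用户反馈'].str.strip()

# normalize='index' divides every count by its row total,
# so all rates in a row share the same denominator and sum to 1
rates = pd.crosstab([toy['货品'], toy['销售区域']], feedback, normalize='index')
print(rates.sort_values(['质量合格', '返修', '拒货'], ascending=False))
```

Checking that `rates.sum(axis=1)` equals 1 for every row is a quick way to confirm that the denominators are consistent.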