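Data source: https://www.kaggle.com/starbucks/store-locations/data
1. The 15 cities in China with the most Starbucks stores
The CSV file was first re-saved in ANSI encoding to avoid garbled characters. Note that Python's 'ansi' codec (an alias of 'mbcs') is only available on Windows.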
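Since the 'ansi' codec does not exist outside Windows, a more portable way to read the file might look like the sketch below. It assumes the CSV is stored either as UTF-8 or as GBK (the usual ANSI code page on Chinese Windows); the helper name `read_csv_with_fallback` and the encoding list are illustrative and not part of the original script.

```python
import pandas as pd


def read_csv_with_fallback(path, encodings=("utf-8", "gbk")):
    """Try each encoding in turn and return the first DataFrame that decodes.

    The encoding list is an assumption; adjust it to match the actual file.
    """
    if not encodings:
        raise ValueError("at least one encoding is required")
    last_error = None
    for enc in encodings:
        try:
            return pd.read_csv(path, encoding=enc)
        except UnicodeDecodeError as err:
            # Remember the failure and move on to the next candidate encoding
            last_error = err
    raise last_error
```

With this helper, `df = read_csv_with_fallback("./directory.csv")` could stand in for the `pd.read_csv(file_path, encoding='ansi')` call on any platform.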
import pandas as pd
from matplotlib import pyplot as plt
file_path = "./directory.csv"
df = pd.read_csv(file_path, encoding='ansi')
china_data = df[df["Country"] == "CN"]
grouped = china_data.groupby(by="City").count()['Brand']
# Take the 15 cities with the most stores
city_data = grouped.sort_values(ascending=False)[:15]
plt.rcParams['font.sans-serif'] = ['SimHei']
_x = city_data.index
_y = city_data.values
plt.figure(figsize=(15, 8))
rects = plt.bar(range(len(_x)), _y)
plt.xticks(range(len(_x)), _x)
plt.xlabel('City')
plt.ylabel('Number of stores')
# Write each bar's value just above it
for rect in rects:
    height = rect.get_height()
    plt.text(rect.get_x() + rect.get_width() / 2, height + 2, str(height), ha='center')
plt.title('The 15 cities in China with the most Starbucks stores')
plt.show()
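The counts are not fully accurate because the data contains some inconsistencies.
As the figure shows, a single city can end up being counted as two.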
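One way to reduce this kind of double counting is to normalize the city names before grouping. The sketch below assumes the duplicates come from stray whitespace or from a trailing '市' ("city") suffix that appears on some rows but not others; that is a guess about the cause, and the sample rows are made up for illustration. The helper `normalize_city` is hypothetical.

```python
import pandas as pd


def normalize_city(name):
    """Strip whitespace and a trailing '市' so spelling variants collapse together.

    The '市' rule is an assumption about how the duplicates arise.
    """
    name = str(name).strip()
    if name.endswith("市") and len(name) > 1:
        name = name[:-1]
    return name


# Made-up sample rows, not taken from the real dataset
sample = pd.DataFrame({
    "City": ["上海市", "上海", " 北京市", "北京市"],
    "Brand": ["Starbucks"] * 4,
})

# Count stores per normalized city name
counts = sample["City"].map(normalize_city).value_counts()
print(counts)
```

On the real data, the same mapping could be applied to `china_data["City"]` before the `groupby`, though other kinds of variants (for example pinyin versus Chinese characters) would need their own rules.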
2. Starbucks store counts by country
import pandas as pd
from matplotlib import pyplot as plt
file_path = "./directory.csv"
df = pd.read_csv(file_path, encoding='ansi')
grouped = df.groupby(by="Country").count()['Brand']
plt.rcParams['font.sans-serif'] = ['SimHei']
_x = grouped.index
_y = grouped.values
plt.figure(figsize=(20, 8))
rects = plt.bar(range(len(_x)), _y)
plt.xticks(range(len(_x)), _x)
plt.xlabel('Country')
plt.ylabel('Number of stores')
# Write each bar's value just above it
for rect in rects:
    height = rect.get_height()
    plt.text(rect.get_x() + rect.get_width() / 2, height + 2, str(height), ha='center')
plt.title('Starbucks store counts by country')
plt.show()
The US leads by a huge margin; even second-place CN comes to only a fraction of the US total.
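To put a number on that gap, the countries can be sorted by store count and the second-place total divided by the first. The sketch below is a minimal illustration on made-up rows, so the ratio it produces says nothing about the real dataset; the helper `top_countries` is hypothetical. It uses `value_counts`, which counts rows per country and sorts them in descending order, whereas the script above counts non-null `Brand` values.

```python
import pandas as pd


def top_countries(df, n=10):
    """Return the n countries with the most rows, largest first."""
    return df["Country"].value_counts().head(n)


# Made-up sample rows, not taken from the real dataset
sample = pd.DataFrame({"Country": ["US"] * 5 + ["CN"] * 2 + ["JP"]})

top = top_countries(sample, n=2)
# Share of the second-place country relative to the first
ratio = top["CN"] / top["US"]
print(top)
print(f"CN / US = {ratio:.2f}")
```

Plotting only `top_countries(df)` instead of every country would also keep the x-axis labels readable.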