Python: display count and percentage labels on a grouped bar chart

I would like to add count and percentage labels to a grouped bar chart, but I haven't been able to figure it out.

I've seen examples for count or percentage for single bars, but not for grouped bars.

The data looks something like this (not the real numbers):

  age_group  Mis  surv  unk  death  total  surv_pct  death_pct
0       0-9    1     2    0      3      6     100.0        0.0
1     10-19    2     1    0      1      4      99.9        0.0
2     20-29    0     3    0      1      4      99.9        0.0
3     30-39    0     7    1      2     10     100.0        0.0
4     40-49    0     5    0      1      6      99.7        0.3
5     50-59    0     6    0      4     10      99.3        0.3
6     60-69    0     7    1      4     12      98.0        2.0
7     70-79    1     8    2      5     16      92.0        8.0
8       80+    0    10    0      7     17      81.0       19.0

And the chart looks something like this:

YWmp6.png

I created the chart with this code:

ax = df.plot(y=['death', 'surv'],
             kind='barh',
             figsize=(20,9),
             rot=0,
             title='\n\n surv and deaths by age group')
ax.legend(['Deaths', 'Survivals']);
ax.set_xlabel('\nCount');
ax.set_ylabel('Age Group\n');

How could I add count and percentage labels to the grouped bars? I would like it to look something like this chart:

txs21.png

Solution

Since nobody else has suggested anything, here is one way to approach it with your dataframe structure.

from matplotlib import pyplot as plt
import pandas as pd

# sep=r"\s+" splits on any whitespace (delim_whitespace is deprecated since pandas 2.2)
df = pd.read_csv("test.txt", sep=r"\s+")

cat = ['death', 'surv']

ax = df.plot(y=cat,
             kind='barh',
             figsize=(20, 9),
             rot=0,
             title='\n\n surv and deaths by age group')

# make space for the annotation
xmin, xmax = ax.get_xlim()
ax.set_xlim(xmin, 1.05 * xmax)

# connect each bar series with its df column
for cont, col in zip(ax.containers, cat):
    # connect each bar of the series with its absolute and relative values
    for rect, vals, perc in zip(cont.patches, df[col], df[col + "_pct"]):
        # annotate each bar at its right end, vertically centered
        ax.annotate(f"{vals} ({perc:.1f}%)",
                    (rect.get_width(), rect.get_y() + rect.get_height() / 2.),
                    ha='left', va='center', fontsize=10, color='black',
                    xytext=(3, 0), textcoords='offset points')

ax.set_yticklabels(df.age_group)
ax.set_xlabel('\nCount')
ax.set_ylabel('Age Group\n')
ax.legend(['Deaths', 'Survivals'], loc="lower right")

plt.show()

Sample output:

yMFKP.png

If the percentages per category add up, one could also calculate the percentages on the fly. The percentage columns would then not need to follow exactly the same naming scheme (column name plus "_pct"). Another problem is that the font size of the annotation, the scaling to make space for labeling the largest bar, and the distance between bar and annotation are fixed values that do not adapt automatically and may need fine-tuning.
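To illustrate the on-the-fly idea, here is a minimal sketch. It assumes that each percentage is meant to be the category's share of the row total over the plotted categories (death + surv), which may not be how the surv_pct and death_pct columns in the question were derived. The counts are three rows taken from the sample table.

```python
import pandas as pd

# Three rows (0-9, 10-19, 80+) from the question's sample table.
df = pd.DataFrame({
    "age_group": ["0-9", "10-19", "80+"],
    "surv": [2, 1, 10],
    "death": [3, 1, 7],
})

cat = ["death", "surv"]

# Assumption: percentage = share of the row total over the plotted
# categories, computed here instead of read from *_pct columns.
row_total = df[cat].sum(axis=1)
pct = df[cat].div(row_total, axis=0).mul(100)

# One label per bar, in the same "count (percent%)" form as the annotation.
labels = {
    col: [f"{v} ({p:.1f}%)" for v, p in zip(df[col], pct[col])]
    for col in cat
}
print(labels["death"])
```

In the annotation loop, pct[col] would then take the place of df[col + "_pct"].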

However, I am not fond of this mixing of pandas and matplotlib plotting functions. I had cases where the axis definition by pandas interfered with matplotlib, and datetime objects ... well, let's not talk about that.
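For what it's worth, the mixing can be avoided by drawing the grouped bars with matplotlib alone. The following is only a sketch, not the code that produced the sample output. It assumes matplotlib 3.4 or newer (for Axes.bar_label), hard-codes three rows from the question's sample table, and the bar height of 0.4, the padding, the margin, and the output filename are illustrative choices.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for an interactive window
import matplotlib.pyplot as plt
import numpy as np

# Three rows (0-9, 10-19, 80+) from the question's sample table.
age_group = ["0-9", "10-19", "80+"]
counts = {"death": [3, 1, 7], "surv": [2, 1, 10]}
pcts = {"death": [0.0, 0.0, 19.0], "surv": [100.0, 99.9, 81.0]}

fig, ax = plt.subplots(figsize=(10, 5))
y = np.arange(len(age_group))
height = 0.4  # thickness of each bar within a group (illustrative)

for i, (col, vals) in enumerate(counts.items()):
    # Shift each series so the two bars of a group sit side by side.
    bars = ax.barh(y + (i - 0.5) * height, vals, height=height, label=col)
    labels = [f"{v} ({p:.1f}%)" for v, p in zip(vals, pcts[col])]
    # bar_label places one text per bar at the bar's end.
    ax.bar_label(bars, labels=labels, padding=3, fontsize=10)

ax.set_yticks(y)
ax.set_yticklabels(age_group)
ax.set_xlabel("Count")
ax.set_ylabel("Age Group")
ax.margins(x=0.1)  # leave room for the label on the longest bar
ax.legend(loc="lower right")
fig.savefig("grouped_bars.png")
```

The padding argument plays the role of xytext=(3, 0) in the annotate version, and ax.margins replaces the manual set_xlim scaling.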
