python条形堆积图_如何使用python(Pandas)堆积条形集群

所以,我最终找到了一个技巧(编辑:请参阅下面的使用seaborn和longform数据帧):

解决方案与熊猫和matplotlib

这是一个更完整的例子:

import pandas as pd

import matplotlib.cm as cm

import numpy as np

import matplotlib.pyplot as plt

def plot_clustered_stacked(dfall, labels=None, title="multiple stacked bar plot", H="/", **kwargs):

"""Given a list of dataframes, with identical columns and index, create a clustered stacked bar plot.

labels is a list of the names of the dataframe, used for the legend

title is a string for the title of the plot

H is the hatch used for identification of the different dataframe"""

n_df = len(dfall)

n_col = len(dfall[0].columns)

n_ind = len(dfall[0].index)

axe = plt.subplot(111)

for df in dfall : # for each data frame

axe = df.plot(kind="bar",

linewidth=0,

stacked=True,

ax=axe,

legend=False,

grid=False,

**kwargs) # make bar plots

h,l = axe.get_legend_handles_labels() # get the handles we want to modify

for i in range(0, n_df * n_col, n_col): # len(h) = n_col * n_df

for j, pa in enumerate(h[i:i+n_col]):

for rect in pa.patches: # for each index

rect.set_x(rect.get_x() + 1 / float(n_df + 1) * i / float(n_col))

rect.set_hatch(H * int(i / n_col)) #edited part

rect.set_width(1 / float(n_df + 1))

axe.set_xticks((np.arange(0, 2 * n_ind, 2) + 1 / float(n_df + 1)) / 2.)

axe.set_xticklabels(df.index, rotation = 0)

axe.set_title(title)

# Add invisible data to add another legend

n=[]

for i in range(n_df):

n.append(axe.bar(0, 0, color="gray", hatch=H * i))

l1 = axe.legend(h[:n_col], l[:n_col], loc=[1.01, 0.5])

if labels is not None:

l2 = plt.legend(n, labels, loc=[1.01, 0.1])

axe.add_artist(l1)

return axe

# create fake dataframes

df1 = pd.DataFrame(np.random.rand(4, 5),

index=["A", "B", "C", "D"],

columns=["I", "J", "K", "L", "M"])

df2 = pd.DataFrame(np.random.rand(4, 5),

index=["A", "B", "C", "D"],

columns=["I", "J", "K", "L", "M"])

df3 = pd.DataFrame(np.random.rand(4, 5),

index=["A", "B", "C", "D"],

columns=["I", "J", "K", "L", "M"])

# Then, just call :

plot_clustered_stacked([df1, df2, df3],["df1", "df2", "df3"])

它给出了:

多个堆积条形图

您可以通过传递cmap参数来更改栏的颜色:

plot_clustered_stacked([df1, df2, df3],

["df1", "df2", "df3"],

cmap=plt.cm.viridis)

用seaborn解决方案:

给定相同的df1,df2,df3,我将它们转换为长形式:

df1["Name"] = "df1"

df2["Name"] = "df2"

df3["Name"] = "df3"

dfall = pd.concat([pd.melt(i.reset_index(),

id_vars=["Name", "index"]) # transform in tidy format each df

for i in [df1, df2, df3]],

ignore_index=True)

seaborn的问题在于它本身不会堆叠条形图,因此诀窍是将每个条形图的累积总和绘制在彼此之上:

dfall.set_index(["Name", "index", "variable"], inplace=1)

dfall["vcs"] = dfall.groupby(level=["Name", "index"]).cumsum()

dfall.reset_index(inplace=True)

>>> dfall.head(6)

Name index variable value vcs

0 df1 A I 0.717286 0.717286

1 df1 B I 0.236867 0.236867

2 df1 C I 0.952557 0.952557

3 df1 D I 0.487995 0.487995

4 df1 A J 0.174489 0.891775

5 df1 B J 0.332001 0.568868

然后循环遍历每组variable并绘制累积总和:

c = ["blue", "purple", "red", "green", "pink"]

for i, g in enumerate(dfall.groupby("variable")):

ax = sns.barplot(data=g[1],

x="index",

y="vcs",

hue="Name",

color=c[i],

zorder=-i, # so first bars stay on top

edgecolor="k")

ax.legend_.remove() # remove the redundant legends

多堆栈条形图seaborn

它缺乏我认为可以轻松添加的传奇。问题是,为了区分数据框而不是阴影(可以很容易地添加),我们有一个亮度梯度,而且对于第一个有点太亮了,我真的不知道如何改变它而不改变每一个矩形逐个(如第一个解决方案)。

如果你不理解代码中的某些内容,请告诉我。

随意重复使用CC0下的代码。

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值