【Python百日进阶-数据分析】Day146 - plotly小提琴图:px.violin()/go.violin()

四、实例

4.1 Plotly Express 的小提琴图

小提琴图是数字数据的统计表示。它类似于箱线图,在每一侧都增加了一个旋转的核密度图。

用于可视化分布的小提琴图的替代方法包括直方图、箱线图、ECDF 图和条形图。

4.1.1 Plotly Express 的基本小提琴图

import plotly.express as px

df = px.data.tips()
print(df)
'''
     total_bill   tip     sex smoker   day    time  size
0         16.99  1.01  Female     No   Sun  Dinner     2
1         10.34  1.66    Male     No   Sun  Dinner     3
2         21.01  3.50    Male     No   Sun  Dinner     3
3         23.68  3.31    Male     No   Sun  Dinner     2
4         24.59  3.61  Female     No   Sun  Dinner     4
..          ...   ...     ...    ...   ...     ...   ...
239       29.03  5.92    Male     No   Sat  Dinner     3
240       27.18  2.00  Female    Yes   Sat  Dinner     2
241       22.67  2.00    Male    Yes   Sat  Dinner     2
242       17.82  1.75    Male     No   Sat  Dinner     2
243       18.78  3.00  Female     No  Thur  Dinner     2

[244 rows x 7 columns]
'''
fig = px.violin(df, y="total_bill")
fig.show()

在这里插入图片描述

4.1.2 带框和数据点的小提琴图

import plotly.express as px

df = px.data.tips()
print(df)
'''
     total_bill   tip     sex smoker   day    time  size
0         16.99  1.01  Female     No   Sun  Dinner     2
1         10.34  1.66    Male     No   Sun  Dinner     3
2         21.01  3.50    Male     No   Sun  Dinner     3
3         23.68  3.31    Male     No   Sun  Dinner     2
4         24.59  3.61  Female     No   Sun  Dinner     4
..          ...   ...     ...    ...   ...     ...   ...
239       29.03  5.92    Male     No   Sat  Dinner     3
240       27.18  2.00  Female    Yes   Sat  Dinner     2
241       22.67  2.00    Male    Yes   Sat  Dinner     2
242       17.82  1.75    Male     No   Sat  Dinner     2
243       18.78  3.00  Female     No  Thur  Dinner     2

[244 rows x 7 columns]
'''

fig = px.violin(df, y="total_bill", box=True, # 在小提琴内部绘制方框图
                points='all', # 可以是 'outliers', or False
               )
fig.show()

在这里插入图片描述

4.1.3 多个小提琴图

import plotly.express as px

df = px.data.tips()
print(df)
'''
     total_bill   tip     sex smoker   day    time  size
0         16.99  1.01  Female     No   Sun  Dinner     2
1         10.34  1.66    Male     No   Sun  Dinner     3
2         21.01  3.50    Male     No   Sun  Dinner     3
3         23.68  3.31    Male     No   Sun  Dinner     2
4         24.59  3.61  Female     No   Sun  Dinner     4
..          ...   ...     ...    ...   ...     ...   ...
239       29.03  5.92    Male     No   Sat  Dinner     3
240       27.18  2.00  Female    Yes   Sat  Dinner     2
241       22.67  2.00    Male    Yes   Sat  Dinner     2
242       17.82  1.75    Male     No   Sat  Dinner     2
243       18.78  3.00  Female     No  Thur  Dinner     2

[244 rows x 7 columns]
'''
fig = px.violin(df, y="tip", x="smoker", color="sex", box=True, points="all",
          hover_data=df.columns)
fig.show()

在这里插入图片描述

4.1.4 叠加的小提琴图

import plotly.express as px

df = px.data.tips()
print(df)
'''
     total_bill   tip     sex smoker   day    time  size
0         16.99  1.01  Female     No   Sun  Dinner     2
1         10.34  1.66    Male     No   Sun  Dinner     3
2         21.01  3.50    Male     No   Sun  Dinner     3
3         23.68  3.31    Male     No   Sun  Dinner     2
4         24.59  3.61  Female     No   Sun  Dinner     4
..          ...   ...     ...    ...   ...     ...   ...
239       29.03  5.92    Male     No   Sat  Dinner     3
240       27.18  2.00  Female    Yes   Sat  Dinner     2
241       22.67  2.00    Male    Yes   Sat  Dinner     2
242       17.82  1.75    Male     No   Sat  Dinner     2
243       18.78  3.00  Female     No  Thur  Dinner     2

[244 rows x 7 columns]
'''
fig = px.violin(df, y="tip", color="sex",
                # 默认violinmode是'group'
                violinmode='overlay', # 把小提琴放在彼此的上面
                hover_data=df.columns)
fig.show()

在这里插入图片描述

4.2 graph_objects的小提琴图

如果Plotly Express 没有提供好的起点,您可以使用plotly.graph_objects的go.Violin所有选项go.Violin都记录在参考https://plotly.com/python/reference/violin/中

4.2.1 基本小提琴图

import plotly.graph_objects as go

import pandas as pd
# 'https://raw.githubusercontent.com/plotly/datasets/master/violin_data.csv'
df = pd.read_csv("f:/violin_data.csv")
print(df)
'''
     total_bill   tip     sex smoker   day    time  size
0         16.99  1.01  Female     No   Sun  Dinner     2
1         10.34  1.66    Male     No   Sun  Dinner     3
2         21.01  3.50    Male     No   Sun  Dinner     3
3         23.68  3.31    Male     No   Sun  Dinner     2
4         24.59  3.61  Female     No   Sun  Dinner     4
..          ...   ...     ...    ...   ...     ...   ...
239       29.03  5.92    Male     No   Sat  Dinner     3
240       27.18  2.00  Female    Yes   Sat  Dinner     2
241       22.67  2.00    Male    Yes   Sat  Dinner     2
242       17.82  1.75    Male     No   Sat  Dinner     2
243       18.78  3.00  Female     No  Thur  Dinner     2

[244 rows x 7 columns]
'''

fig = go.Figure(data=go.Violin(y=df['total_bill'], box_visible=True, line_color='black',
                               meanline_visible=True, fillcolor='lightseagreen', opacity=0.6,
                               x0='Total Bill'))

fig.update_layout(yaxis_zeroline=False)
fig.show()

在这里插入图片描述

4.2.2 多条小提琴迹线

import plotly.graph_objects as go

import pandas as pd
# 'https://raw.githubusercontent.com/plotly/datasets/master/violin_data.csv'
df = pd.read_csv("f:/violin_data.csv")
print(df)
'''
     total_bill   tip     sex smoker   day    time  size
0         16.99  1.01  Female     No   Sun  Dinner     2
1         10.34  1.66    Male     No   Sun  Dinner     3
2         21.01  3.50    Male     No   Sun  Dinner     3
3         23.68  3.31    Male     No   Sun  Dinner     2
4         24.59  3.61  Female     No   Sun  Dinner     4
..          ...   ...     ...    ...   ...     ...   ...
239       29.03  5.92    Male     No   Sat  Dinner     3
240       27.18  2.00  Female    Yes   Sat  Dinner     2
241       22.67  2.00    Male    Yes   Sat  Dinner     2
242       17.82  1.75    Male     No   Sat  Dinner     2
243       18.78  3.00  Female     No  Thur  Dinner     2

[244 rows x 7 columns]
'''

fig = go.Figure()

days = ['Thur', 'Fri', 'Sat', 'Sun']

for day in days:
    fig.add_trace(go.Violin(x=df['day'][df['day'] == day],
                            y=df['total_bill'][df['day'] == day],
                            name=day,
                            box_visible=True,
                            meanline_visible=True))

fig.show()

在这里插入图片描述

4.2.3 分组小提琴图

import plotly.graph_objects as go

import pandas as pd
# 'https://raw.githubusercontent.com/plotly/datasets/master/violin_data.csv'
df = pd.read_csv("f:/violin_data.csv")
print(df)
'''
     total_bill   tip     sex smoker   day    time  size
0         16.99  1.01  Female     No   Sun  Dinner     2
1         10.34  1.66    Male     No   Sun  Dinner     3
2         21.01  3.50    Male     No   Sun  Dinner     3
3         23.68  3.31    Male     No   Sun  Dinner     2
4         24.59  3.61  Female     No   Sun  Dinner     4
..          ...   ...     ...    ...   ...     ...   ...
239       29.03  5.92    Male     No   Sat  Dinner     3
240       27.18  2.00  Female    Yes   Sat  Dinner     2
241       22.67  2.00    Male    Yes   Sat  Dinner     2
242       17.82  1.75    Male     No   Sat  Dinner     2
243       18.78  3.00  Female     No  Thur  Dinner     2

[244 rows x 7 columns]
'''

fig = go.Figure()

fig.add_trace(go.Violin(x=df['day'][ df['sex'] == 'Male' ],
                        y=df['total_bill'][ df['sex'] == 'Male' ],
                        legendgroup='M', scalegroup='M', name='M',
                        line_color='blue')
             )
fig.add_trace(go.Violin(x=df['day'][ df['sex'] == 'Female' ],
                        y=df['total_bill'][ df['sex'] == 'Female' ],
                        legendgroup='F', scalegroup='F', name='F',
                        line_color='orange')
             )

fig.update_traces(box_visible=True, meanline_visible=True)
fig.update_layout(violinmode='group')

fig.show()

在这里插入图片描述

4.2.4 分裂小提琴图

import plotly.graph_objects as go

import pandas as pd
# 'https://raw.githubusercontent.com/plotly/datasets/master/violin_data.csv'
df = pd.read_csv("f:/violin_data.csv")
print(df)
'''
     total_bill   tip     sex smoker   day    time  size
0         16.99  1.01  Female     No   Sun  Dinner     2
1         10.34  1.66    Male     No   Sun  Dinner     3
2         21.01  3.50    Male     No   Sun  Dinner     3
3         23.68  3.31    Male     No   Sun  Dinner     2
4         24.59  3.61  Female     No   Sun  Dinner     4
..          ...   ...     ...    ...   ...     ...   ...
239       29.03  5.92    Male     No   Sat  Dinner     3
240       27.18  2.00  Female    Yes   Sat  Dinner     2
241       22.67  2.00    Male    Yes   Sat  Dinner     2
242       17.82  1.75    Male     No   Sat  Dinner     2
243       18.78  3.00  Female     No  Thur  Dinner     2

[244 rows x 7 columns]
'''

fig = go.Figure()

fig.add_trace(go.Violin(x=df['day'][ df['smoker'] == 'Yes' ],
                        y=df['total_bill'][ df['smoker'] == 'Yes' ],
                        legendgroup='Yes', scalegroup='Yes', name='Yes',
                        side='negative',
                        line_color='blue')
             )
fig.add_trace(go.Violin(x=df['day'][ df['smoker'] == 'No' ],
                        y=df['total_bill'][ df['smoker'] == 'No' ],
                        legendgroup='No', scalegroup='No', name='No',
                        side='positive',
                        line_color='orange')
             )
fig.update_traces(meanline_visible=True)
fig.update_layout(violingap=0, violinmode='overlay')

fig.show()

在这里插入图片描述

4.2.5 高级小提琴图

import plotly.graph_objects as go

import pandas as pd
# 'https://raw.githubusercontent.com/plotly/datasets/master/violin_data.csv'
df = pd.read_csv("f:/violin_data.csv")
print(df)
'''
     total_bill   tip     sex smoker   day    time  size
0         16.99  1.01  Female     No   Sun  Dinner     2
1         10.34  1.66    Male     No   Sun  Dinner     3
2         21.01  3.50    Male     No   Sun  Dinner     3
3         23.68  3.31    Male     No   Sun  Dinner     2
4         24.59  3.61  Female     No   Sun  Dinner     4
..          ...   ...     ...    ...   ...     ...   ...
239       29.03  5.92    Male     No   Sat  Dinner     3
240       27.18  2.00  Female    Yes   Sat  Dinner     2
241       22.67  2.00    Male    Yes   Sat  Dinner     2
242       17.82  1.75    Male     No   Sat  Dinner     2
243       18.78  3.00  Female     No  Thur  Dinner     2

[244 rows x 7 columns]
'''

fig = go.Figure()

pointpos_male = [-0.9,-1.1,-0.6,-0.3]
pointpos_female = [0.45,0.55,1,0.4]
show_legend = [True,False,False,False]

fig = go.Figure()

for i in range(0,len(pd.unique(df['day']))):
    fig.add_trace(go.Violin(x=df['day'][(df['sex'] == 'Male') &
                                        (df['day'] == pd.unique(df['day'])[i])],
                            y=df['total_bill'][(df['sex'] == 'Male')&
                                               (df['day'] == pd.unique(df['day'])[i])],
                            legendgroup='M', scalegroup='M', name='M',
                            side='negative',
                            pointpos=pointpos_male[i], # 在哪里定位点
                            line_color='lightseagreen',
                            showlegend=show_legend[i])
             )
    fig.add_trace(go.Violin(x=df['day'][(df['sex'] == 'Female') &
                                        (df['day'] == pd.unique(df['day'])[i])],
                            y=df['total_bill'][(df['sex'] == 'Female')&
                                               (df['day'] == pd.unique(df['day'])[i])],
                            legendgroup='F', scalegroup='F', name='F',
                            side='positive',
                            pointpos=pointpos_female[i],
                            line_color='mediumpurple',
                            showlegend=show_legend[i])
             )

# 更新所有跟踪共享的特征
fig.update_traces(meanline_visible=True,
                  points='all', # 显示所有要点
                  jitter=0.05,  # 在点上添加一些抖动以获得更好的可见性
                  scalemode='count') # 用总计数缩放绘图区域
fig.update_layout(
    title_text="总账单分配<br><i>按每个性别的账单数量缩放",
    violingap=0, violingroupgap=0, violinmode='overlay')

fig.show()

在这里插入图片描述

4.2.6 脊线图

脊线图(以前称为 Joy Plot)显示了几个组的数值分布。它们可用于可视化分布随时间或空间的变化。

import plotly.graph_objects as go
from plotly.colors import n_colors
import numpy as np
np.random.seed(1)

# 12组正态分布的随机数据,平均值和标准差都在增加
data = (np.linspace(1, 2, 12)[:, np.newaxis] * np.random.randn(12, 200) +
            (np.arange(12) + 2 * np.random.random(12))[:, np.newaxis])

colors = n_colors('rgb(5, 200, 200)', 'rgb(200, 10, 10)', 12, colortype='rgb')

fig = go.Figure()
for data_line, color in zip(data, colors):
    fig.add_trace(go.Violin(x=data_line, line_color=color))

fig.update_traces(orientation='h', side='positive', width=3, points=False)
fig.update_layout(xaxis_showgrid=False, xaxis_zeroline=False)
fig.show()

在这里插入图片描述

4.2.7 只有点的小提琴图

条形图就像一个带有点的小提琴图,没有小提琴:

import plotly.express as px
df = px.data.tips()
print(df)
'''
     total_bill   tip     sex smoker   day    time  size
0         16.99  1.01  Female     No   Sun  Dinner     2
1         10.34  1.66    Male     No   Sun  Dinner     3
2         21.01  3.50    Male     No   Sun  Dinner     3
3         23.68  3.31    Male     No   Sun  Dinner     2
4         24.59  3.61  Female     No   Sun  Dinner     4
..          ...   ...     ...    ...   ...     ...   ...
239       29.03  5.92    Male     No   Sat  Dinner     3
240       27.18  2.00  Female    Yes   Sat  Dinner     2
241       22.67  2.00    Male    Yes   Sat  Dinner     2
242       17.82  1.75    Male     No   Sat  Dinner     2
243       18.78  3.00  Female     No  Thur  Dinner     2

[244 rows x 7 columns]
'''
fig = px.strip(df, x='day', y='tip')
fig.show()

在这里插入图片描述

4.2.8 Dash中的应用

import plotly.graph_objects as go # or plotly.express as px
fig = go.Figure() # or any Plotly Express function e.g. px.bar(...)
# fig.add_trace( ... )
# fig.update_layout( ... )

import dash
import dash_core_components as dcc
import dash_html_components as html

app = dash.Dash()
app.layout = html.Div([
    dcc.Graph(figure=fig)
])

app.run_server(debug=True, use_reloader=False)  # Turn off reloader if inside Jupyter
  • 2
    点赞
  • 22
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论
当使用seaborn绘制小提琴时,可以通过`sns.violinplot()`函数的参数来控制各个元素的样式和布局。下面是一些常用的参数: - `x`, `y`: 指定数据的横纵坐标,可以是DataFrame或Series中的列名,也可以是numpy数组。 - `hue`: 按照某个分类变量对数据进行分组,并用不同颜色的小提琴表示不同组别的数据。 - `data`: 指定数据源,可以是DataFrame或Series。 - `split`: 是否将小提琴分成两半,分别表示两个分类变量的数据。默认为False。 - `inner`: 小提琴内部的样式,可以是“box”,“quartile”,“point”和“stick”中的一种。默认为“box”,表示绘制小提琴的中位数和四分位数范围。 - `scale`: 小提琴的宽度缩放因子,可以是“area”,“count”,“width”中的一种。默认为“area”,表示根据样本数量自适应调整小提琴的宽度。 - `bw`: 控制内核密度估计的带宽大小。默认为"scott",可选值有"scott"、"silverman"和float类型的数值。 - `cut`: 控制小提琴的截断方式,可以是numpy.percentile的参数或者是一个浮点数。默认为None,表示不截断。 - `color`: 小提琴的颜色。 - `palette`: 用于绘制分类变量的小提琴的颜色调色板。 - `linewidth`: 小提琴边缘线宽度。 - `width`: 小提琴的宽度。 - `outer`: 是否在小提琴外部绘制观测值的分布。 - `inner_c`: 小提琴内部的颜色。 - `ax`: 用于绘制小提琴的matplotlib子对象。 使用这些参数可以灵活控制小提琴的样式和布局。例如,可以通过以下代码绘制一个带有两个分类变量和观测值散点图小提琴: ```python import seaborn as sns import matplotlib.pyplot as plt tips = sns.load_dataset("tips") sns.violinplot(x="day", y="total_bill", hue="sex", data=tips, split=True, inner="stick") sns.swarmplot(x="day", y="total_bill", hue="sex", data=tips, dodge=True, color=".2") plt.show() ``` 输出结果如下所示: ![image.png](attachment:image.png) 在这个例子中,我们使用了`tips`数据集中的`day`和`total_bill`两个变量,按照`sex`变量进行了分组,并使用`split=True`将小提琴分成了两半。另外,我们使用了`inner="stick"`将小提琴的内部样式设置为“stick”,同时使用`sns.swarmplot()`函数绘制观测值散点图,并使用`dodge=True`将散点图按照`hue`变量进行了分组。
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

岳涛@心馨电脑

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值