Altair: a powerful visualization library for Python

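Today I introduce Altair, a Python library whose syntax is somewhat similar to R's ggplot.

It also handles Chinese text well. Let's take a simple scatter plot as an example:

Installation: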

pip install altair
pip install vega-datasets  # note: the package name uses "-", not "_"; we will use its datasets
import altair as alt
from vega_datasets import data
cars = data.cars()
cars

alt.Chart(cars).mark_point().encode(
    x='Horsepower',
    y='Miles_per_Gallon',
    color='Origin',
    shape='Origin'
).interactive()

This produces the chart below. Clicking the three dots next to it lets you save the chart as an image in several formats.

image-20221214133847325

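As you can see, the syntax is extremely simple:

  • cars is the data we need; it is a pandas DataFrame

  • mark_point() draws a scatter plot

  • x='Horsepower' and y='Miles_per_Gallon' map these columns to the x-axis and y-axis

  • color='Origin' maps color to the country of origin, much like ggplot's syntax

  • shape='Origin' maps the point shape to the country of origin

  • interactive() makes the chart interactive, so it can be panned and zoomed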


I. Drawing some simple charts

(1). Bar chart

The syntax is simple:

import altair as alt
import pandas as pd

source = pd.DataFrame({
    'a': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I'],
    'b': [28, 55, 43, 91, 81, 53, 19, 87, 52]
})

alt.Chart(source).mark_bar().encode(
    x='a',
    y='b',
    color="a"
)

image-20221214140900377

1. We can also highlight a single bar and give all the other bars the same color:

import altair as alt
import pandas as pd

source = pd.DataFrame({
    'a': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I'],
    'b': [28, 55, 43, 91, 81, 53, 19, 87, 52]
})

alt.Chart(source).mark_bar().encode(
    x='a:O',
    y='b:Q',
    color=alt.condition(
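        alt.datum.a == "A",  # the condition: field a equals "A"; to adapt it, change only the field (a) and the value ("A")
        alt.value("red"),  # color used when the condition holds
        alt.value("yellow")  # color used otherwise
    )
).properties(width=600, height=400)  # width and height set the chart size in pixels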

image-20221214153017888

2. Flip the chart and add text labels on the bars

Well, flipping the chart really just means swapping the x and y encodings.

import altair as alt
import pandas as pd

source = pd.DataFrame({
    'a': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I'],
    'b': [28, 55, 43, 91, 81, 53, 19, 87, 52]
})

bars = alt.Chart(source).mark_bar().encode(
    x='b:Q',
    y='a:O',
    color="a")
text = bars.mark_text(
    align='right',  # one of 'left', 'center', 'right'
    baseline='middle',
    dx=10  # shift the text 10 pixels to the right
).encode(
    text='a'  # field shown as the label; use 'b:Q' to show the values instead
)
bars + text

image-20221214154018862

3. Add a line to the chart

import altair as alt
import pandas as pd

source = pd.DataFrame({
    'a': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I'],
    'b': [28, 55, 43, 91, 81, 53, 19, 87, 52]
})

bars = alt.Chart(source).mark_bar().encode(
    x='a',
    y='b',
    color="a")

# Horizontal rule drawn at the mean of column b
rule = alt.Chart(source).mark_rule(color='red').encode(
    y='mean(b)',
)
(bars + rule).properties(width=600, height=400)

image-20221214155142287

4. Combined chart: bar chart + line chart

First we define a base chart that fixes the shared x-axis:

import altair as alt
from vega_datasets import data
import pandas as pd

source = pd.DataFrame({
    'a': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I'],
    'b': [28, 55, 43, 91, 81, 53, 19, 87, 52]
})
base = alt.Chart(source).encode(x='a:O')

bar = base.mark_bar().encode(y='b:Q')

line =  base.mark_line(color='red').encode(
    y='b:Q'
)

(bar + line).properties(width=600)

image-20221214155933379

(2). Heatmap

import altair as alt
import numpy as np
import pandas as pd

# Compute x^2 + y^2 across a 2D grid
x, y = np.meshgrid(range(-5, 5), range(-5, 5))
z = x ** 2 + y ** 2

# Convert this grid to columnar data expected by Altair
source = pd.DataFrame({'x': x.ravel(),
                     'y': y.ravel(),
                     'z': z.ravel()})

alt.Chart(source).mark_rect().encode(
    x='x:O',
    y='y:O',
    color='z:Q'
)

image-20221214141345469

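(3). Histogram

A histogram counts how many values fall into each range (bin).

Here we again use the cars dataset from the beginning as the example: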

import altair as alt
from vega_datasets import data
cars = data.cars()
cars
alt.Chart(cars).mark_bar().encode(
    alt.X("Displacement", bin=True),
    y='count()',
    color="Origin"
)

image-20221214142326999

(4). Line chart

It can be used to plot function curves, for example:
$$y = \sin\left(\frac{x}{5}\right)$$

import altair as alt
import numpy as np
import pandas as pd

x = np.arange(100)
source = pd.DataFrame({
  'x': x,
  'f(x)': np.sin(x / 5)
})

alt.Chart(source).mark_line().encode(
    x='x',
    y='f(x)'
)

image-20221214142546699

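(5). Scatter plot with tooltips

When you hover over a point, the chart shows information about it, such as its coordinates.

For example, the code below sets tooltip, so hovering over a point displays the car's name, origin, horsepower, and miles per gallon.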

import altair as alt
from vega_datasets import data

source = data.cars()

alt.Chart(source).mark_circle(size=60).encode(
    x='Horsepower',
    y='Miles_per_Gallon',
    color='Origin',
    tooltip=['Name', 'Origin', 'Horsepower', 'Miles_per_Gallon']
).interactive()


(6). Stacked area chart

In the code below, x is the year and y is the net electricity generation from each energy source.

import altair as alt
from vega_datasets import data

source = data.iowa_electricity()
source
alt.Chart(source).mark_area().encode(
    x="year:T",
    y="net_generation:Q",
    color="source:N"
)

image-20221214143550277

(7). Donut chart

import pandas as pd
import altair as alt

source = pd.DataFrame({"category": [1, 2, 3, 4, 5, 6], "value": [4, 6, 10, 3, 7, 8]})

alt.Chart(source).mark_arc(innerRadius=50).encode(
    theta=alt.Theta(field="value", type="quantitative"),
    color=alt.Color(field="category", type="nominal"),
)

image-20221214161547967

II. Advanced usage

1. Line charts

1. A line chart with a 95% confidence interval band.

## With a confidence interval band
import altair as alt
from vega_datasets import data

source = data.cars()

line = alt.Chart(source).mark_line().encode(
    x='Year',
    y='mean(Miles_per_Gallon)'
)

band = alt.Chart(source).mark_errorband(extent='ci').encode(
    x='Year',
    y=alt.Y('Miles_per_Gallon', title='Miles/Gallon'),
)

band + line

image-20221214160510796

2. Line chart with point markers

# Line chart with point markers
import altair as alt
import numpy as np
import pandas as pd

x = np.arange(100)
source = pd.DataFrame({
  'x': x,
  'f(x)': np.sin(x / 5)
})

alt.Chart(source).mark_line(
    point=alt.OverlayMarkDef(color="red")
).encode(
    x='x',
    y='f(x)'
)

image-20221214160756661

3. Varying the line thickness along the line

# The line thickness varies with the data value
import altair as alt
from vega_datasets import data

source = data.wheat()

alt.Chart(source).mark_trail().encode(
    x='year:T',
    y='wheat:Q',
    size='wheat:Q'
)

image-20221214161027315

2. Normalized stacked area chart

The difference is that the areas are rescaled to fill the full height of the plot.

image-20221214161332328

import altair as alt
from vega_datasets import data

source = data.iowa_electricity()

alt.Chart(source).mark_area().encode(
    x="year:T",
    y=alt.Y("net_generation:Q", stack="normalize"),
    color="source:N"
)
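
stack="normalize" rescales the stacked values at each x position so that they sum to 1. The arithmetic can be sketched with pandas on a small table; the numbers below are invented and are not taken from iowa_electricity:

```python
import pandas as pd

# Invented numbers for illustration only.
df = pd.DataFrame({
    'year': [2001, 2001, 2002, 2002],
    'source': ['Fossil Fuels', 'Renewables', 'Fossil Fuels', 'Renewables'],
    'net_generation': [30, 10, 24, 16],
})

# Share of each source within its year: value / total for that year.
totals = df.groupby('year')['net_generation'].transform('sum')
df['share'] = df['net_generation'] / totals

# Within every year the shares add up to 1, which is why the
# normalized chart always fills the whole vertical range.
share_sums = df.groupby('year')['share'].sum()
```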

3. Pie chart with a missing wedge

import numpy as np
import altair as alt

alt.Chart().mark_arc(color="gold").encode(
    theta=alt.datum((5 / 8) * np.pi, scale=None),
    theta2=alt.datum((19 / 8) * np.pi),
    radius=alt.datum(100, scale=None),
)

image-20221214161654529

1. Pie chart

import pandas as pd
import altair as alt

source = pd.DataFrame({"category": [1, 2, 3, 4, 5, 6], "value": [4, 6, 10, 3, 7, 8]})

alt.Chart(source).mark_arc().encode(
    theta=alt.Theta(field="value", type="quantitative"),
    color=alt.Color(field="category", type="nominal"),
)

image-20221214161804227

2. Radial chart

import pandas as pd
import altair as alt

source = pd.DataFrame({"values": [12, 23, 47, 6, 52, 19]})

base = alt.Chart(source).encode(
    theta=alt.Theta("values:Q", stack=True),
    radius=alt.Radius("values", scale=alt.Scale(type="sqrt", zero=True, rangeMin=20)),
    color="values:N",
)

c1 = base.mark_arc(innerRadius=20, stroke="#fff")

c2 = base.mark_text(radiusOffset=10).encode(text="values:Q")

c1 + c2

image-20221214162318304

4. Advanced scatter plots

1. Scatter plot with error bars

import altair as alt
import pandas as pd
import numpy as np

# generate some data points with uncertainties
np.random.seed(0)
x = [1, 2, 3, 4, 5]
y = np.random.normal(10, 0.5, size=len(x))
yerr = 0.2

# set up data frame
source = pd.DataFrame({"x": x, "y": y, "yerr": yerr})

# the base chart
base = alt.Chart(source).transform_calculate(
    ymin="datum.y-datum.yerr",
    ymax="datum.y+datum.yerr"
)

# generate the points
points = base.mark_point(
    filled=True,
    size=50,
    color='black'
).encode(
    x=alt.X('x', scale=alt.Scale(domain=(0, 6))),
    y=alt.Y('y', scale=alt.Scale(zero=False))
)

# generate the error bars
errorbars = base.mark_errorbar().encode(
    x="x",
    y="ymin:Q",
    y2="ymax:Q"
)

points + errorbars

image-20221214162544140

2. Scatter plot with labels

# Scatter plot with labels
import altair as alt
import pandas as pd

source = pd.DataFrame({
    'x': [1, 3, 5, 7, 9],
    'y': [1, 3, 5, 7, 9],
    'label': ['我', '是', '你', '爸', '爸']
})

points = alt.Chart(source).mark_point().encode(
    x='x:Q',
    y='y:Q'
)

text = points.mark_text(
    align='left',
    baseline='middle',
    dx=7
).encode(
    text='label'
)

points + text

image-20221214170203065

5. World map

import altair as alt
from vega_datasets import data

# Data generators for the background
sphere = alt.sphere()
graticule = alt.graticule()

# Source of land data
source = alt.topo_feature(data.world_110m.url, 'countries')

# Layering and configuring the components
alt.layer(
    alt.Chart(sphere).mark_geoshape(fill='lightblue'),
    alt.Chart(graticule).mark_geoshape(stroke='white', strokeWidth=0.5),
    alt.Chart(source).mark_geoshape(fill='ForestGreen', stroke='black')
).project(
    'naturalEarth1'
).properties(width=600, height=400).configure_view(stroke=None)

image-20221214170416101

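III. Saving charts

You can save a chart as SVG, PNG, HTML, PDF, JSON, and other formats. Note that PNG, SVG, and PDF export relies on an additional package, such as vl-convert-python (recent Altair releases) or altair_saver (older releases).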

import altair as alt
from vega_datasets import data

chart = alt.Chart(data.cars.url).mark_point().encode(
    x='Horsepower:Q',
    y='Miles_per_Gallon:Q',
    color='Origin:N'
)

chart.save('chart.json')
chart.save('chart.html')
chart.save('chart.png')
chart.save('chart.svg')
chart.save('chart.pdf')	

You can also scale the size of the saved image:

chart.save('chart.png', scale_factor=2.0)

IV. Configuring chart properties

For example, adding a title to the chart:

# World map
import altair as alt
from vega_datasets import data

# Data generators for the background
sphere = alt.sphere()
graticule = alt.graticule()

# Source of land data
source = alt.topo_feature(data.world_110m.url, 'countries')

# Layering and configuring the components
alt.layer(
    alt.Chart(sphere).mark_geoshape(fill='lightblue'),
    alt.Chart(graticule).mark_geoshape(stroke='white', strokeWidth=0.5),
    alt.Chart(source).mark_geoshape(fill='ForestGreen', stroke='black')
).project(
    'naturalEarth1'
).properties(width=600, height=400,title="世界地图").configure_view(stroke=None)

image-20221214171613109

| Property | Type | Description |
| --- | --- | --- |
| arc | RectConfig | Arc-specific Config |
| area | AreaConfig | Area-Specific Config |
| aria | boolean | A boolean flag indicating if ARIA default attributes should be included for marks and guides (SVG output only). If false, the "aria-hidden" attribute will be set for all guides, removing them from the ARIA accessibility tree and Vega-Lite will not generate default descriptions for marks. Default value: true. |
| autosize | anyOf(AutosizeType, AutoSizeParams) | How the visualization size should be determined. If a string, should be one of "pad", "fit" or "none". Object values can additionally specify parameters for content sizing and automatic resizing. Default value: pad |
| axis | AxisConfig | Axis configuration, which determines default properties for all x and y axes. For a full list of axis configuration options, please see the corresponding section of the axis documentation. |
| axisBand | AxisConfig | Config for axes with "band" scales. |
| axisBottom | AxisConfig | Config for x-axis along the bottom edge of the chart. |
| axisDiscrete | AxisConfig | Config for axes with "point" or "band" scales. |
| axisLeft | AxisConfig | Config for y-axis along the left edge of the chart. |
| axisPoint | AxisConfig | Config for axes with "point" scales. |
| axisQuantitative | AxisConfig | Config for quantitative axes. |
| axisRight | AxisConfig | Config for y-axis along the right edge of the chart. |
| axisTemporal | AxisConfig | Config for temporal axes. |
| axisTop | AxisConfig | Config for x-axis along the top edge of the chart. |
| axisX | AxisConfig | X-axis specific config. |
| axisXBand | AxisConfig | Config for x-axes with "band" scales. |
| axisXDiscrete | AxisConfig | Config for x-axes with "point" or "band" scales. |
| axisXPoint | AxisConfig | Config for x-axes with "point" scales. |
| axisXQuantitative | AxisConfig | Config for x-quantitative axes. |
| axisXTemporal | AxisConfig | Config for x-temporal axes. |
| axisY | AxisConfig | Y-axis specific config. |
| axisYBand | AxisConfig | Config for y-axes with "band" scales. |
| axisYDiscrete | AxisConfig | Config for y-axes with "point" or "band" scales. |
| axisYPoint | AxisConfig | Config for y-axes with "point" scales. |
| axisYQuantitative | AxisConfig | Config for y-quantitative axes. |
| axisYTemporal | AxisConfig | Config for y-temporal axes. |
| background | anyOf(Color, ExprRef) | CSS color property to use as the background of the entire view. Default value: "white" |
| bar | BarConfig | Bar-Specific Config |
| boxplot | BoxPlotConfig | Box Config |
| circle | MarkConfig | Circle-Specific Config |
| concat | CompositionConfig | Default configuration for all concatenation and repeat view composition operators (concat, hconcat, vconcat, and repeat) |
| countTitle | string | Default axis and legend title for count fields. Default value: 'Count of Records'. |
| customFormatTypes | boolean | Allow the formatType property for text marks and guides to accept a custom formatter function registered as a Vega expression. |
| errorband | ErrorBandConfig | ErrorBand Config |
| errorbar | ErrorBarConfig | ErrorBar Config |
| facet | CompositionConfig | Default configuration for the facet view composition operator |
| fieldTitle | ["verbal", "functional", "plain"] | Defines how Vega-Lite generates title for fields. There are three possible styles: "verbal" (default) displays the function in a verbal style (e.g., "Sum of field", "Year-month of date", "field (binned)"); "functional" displays the function using parentheses and capitalized text (e.g., "SUM(field)", "YEARMONTH(date)", "BIN(field)"); "plain" displays only the field name without functions (e.g., "field", "date", "field"). |
| font | string | Default font for all text marks, titles, and labels. |
| geoshape | MarkConfig | Geoshape-Specific Config |
| header | HeaderConfig | Header configuration, which determines default properties for all headers. For a full list of header configuration options, please see the corresponding section of the header documentation. |
| headerColumn | HeaderConfig | Header configuration, which determines default properties for column headers. For a full list of header configuration options, please see the corresponding section of the header documentation. |
| headerFacet | HeaderConfig | Header configuration, which determines default properties for non-row/column facet headers. For a full list of header configuration options, please see the corresponding section of the header documentation. |
| headerRow | HeaderConfig | Header configuration, which determines default properties for row headers. For a full list of header configuration options, please see the corresponding section of the header documentation. |
| image | RectConfig | Image-specific Config |
| legend | LegendConfig | Legend configuration, which determines default properties for all legends. For a full list of legend configuration options, please see the corresponding section of the legend documentation. |
| line | LineConfig | Line-Specific Config |
| lineBreak | anyOf(string, ExprRef) | A delimiter, such as a newline character, upon which to break text strings into multiple lines. This property provides a global default for text marks, which is overridden by mark or style config settings, and by the lineBreak mark encoding channel. If signal-valued, either string or regular expression (regexp) values are valid. |
| mark | MarkConfig | Mark Config |
| numberFormat | string | D3 Number format for guide labels and text marks. For example "s" for SI units. Use D3's number format pattern. |
| padding | anyOf(Padding, ExprRef) | The default visualization padding, in pixels, from the edge of the visualization canvas to the data rectangle. If a number, specifies padding for all sides. If an object, the value should have the format {"left": 5, "top": 5, "right": 5, "bottom": 5} to specify padding for each side of the visualization. Default value: 5 |
| params | array(Parameter) | Dynamic variables that parameterize a visualization. |
| point | MarkConfig | Point-Specific Config |
| projection | ProjectionConfig | Projection configuration, which determines default properties for all projections. For a full list of projection configuration options, please see the corresponding section of the projection documentation. |
| range | RangeConfig | An object hash that defines default range arrays or schemes for using with scales. For a full list of scale range configuration options, please see the corresponding section of the scale documentation. |
| rect | RectConfig | Rect-Specific Config |
| rule | MarkConfig | Rule-Specific Config |
| scale | ScaleConfig | Scale configuration determines default properties for all scales. For a full list of scale configuration options, please see the corresponding section of the scale documentation. |
| selection | SelectionConfig | An object hash for defining default properties for each type of selections. |
| square | MarkConfig | Square-Specific Config |
| style | StyleConfigIndex | An object hash that defines key-value mappings to determine default properties for marks with a given style. The keys represent styles names; the values have to be valid mark configuration objects. |
| text | MarkConfig | Text-Specific Config |
| tick | TickConfig | Tick-Specific Config |
| timeFormat | string | Default time format for raw time values (without time units) in text marks, legend labels and header labels. Default value: "%b %d, %Y". Note: axes determine the format for each label automatically, so this config does not affect axes. |
| title | TitleConfig | Title configuration, which determines default properties for all titles. For a full list of title configuration options, please see the corresponding section of the title documentation. |
| trail | LineConfig | Trail-Specific Config |
| view | ViewConfig | Default properties for single view plots. |

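Pros and cons

Pros: simple syntax, good support for Chinese text, and a strong resemblance to R's ggplot.

Cons: a generated chart cannot be copied directly; it has to be saved locally first, which is less convenient than matplotlib.

If you are interested in exploring further: click this link

Here are a few more charts: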

image-20221214172125223

image-20221214172148674

image-20221214172159688

Reference: for more examples, see https://altair-viz.github.io/gallery/index.html
