Matplotlib documentation: getting started with plotting, choosing a backend, and performance tuning

Latest recommended article published on 2024-07-12 10:04:36

AI路漫漫


Original source: https://matplotlib.org/tutorials/index.html


User guide

This document contains some basic usage patterns and best practices to help you get started with Matplotlib.

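A simple example

Matplotlib draws your data on Figures (windows, Jupyter widgets, and so on). Each Figure contains one or more Axes: an area where points can be specified in x-y coordinates, in θ-r for a polar plot, or in x-y-z for a 3D plot. The simplest way to create a Figure with an Axes is pyplot.subplots; you can then use Axes.plot to draw data on the Axes.

In other words, the Figure is the canvas, and each Axes is one subplot on that canvas.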

import matplotlib.pyplot as plt

fig, ax = plt.subplots()  # Create a figure containing a single Axes.
ax.plot([1, 2, 3, 4], [1, 4, 2, 3])  # Plot some data on that Axes.

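Many other plotting libraries and languages do not require you to create an Axes explicitly. In MATLAB, for example, you can simply write: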

plot([1, 2, 3, 4], [1, 4, 2, 3])  % MATLAB plot.

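You can in fact do the same in Matplotlib: pyplot has matching functions that plot directly on the "current" Axes, creating that Axes (and its parent Figure) if it does not exist yet.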

plt.plot([1, 2, 3, 4], [1, 4, 2, 3])  # Matplotlib plot.
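To make the idea of the "current" Axes more concrete, here is a minimal sketch that asks pyplot which Axes and Figure it created implicitly. The Agg backend and the variable names are assumptions chosen so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt

plt.close("all")  # start from a clean pyplot state
lines = plt.plot([1, 2, 3, 4], [1, 4, 2, 3])  # implicitly creates a Figure and an Axes

current_ax = plt.gca()   # the Axes pyplot is currently tracking
current_fig = plt.gcf()  # the Figure pyplot is currently tracking

# The line drawn by plt.plot lives on that implicit Axes and Figure.
same_axes = lines[0].axes is current_ax
same_figure = current_ax.figure is current_fig
print(same_axes, same_figure)
```

From here on you could keep using current_ax in the object-oriented style if you prefer.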

Parts of a Figure

Now let's take a closer look at the components of a Matplotlib figure.

The Figure is the whole image. It holds all of its child Axes plus a few figure-level items such as titles and labels.

fig = plt.figure()  # an empty figure with no Axes
fig, ax = plt.subplots()  # a figure with a single Axes
fig, axs = plt.subplots(2, 2)  # a figure with a 2x2 grid of Axes

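It is convenient to create the Axes together with the Figure, but you can also add Axes later, which allows for more complex layouts.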
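As a rough illustration of adding Axes after the Figure exists, the sketch below uses Figure.add_subplot and Figure.add_axes. The particular layout, the variable names, and the Agg backend are assumptions made for the example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt

fig = plt.figure()                     # start with an empty figure
ax_top = fig.add_subplot(2, 1, 1)      # top row of a 2x1 grid
ax_bottom = fig.add_subplot(2, 2, 3)   # bottom-left cell of a 2x2 grid
# Free placement in figure coordinates: [left, bottom, width, height]
ax_free = fig.add_axes([0.6, 0.15, 0.3, 0.25])

ax_top.plot([1, 2, 3, 4], [1, 4, 2, 3])
n_axes = len(fig.axes)
print(n_axes)
```

Mixing grids of different shapes, as done here, is one simple way to get an uneven layout without extra tools.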

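An Axes is the region of the image where data is drawn. A Figure can contain many Axes, but a given Axes can belong to only one Figure. An Axes contains two (or, in 3D, three) Axis objects (note the difference between Axes and Axis), which take care of the data limits. The limits can also be set through axes.Axes.set_xlim() and axes.Axes.set_ylim(). Each Axes has a title (set with set_title()), an x-label (set with set_xlabel()), and a y-label (set with set_ylabel()).

The Axes class and its member functions are the main entry point to the OO (object-oriented) interface.

Axis: sets the limits of the graph and generates the ticks and tick labels. Tick positions are determined by a Locator object, and tick label strings are formatted by a Formatter. The right combination of Locator and Formatter gives fine control over tick positions and labels.
Artist: everything visible on the figure is an Artist, including Text objects, Line2D objects, collections objects, Patch objects, and so on. When the figure is rendered, all artists are drawn to the canvas. Most Artists are tied to an Axes; such an Artist cannot be shared by several Axes or moved from one Axes to another.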
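The Locator and Formatter combination can be sketched as follows, using MultipleLocator and FormatStrFormatter from matplotlib.ticker. The data, the 0.5 tick spacing, and the Agg backend are assumptions picked for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
from matplotlib.ticker import FormatStrFormatter, MultipleLocator

fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3, 4], [0, 1, 4, 9, 16])
ax.set_xlim(0, 4)

# Locator: place a major tick at every multiple of 0.5 on the x axis.
ax.xaxis.set_major_locator(MultipleLocator(0.5))
# Formatter: render each tick value with one decimal place.
ax.xaxis.set_major_formatter(FormatStrFormatter("%.1f"))

fig.canvas.draw()  # render once so the tick labels are populated
tick_labels = [label.get_text() for label in ax.get_xticklabels()]
print(tick_labels)
```

The list may also contain labels for ticks just outside the visible range, since the locator can generate a little padding.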

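Types of inputs to plotting functions

All plotting functions expect numpy.array or numpy.ma.masked_array as input. Array-like classes such as pandas data objects and numpy.matrix may or may not work as intended, so it is best to convert them to numpy.array objects before plotting.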

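import numpy as np
import pandas

a = pandas.DataFrame(np.random.rand(4, 5), columns=list('abcde'))
a_asarray = a.values  # convert the DataFrame to a numpy array

b = np.matrix([[1, 2], [3, 4]])
b_asarray = np.asarray(b)  # convert the matrix to a plain ndarray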
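Masked arrays are the other accepted input type mentioned above. A small sketch, assuming made-up sine data and an arbitrary cutoff of -0.5, shows how masked samples are simply left out of the line:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 50)
y = np.sin(x)
# Mask (hide) every sample below the illustrative cutoff of -0.5.
y_masked = np.ma.masked_where(y < -0.5, y)

fig, ax = plt.subplots()
ax.plot(x, y_masked)  # the line is interrupted wherever values are masked

n_hidden = np.ma.count_masked(y_masked)
print(n_hidden, "of", len(x), "samples are masked")
```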

The object-oriented interface and the pyplot interface

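There are essentially two ways to use Matplotlib: the object-oriented interface and the pyplot interface.

Explicitly create Figures and Axes, and call methods on them (the object-oriented style).
Rely on pyplot to create and manage the Figures and Axes automatically, and use pyplot functions for plotting.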

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2, 100)

# OO style. Note that even here we still use pyplot to create the figure.
fig, ax = plt.subplots()  # Create a figure with a single Axes.
ax.plot(x, x, label='linear')  # Plot some data on the axes.
ax.plot(x, x**2, label='quadratic')  # Plot more data on the axes...
ax.plot(x, x**3, label='cubic')  # ... and some more.
ax.set_xlabel('x label')  # Add an x-label to the axes.
ax.set_ylabel('y label')  # Add a y-label to the axes.
ax.set_title("Simple Plot")  # Add a title to the axes.
ax.legend()  # Add a legend.



# pyplot style: the same plot, drawn on the implicit current Axes.
plt.plot(x, x, label='linear')  # Plot some data on the (implicit) Axes.
plt.plot(x, x**2, label='quadratic')  # etc.
plt.plot(x, x**3, label='cubic')
plt.xlabel('x label')
plt.ylabel('y label')
plt.title("Simple Plot")
plt.legend()

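There is also a third approach, used when embedding Matplotlib in a GUI application (the link is left out here). As for the two styles above, pick one and stick to it rather than mixing them. The general suggestion is to use the pyplot style for interactive plotting (for example in Jupyter) and the object-oriented style for non-interactive plotting (functions and scripts that are part of a larger project).

When you need to draw the same kind of plot over and over with different data, it is worth writing a dedicated function for it: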

def my_plotter(ax, data1, data2, param_dict):
    """
    A helper function to make a graph

    Parameters
    ----------
    ax : Axes
        The axes to draw to

    data1 : array
       The x data

    data2 : array
       The y data

    param_dict : dict
       Dictionary of kwargs to pass to ax.plot

    Returns
    -------
    out : list
        list of artists added
    """
    out = ax.plot(data1, data2, **param_dict)
    return out

# Use the helper on a single Axes.
data1, data2, data3, data4 = np.random.randn(4, 100)
fig, ax = plt.subplots(1, 1)
my_plotter(ax, data1, data2, {'marker': 'x'})

# Or use it on two subplots side by side.
fig, (ax1, ax2) = plt.subplots(1, 2)
my_plotter(ax1, data1, data2, {'marker': 'x'})
my_plotter(ax2, data3, data4, {'marker': 'o'})

For examples this simple the helper may look like overkill, but it pays off once the plots get more complex.

Backends

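What is a backend? Matplotlib serves many different use cases and output formats. Some people use it interactively from the Python shell and have plot windows pop up as they type commands. Some use Jupyter notebooks for quick data analysis. Others embed it in graphical user interfaces to build rich applications. To support all of these, Matplotlib can target different outputs, and each of these capabilities is called a backend. The frontend is the user-facing code, that is, the plotting code, while the backend does the hard work behind the scenes to produce the figure.

There are two types of backends: user interface backends (pygtk, tkinter, qt4; also called interactive backends) and hardcopy backends that produce image files (PNG, SVG, PDF, PS; also called non-interactive backends).

There are three ways to select a backend:

1. The rcParams["backend"] parameter in the matplotlibrc file (default: agg)
2. The MPLBACKEND environment variable
3. The matplotlib.use() function (this one takes precedence over the others)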

# matplotlibrc file
backend : qt5agg   # use pyqt5 with antigrain (agg) rendering

# Windows: the environment variable overrides the setting in the matplotlibrc file.
> set MPLBACKEND=qt5agg
> python simple_plot.py

# Linux: set the environment variable for the whole shell, or for a single script only.
> export MPLBACKEND=qt5agg
> python simple_plot.py

> MPLBACKEND=qt5agg python simple_plot.py

import matplotlib    # call use() before creating any figure
matplotlib.use('qt5agg')

# In Jupyter, use the %matplotlib magic with a backend name such as qt5, tk, or wx:
%matplotlib qt5

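Built-in backends: you usually do not need to worry about these. If you hit a backend error on Linux, installing the python-tk package may help. %matplotlib notebook embeds an interactive figure (with zooming, panning, and so on) in a Jupyter notebook, while %matplotlib tk shows figures in a separate Tk window.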
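To check which backend actually ended up active, a short sketch along these lines may help. It assumes the non-interactive Agg backend (so it runs on a machine without a display) and writes the PNG to an in-memory buffer instead of a file.

```python
import io

import matplotlib
matplotlib.use("Agg")  # select the backend before creating any figure
import matplotlib.pyplot as plt

active_backend = matplotlib.get_backend()  # name of the backend now in use
print(active_backend)

fig, ax = plt.subplots()
ax.plot([1, 2, 3, 4], [1, 4, 2, 3])

# A hardcopy backend produces image data rather than opening a window.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
print(len(png_bytes), "bytes of PNG data")
```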

Interactive mode

Interactive mode can be set with matplotlib.interactive(), and its current value can be queried with matplotlib.is_interactive(). It can also be turned on with matplotlib.pyplot.ion() and turned off with matplotlib.pyplot.ioff().
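The relationship between these four functions can be sketched as below. The Agg backend and the variable names are assumptions; no window is opened, the snippet only inspects the interactive flag.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt

plt.ion()                                      # turn interactive mode on
state_after_ion = matplotlib.is_interactive()

plt.ioff()                                     # turn it off again
state_after_ioff = matplotlib.is_interactive()

matplotlib.interactive(True)                   # set the same flag directly
state_after_set = matplotlib.is_interactive()
matplotlib.interactive(False)                  # restore the non-interactive state

print(state_after_ion, state_after_ioff, state_after_set)
```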

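import matplotlib.pyplot as plt
import numpy as np

plt.ion()
plt.plot([1.6, 2.7])
# A window pops up at this point, and the command line still accepts input.
plt.title("interactive test")
plt.xlabel("index")
# Switching to the object-oriented interface updates the same window.
ax = plt.gca()
ax.plot([3.1, 2.2])
# If no new line appears, your Matplotlib version may be too old or you may be
# using a backend such as macosx; in that case force a redraw explicitly.
plt.draw()

plt.ioff()   # switch to non-interactive mode
plt.plot([1.6, 2.7])
plt.show()    # displays the figure; the command line blocks until the window is closed

plt.ioff()
for i in range(3):
    plt.plot(np.random.rand(10))
    plt.show()   # show() can be called several times in one script, one figure at a time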

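Performance

Whether you are exploring data or programmatically saving a large number of plots, rendering performance can become a bottleneck. Matplotlib offers a few ways to reduce rendering time at the cost of a slight change in the plot's appearance. Which methods are available depends on the type of plot being created.

Line segment simplification

For plots made of line segments (typical line plots, polygon outlines, and so on), rendering performance can be controlled with rcParams["path.simplify"] (default: True) and rcParams["path.simplify_threshold"] (default: 0.111111111111). Both can also be set in the matplotlibrc file. rcParams["path.simplify"] is a boolean that controls whether line segments are simplified at all. rcParams["path.simplify_threshold"] controls how much they are simplified; a higher threshold renders faster.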

import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl

# Setup, and create the data to plot
y = np.random.rand(100000)
y[50000:] *= 2
y[np.geomspace(10, 50000, 400).astype(int)] = -1  # geomspace is like logspace, but the endpoints are given directly
mpl.rcParams['path.simplify'] = True

mpl.rcParams['path.simplify_threshold'] = 0.0
plt.plot(y)
plt.show()

mpl.rcParams['path.simplify_threshold'] = 1.0    # the threshold ranges from 0 to 1
plt.plot(y)
plt.show()

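Honestly, it is hard to see any difference between the two plots. The simplification works by iteratively merging line segments into a single vector until the next segment's perpendicular distance to that vector (measured in display coordinates) exceeds the threshold, at which point merging stops.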
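Since the visual difference is small, timing the render is one way to see what the threshold does. The sketch below is only a rough measurement under assumed conditions: the Agg backend, random data, and a hypothetical helper named time_draw. Actual numbers depend on the machine and the data.

```python
import time

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
y = rng.random(100_000)

def time_draw(threshold):
    """Return the seconds spent rendering y with the given simplify threshold."""
    settings = {"path.simplify": True, "path.simplify_threshold": threshold}
    with matplotlib.rc_context(settings):
        fig, ax = plt.subplots()
        ax.plot(y)
        start = time.perf_counter()
        fig.canvas.draw()  # force a full render instead of calling plt.show()
        elapsed = time.perf_counter() - start
        plt.close(fig)
    return elapsed

timings = {threshold: time_draw(threshold) for threshold in (0.0, 1.0)}
for threshold, seconds in timings.items():
    print(f"threshold={threshold}: {seconds:.3f} s")
```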

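Marker simplification

Markers can be simplified too, but only for Line2D objects, through the markevery parameter of matplotlib.pyplot.plot() and matplotlib.axes.Axes.plot(). This parameter either subsamples the markers naively or attempts evenly spaced sampling (along the x axis).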

plt.plot(x, y, markevery=10)
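A slightly fuller sketch of markevery, with assumed sine data and the Agg backend, contrasts an integer value (plain subsampling) with a float value (markers spaced by visual distance):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 1000)
y = np.sin(x)

fig, ax = plt.subplots()
# Integer: draw a marker on every 100th data point.
(line_int,) = ax.plot(x, y, marker="o", markevery=100)
# Float: space the markers at roughly equal visual distances along the line.
(line_float,) = ax.plot(x, y + 1, marker="s", markevery=0.1)

fig.canvas.draw()  # render once to confirm both settings draw without error
print(line_int.get_markevery(), line_float.get_markevery())
```

The line itself is still drawn through all 1000 points; only the markers are thinned out.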

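Splitting lines into smaller chunks

If you are using the Agg backend, you can use rcParams["agg.path.chunksize"] (default: 0) to specify a chunk size. Any line with more vertices than the chunk size is split into several lines. For some kinds of data, splitting lines into reasonably sized chunks can greatly reduce rendering time.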

mpl.rcParams['agg.path.chunksize'] = 0        # 0 disables chunking (the default)
mpl.rcParams['agg.path.chunksize'] = 10000    # split long lines into chunks; the output looks the same
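The setting only matters while a figure is being rendered, so a complete sketch needs a draw or save step. The version below assumes the Agg backend, random data, and an in-memory buffer, and uses rc_context so the global default stays untouched.

```python
import io

import matplotlib
matplotlib.use("Agg")  # chunking applies to the Agg backend
import matplotlib.pyplot as plt
import numpy as np

y = np.random.default_rng(0).random(200_000)

# Apply the chunk size only inside this block.
with matplotlib.rc_context({"agg.path.chunksize": 10000}):
    fig, ax = plt.subplots()
    ax.plot(y)
    buffer = io.BytesIO()
    fig.savefig(buffer, format="png")  # rendering happens here
    plt.close(fig)

png_bytes = buffer.getvalue()
print(len(png_bytes), "bytes rendered")
```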

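A more convenient way

The fast style automatically sets the simplification threshold and the chunking parameters to speed up plotting large amounts of data. Handy, isn't it?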

import matplotlib.style as mplstyle
mplstyle.use('fast')
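To see which values the fast style applies, one option is to read the relevant rcParams inside a temporary style context. This is a minimal sketch; the list of keys is an assumption based on the parameters discussed in this section.

```python
import matplotlib as mpl
mpl.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.style as mplstyle

keys = ("path.simplify", "path.simplify_threshold", "agg.path.chunksize")

# style.context applies the style temporarily and restores the old values afterwards.
with mplstyle.context("fast"):
    fast_settings = {key: mpl.rcParams[key] for key in keys}

for key, value in fast_settings.items():
    print(f"{key} = {value}")
```

Using the context manager instead of mplstyle.use keeps the change from leaking into the rest of a script.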
