Tensor transformations in PyTorch with einops

阿委困的不能行

Tags: pytorch, deep learning, python

Original link: https://blog.csdn.net/csdn_yi_e/article/details/109143580

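This article introduces the einops library, which uses concise tensor operations to help developers write more semantic, readable, framework-portable code. Worked examples show how to use einops for tensor rearrangement, reduction, and layer design, and highlight its benefits for code clarity, shape checking, and consistency.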

einops

Flexible and powerful tensor operations for readable and reliable code.
Supports numpy, pytorch, tensorflow, and other frameworks.

In case you need convincing arguments for setting aside time to learn about einsum and einops…
Tim Rocktäschel, FAIR

Writing better code with PyTorch and einops 👌
Andrej Karpathy, AI at Tesla

Slowly but surely, einops is seeping in to every nook and cranny of my code. If you find yourself shuffling around bazillion dimensional tensors, this might change your life
Nasim Rahaman, MILA (Montreal)

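Tutorials / Documentation

Tutorials are the most convenient way to see einops in action. (For now they also serve as the documentation.)

Part 1: einops fundamentals
Part 2: einops for deep learning
Part 3: improving real code with einops (PyTorch examples only for now)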

Installation

Plain and simple:

pip install einops

API

einops has a minimalistic yet powerful API.

Three operations
(the einops tutorials show how these three cover stacking, reshape, transposition, squeeze/unsqueeze, repeat, tile, concatenate, view, and numerous reductions)

from einops import rearrange, reduce, repeat
# rearrange elements according to the pattern
output_tensor = rearrange(input_tensor, 't b c -> b c t')
# combine rearrangement and reduction
output_tensor = reduce(input_tensor, 'b c (h h2) (w w2) -> b h w c', 'mean', h2=2, w2=2)
# copy along a new axis
output_tensor = repeat(input_tensor, 'h w -> h w c', c=3)

And two corresponding layers
(einops keeps a separate version for each framework)

from einops.layers.chainer import Rearrange, Reduce
from einops.layers.gluon import Rearrange, Reduce
from einops.layers.keras import Rearrange, Reduce
from einops.layers.torch import Rearrange, Reduce
from einops.layers.tensorflow import Rearrange, Reduce

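Layers behave the same way as the operations and take the same parameters
(except that the first argument, the tensor itself, is supplied when the layer is called)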

layer = Rearrange(pattern, **axes_lengths)
layer = Reduce(pattern, reduction, **axes_lengths)

# apply created layer to a tensor / variable
x = layer(x)

An example of using einops layers inside a model:

# example given for pytorch, but code in other frameworks is almost identical  
from torch.nn import Sequential, Conv2d, MaxPool2d, Linear, ReLU
from einops.layers.torch import Rearrange

model = Sequential(
    Conv2d(3, 6, kernel_size=5),
    MaxPool2d(kernel_size=2),
    Conv2d(6, 16, kernel_size=5),
    MaxPool2d(kernel_size=2),
    # flattening
    Rearrange('b c h w -> b (c h w)'),
    Linear(16*5*5, 120),
    ReLU(),
    Linear(120, 10),
)

Naming

einops stands for Einstein-Inspired Notation for operations
(though “Einstein operations” is more attractive and easier to remember).

Notation was loosely inspired by Einstein summation (in particular by numpy.einsum operation).

Why use `einops`?!

Semantic information

y = x.view(x.shape[0], -1)
y = rearrange(x, 'b c h w -> b (c h w)')

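While these two lines do nearly the same job, the second one states what the input and the output are.
In other words, einops focuses on the interface (what goes in and what comes out), not on how the result is computed.

The next operation looks similar: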

y = rearrange(x, 'time c h w -> time (c h w)')

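but it gives the reader a hint: this is not an independent batch of images being processed, but rather a sequence (video).

Semantic information makes the code easier to read and maintain.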

Result is strictly determined

Below are two ways to define the depth-to-space operation:

# depth-to-space
rearrange(x, 'b c (h h2) (w w2) -> b (c h2 w2) h w', h2=2, w2=2)
rearrange(x, 'b c (h h2) (w w2) -> b (h2 w2 c) h w', h2=2, w2=2)

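There are at least four more ways to do it. Which one does the framework use?

These details are usually ignored, because in most cases they make no difference.
Sometimes they matter a lot, though (for example, when the next stage uses grouped convolutions),
so you want to spell out the operation in your own code.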

Uniformity

reduce(x, 'b c (x dx) -> b c x', 'max', dx=2)
reduce(x, 'b c (x dx) (y dy) -> b c x y', 'max', dx=2, dy=3)
reduce(x, 'b c (x dx) (y dy) (z dz)-> b c x y z', 'max', dx=2, dy=3, dz=4)

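These examples show that pooling is written the same way no matter how many dimensions the tensor has; there is no separate interface for 1d, 2d, and 3d pooling.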

Space-to-depth and depth-to-space are defined in many frameworks, but what about width-to-height?

rearrange(x, 'b c h (w w2) -> b c (h w2) w', w2=2)

Framework-independent behavior

Even simple functions are defined differently in different frameworks.

y = x.flatten() # or flatten(x)

Suppose x has shape (3, 4, 5); then y has shape:

numpy, cupy, chainer, pytorch: (60,)
keras, tensorflow.layers, mxnet and gluon: (3, 20)

Independence of framework terminology

For example, tile and repeat cause a lot of confusion. To copy an image along its width:

np.tile(image, (1, 2))    # in numpy
image.repeat(1, 2)        # pytorch's repeat ~ numpy's tile

With einops you don't need to work out which axis was repeated:

repeat(image, 'h w -> h (tile w)', tile=2)  # in numpy
repeat(image, 'h w -> h (tile w)', tile=2)  # in pytorch
repeat(image, 'h w -> h (tile w)', tile=2)  # in tf
repeat(image, 'h w -> h (tile w)', tile=2)  # in jax
repeat(image, 'h w -> h (tile w)', tile=2)  # in mxnet
... (etc.)