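A Walkthrough of the einops Library

What kind of learning experience do you get from repeatedly questioning GPT while reading the source code?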

Overview

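einops at a glance

A quick, intuitive impression of what einops does.

  1. einops is a Python library that is very useful for working with multi-dimensional arrays (such as the tensors common in deep learning). It provides a concise, uniform notation for rearranging, splitting, and merging tensor dimensions. Using einops makes code easier to read and maintain.
  2. Install einops: pip install einops
  3. Links
    1. Official site: https://einops.rocks/
    2. medium: Einops in 30 seconds
    3. https://blog.csdn.net/angel_hben/article/details/129079635
  4. Use cases
    1. Rearranging dimensions: in convolutional neural networks (CNN), recurrent neural networks (RNN), and Transformer models, tensor dimensions often have to be rearranged so that one layer's output matches the next layer's expected input. [This is the case I actually use: splitting an image into patches to feed a Transformer.]
    2. Reducing dimensions: in pooling layers or global average pooling, dimensions are reduced, for example from (batch_size, height, width, channels) to (batch_size, channels).
    3. Expanding dimensions: in data augmentation or certain layers, extra dimensions may be needed, for example expanding (batch_size, channels) to (batch_size, 1, 1, channels). [Seen this way, it really is more intuitive than plain PyTorch.]
    4. … and so on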

What's inside einops

Going straight to the source to map out the whole code structure.

  1. What the package contains
    (Pytorch2) (base) kimshan@MacBook-Pro einops % ls
    __init__.py             experimental
    __pycache__             layers
    _backends.py            packing.py
    _torch_specific.py      parsing.py
    array_api.py            py.typed
    einops.py
    
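  2. The public API, exported from __init__.py
    1. einops.py: (core) high-level tensor operations
      1. rearrange: reorders and reshapes tensor dimensions to match a required layout.
      2. reduce: reduces tensor dimensions with a chosen operation (sum, mean, and so on).
      3. repeat: repeats a tensor along specified dimensions to increase its size.
      4. einsum: a concise way to multiply tensors and sum over named axes (Einstein summation).
      5. parse_shape: maps axis names to their lengths for a given tensor, so the result can be passed as axes lengths to the other operations.
      6. asnumpy: converts a tensor to a NumPy array so that NumPy functionality can be used.
      7. EinopsError: the exception raised when an einops operation fails.
    2. packing.py: packing and unpacking tensors
      1. pack: packs several tensors, possibly of different dimensionality, into one larger tensor.
      2. unpack: splits a packed tensor back into separate tensors.
  3. Implementations for individual frameworks: the layers folder
    1. __init__.py: provides the base classes RearrangeMixin and ReduceMixin
    2. _einmix.py: provides a mixin class for implementing the EinMix layer in deep learning frameworks. The layer simplifies mixing operations on multi-dimensional tensors and handles initialization and management of the weight and bias tensors automatically. The class has to be completed separately for each framework.
    3. flax.py: provides three custom layers for flax: Rearrange, Reduce, and EinMix
    4. oneflow.py: provides the three custom layers for oneflow
    5. tensorflow.py: provides the three custom layers for tensorflow
    6. chainer.py: provides the three custom layers for chainer
    7. keras.py: provides the three custom layers for keras
    8. paddle.py: provides the three custom layers for paddle
    9. torch.py: provides the three custom layers for pytorch
  4. Supporting low-level code
    1. parsing.py: the classes and methods einops uses internally to parse and process tensor operation expressions. They are what lets users describe complex tensor operations, such as rearranging, reshaping, and composing axes, with short string expressions.
    2. array_api.py: shares much of its code with einops.py; judging by the file name, it appears to target arrays that follow the Python array API standard.
    3. _torch_specific.py: support for compatibility with PyTorch JIT compilation.
    4. _backends.py: the backends for the various frameworks. [I think this is the real essence of the library: it abstracts over the tensors of different frameworks, so the logic of each feature only has to be written once.]
  5. Features to be added later: the experimental folder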
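
The idea in the note on _backends.py, writing each algorithm once against a small backend interface, can be sketched roughly as follows. This is an illustration only: the classes and functions below are hypothetical names invented for this sketch and are not copied from the einops source.

```python
import numpy as np


class NumpyBackend:
    """Hypothetical backend wrapping the NumPy calls a shared algorithm needs."""

    def is_appropriate_type(self, tensor):
        return isinstance(tensor, np.ndarray)

    def shape(self, x):
        return x.shape

    def reshape(self, x, shape):
        return x.reshape(shape)

    def transpose(self, x, axes):
        return x.transpose(axes)


class NestedListBackend:
    """Hypothetical backend for nested Python lists (goes through NumPy for brevity)."""

    def is_appropriate_type(self, tensor):
        return isinstance(tensor, list)

    def shape(self, x):
        return np.asarray(x).shape

    def reshape(self, x, shape):
        return np.asarray(x).reshape(shape).tolist()

    def transpose(self, x, axes):
        return np.asarray(x).transpose(axes).tolist()


_BACKENDS = [NumpyBackend(), NestedListBackend()]


def get_backend(tensor):
    """Pick the first backend that recognizes the tensor's type."""
    for backend in _BACKENDS:
        if backend.is_appropriate_type(tensor):
            return backend
    raise TypeError(f"no backend for {type(tensor)}")


def swap_then_flatten(tensor):
    """The pattern 'h w -> (w h)', written once against the backend interface."""
    backend = get_backend(tensor)
    h, w = backend.shape(tensor)
    return backend.reshape(backend.transpose(tensor, (1, 0)), (w * h,))


result_np = swap_then_flatten(np.array([[1, 2, 3], [4, 5, 6]]))
result_list = swap_then_flatten([[1, 2, 3], [4, 5, 6]])
print(result_np, result_list)
```

The shared function never mentions NumPy or lists directly, and it returns the same kind of object it was given, which mirrors the "tensor of the same type as input" promise in the einops docstrings.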

APIs

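From the analysis above, the library implements high-level tensor operations for multiple frameworks, offered both as functions and as layer modules.
Below I paste the docstrings from the source, then expand on anything that is hard to follow.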

reduce

"""
einops.reduce provides combination of reordering and reduction using reader-friendly notation.

Examples for reduce operation:

```python
>>> x = np.random.randn(100, 32, 64)

# perform max-reduction on the first axis
>>> y = reduce(x, 't b c -> b c', 'max')

# same as previous, but with clearer axes meaning
>>> y = reduce(x, 'time batch channel -> batch channel', 'max')

>>> x = np.random.randn(10, 20, 30, 40)

# 2d max-pooling with kernel size = 2 * 2 for image processing
>>> y1 = reduce(x, 'b c (h1 h2) (w1 w2) -> b c h1 w1', 'max', h2=2, w2=2)

# if one wants to go back to the original height and width, depth-to-space trick can be applied
>>> y2 = rearrange(y1, 'b (c h2 w2) h1 w1 -> b c (h1 h2) (w1 w2)', h2=2, w2=2)
>>> assert parse_shape(x, 'b _ h w') == parse_shape(y2, 'b _ h w')

# Adaptive 2d max-pooling to 3 * 4 grid
>>> reduce(x, 'b c (h1 h2) (w1 w2) -> b c h1 w1', 'max', h1=3, w1=4).shape
(10, 20, 3, 4)

# Global average pooling
>>> reduce(x, 'b c h w -> b c', 'mean').shape
(10, 20)

# Subtracting mean over batch for each channel
>>> y = x - reduce(x, 'b c h w -> () c () ()', 'mean')

# Subtracting per-image mean for each channel
>>> y = x - reduce(x, 'b c h w -> b c () ()', 'mean')

```

Parameters:
    tensor: tensor: tensor of any supported library (e.g. numpy.ndarray, tensorflow, pytorch).
        list of tensors is also accepted, those should be of the same type and shape
    pattern: string, reduction pattern
    reduction: one of available reductions ('min', 'max', 'sum', 'mean', 'prod'), case-sensitive
        alternatively, a callable f(tensor, reduced_axes) -> tensor can be provided.
        This allows using various reductions, examples: np.max, tf.reduce_logsumexp, torch.var, etc.
    axes_lengths: any additional specifications for dimensions

Returns:
    tensor of the same type as input
"""
  1. A demo: taking the maximum
>>> x = np.random.randn(5,3)
>>> y = reduce(x, 't a -> a', 'max')
>>> x
array([[-0.09399422,  0.98488931, -0.83173863],
       [-0.0140485 ,  0.55658672, -0.11621765],
       [-1.73142355,  0.49761059,  0.31423137],
       [ 0.48302816, -0.68043368,  1.39221535],
       [-0.33247845, -0.58816318, -0.01502158]])
>>> y
array([0.48302816, 0.98488931, 1.39221535])
  2. In the example above, t and a are names we picked ourselves; we can choose more descriptive ones
>>> x = np.random.randn(100, 32, 64)
>>> y = reduce(x, 'time batch channel -> batch channel', 'max')
>>> x.shape
(100, 32, 64)
>>> y.shape
(32, 64)
  3. Another demo: 2d max-pooling with kernel size = 2 * 2, which shrinks a dimension by an integer factor
>>> x = np.random.randn(1, 3, 256, 256)
>>> y = reduce(x, 'b c (h1 h2) (w1 w2) -> b c h1 w1', 'max', h2=2, w2=2)
>>> y.shape
(1, 3, 128, 128)
  4. Besides dropping a dimension entirely, we can use () to keep it as a length-1 axis
>>> x = np.random.randn(2,3,4)
>>> y = x - reduce(x, 'c w h -> c () ()', 'mean')
>>> reduce(x, 'c w h -> c () ()', 'mean')
array([[[-0.13433713]],

       [[-0.07317865]]])
>>> reduce(x, 'c w h -> c () ()', 'mean').shape
(2, 1, 1)

Finally, the reductions available when reducing dimensions are: 'min', 'max', 'sum', 'mean', 'prod'

rearrange

"""
einops.rearrange is a reader-friendly smart element reordering for multidimensional tensors.
This operation includes functionality of transpose (axes permutation), reshape (view), squeeze, unsqueeze,
stack, concatenate and other operations.

Examples for rearrange operation:

```python
# suppose we have a set of 32 images in "h w c" format (height-width-channel)
>>> images = [np.random.randn(30, 40, 3) for _ in range(32)]

# stack along first (batch) axis, output is a single array
>>> rearrange(images, 'b h w c -> b h w c').shape
(32, 30, 40, 3)

# concatenate images along height (vertical axis), 960 = 32 * 30
>>> rearrange(images, 'b h w c -> (b h) w c').shape
(960, 40, 3)

# concatenated images along horizontal axis, 1280 = 32 * 40
>>> rearrange(images, 'b h w c -> h (b w) c').shape
(30, 1280, 3)

# reordered axes to "b c h w" format for deep learning
>>> rearrange(images, 'b h w c -> b c h w').shape
(32, 3, 30, 40)

# flattened each image into a vector, 3600 = 30 * 40 * 3
>>> rearrange(images, 'b h w c -> b (c h w)').shape
(32, 3600)

# split each image into 4 smaller (top-left, top-right, bottom-left, bottom-right), 128 = 32 * 2 * 2
>>> rearrange(images, 'b (h1 h) (w1 w) c -> (b h1 w1) h w c', h1=2, w1=2).shape
(128, 15, 20, 3)

# space-to-depth operation
>>> rearrange(images, 'b (h h1) (w w1) c -> b h w (c h1 w1)', h1=2, w1=2).shape
(32, 15, 20, 12)

```

When composing axes, C-order enumeration used (consecutive elements have different last axis)
Find more examples in einops tutorial.

Parameters:
    tensor: tensor of any supported library (e.g. numpy.ndarray, tensorflow, pytorch).
            list of tensors is also accepted, those should be of the same type and shape
    pattern: string, rearrangement pattern
    axes_lengths: any additional specifications for dimensions

Returns:
    tensor of the same type as input. If possible, a view to the original tensor is returned.

"""
  1. A list of arrays is stacked, and the list itself becomes the first axis
>>> images = [np.random.randn(30, 40, 3) for _ in range(32)]
>>> rearrange(images, 'b h w c -> b h w c').shape
(32, 30, 40, 3)
  2. Selected axes can be transposed
>>> images = [np.random.randn(30, 40, 3) for _ in range(32)]
>>> rearrange(images, 'b h w c -> b c h w').shape
(32, 3, 30, 40)
  3. Selected axes can be merged (the image here is a 2x3 grayscale image with batch size 1; a value such as 255 is a single pixel intensity)
>>> images = np.array([[[[255, 200, 150],[200, 40, 130]]]])
>>> # get the image tensor
>>> rearrange(images, 'b c h w -> b c h w').shape
(1, 1, 2, 3)
>>> rearrange(images, 'b c h w -> b c h w')
array([[[[255, 200, 150],
         [200,  40, 130]]]])
>>> # merge
>>> rearrange(images, 'b c h w -> (b c) h w').shape
(1, 2, 3)
>>> rearrange(images, 'b c h w -> (b c) h w')
array([[[255, 200, 150],
        [200,  40, 130]]])
>>> # swapping the order inside the parentheses can change the result
>>> rearrange(images, 'b c h w -> b c (h w)').shape
(1, 1, 6)
>>> rearrange(images, 'b c h w -> b c (w h)').shape
(1, 1, 6)
>>> rearrange(images, 'b c h w -> b c (h w)') # h is the outer axis, so each row (along w) stays contiguous
array([[[255, 200, 150, 200,  40, 130]]])
>>> rearrange(images, 'b c h w -> b c (w h)') # w is the outer axis, so each column (along h) stays contiguous
array([[[255, 200, 200,  40, 150, 130]]])
  4. Flattening an image; here the order matters (the layout is now read as b h w c, so [255, 200, 150] is one RGB pixel)
>>> images = np.array([[[[255, 200, 150],[200, 40, 130]]]])
>>> rearrange(images, 'b h w c -> b (c h w)').shape
(1, 6)
>>> rearrange(images, 'b h w c -> b (c h w)')
array([[255, 200, 200,  40, 150, 130]])
  5. Splitting an image into blocks: the example below splits both the height and the width in two, then stacks the blocks along the batch axis.
>>> images = np.array([[[[255, 200, 150, 100],[200, 40, 130, 200]]]])
>>> rearrange(images, 'b c h w -> b c h w')
array([[[[255, 200, 150, 100],
         [200,  40, 130, 200]]]])
>>> rearrange(images, 'b c (h1 h) (w1 w) -> (b h1 w1) c h w', h1=2, w1=2).shape
(4, 1, 1, 2)
>>> rearrange(images, 'b c (h1 h) (w1 w) -> (b h1 w1) c h w', h1=2, w1=2)
array([[[[255, 200]]],
       [[[150, 100]]],
       [[[200,  40]]],
       [[[130, 200]]]])
  6. space-to-depth operation; to me it looks like downsampling, repeated once for every offset inside the 2x2 block
>>> images = np.array([[[[255, 200, 150, 100, 1, 2],[200, 40, 130, 200, 3, 4],[255, 224, 253, 252, 5, 6], [10, 20, 30, 40, 7, 8]]]])
>>> rearrange(images, 'b c h w -> b c h w')
array([[[[255, 200, 150, 100,   1,   2],
         [200,  40, 130, 200,   3,   4],
         [255, 224, 253, 252,   5,   6],
         [ 10,  20,  30,  40,   7,   8]]]])
>>> rearrange(images, 'b c (h h1) (w w1) -> b (c h1 w1) h w', h1=2, w1=2).shape
(1, 4, 2, 3)
>>> rearrange(images, 'b c (h h1) (w w1) -> b (c h1 w1) h w', h1=2, w1=2)
array([[[[255, 150,   1],
         [255, 253,   5]],

        [[200, 100,   2],
         [224, 252,   6]],

        [[200, 130,   3],
         [ 10,  30,   7]],

        [[ 40, 200,   4],
         [ 20,  40,   8]]]])

repeat

"""
einops.repeat allows reordering elements and repeating them in arbitrary combinations.
This operation includes functionality of repeat, tile, broadcast functions.

Examples for repeat operation:

```python
# a grayscale image (of shape height x width)
>>> image = np.random.randn(30, 40)

# change it to RGB format by repeating in each channel
>>> repeat(image, 'h w -> h w c', c=3).shape
(30, 40, 3)

# repeat image 2 times along height (vertical axis)
>>> repeat(image, 'h w -> (repeat h) w', repeat=2).shape
(60, 40)

# repeat image 2 time along height and 3 times along width
>>> repeat(image, 'h w -> (h2 h) (w3 w)', h2=2, w3=3).shape
(60, 120)

# convert each pixel to a small square 2x2. Upsample image by 2x
>>> repeat(image, 'h w -> (h h2) (w w2)', h2=2, w2=2).shape
(60, 80)

# pixelate image first by downsampling by 2x, then upsampling
>>> downsampled = reduce(image, '(h h2) (w w2) -> h w', 'mean', h2=2, w2=2)
>>> repeat(downsampled, 'h w -> (h h2) (w w2)', h2=2, w2=2).shape
(30, 40)

```

When composing axes, C-order enumeration used (consecutive elements have different last axis)
Find more examples in einops tutorial.

Parameters:
    tensor: tensor of any supported library (e.g. numpy.ndarray, tensorflow, pytorch).
        list of tensors is also accepted, those should be of the same type and shape
    pattern: string, rearrangement pattern
    axes_lengths: any additional specifications for dimensions

Returns:
    Tensor of the same type as input. If possible, a view to the original tensor is returned.

"""
  1. repeat simply repeats; for example, a grayscale image becomes RGB by copying it along a new channel axis
>>> image = np.random.randn(30, 40)
>>> repeat(image, 'h w -> h w c', c=3).shape
(30, 40, 3)
  2. A slightly more complex case: the upsampling step of a pyramid algorithm (the downsampling step can be done with reduce)
>>> image = np.array([[10,20],[30,40]])
>>> rearrange(image, 'h w -> h w')
array([[10, 20],
       [30, 40]])
>>> repeat(image, 'h w -> (h h2) (w w2)', h2=2, w2=2).shape
(4, 4)
>>> repeat(image, 'h w -> (h h2) (w w2)', h2=2, w2=2)
array([[10, 10, 20, 20],
       [10, 10, 20, 20],
       [30, 30, 40, 40],
       [30, 30, 40, 40]])

parse_shape

"""
Parse a tensor shape to dictionary mapping axes names to their lengths.

```python
# Use underscore to skip the dimension in parsing.
>>> x = np.zeros([2, 3, 5, 7])
>>> parse_shape(x, 'batch _ h w')
{'batch': 2, 'h': 5, 'w': 7}

# `parse_shape` output can be used to specify axes_lengths for other operations:
>>> y = np.zeros([700])
>>> rearrange(y, '(b c h w) -> b c h w', **parse_shape(x, 'b _ h w')).shape
(2, 10, 5, 7)

```

For symbolic frameworks may return symbols, not integers.

Parameters:
    x: tensor of any supported framework
    pattern: str, space separated names for axes, underscore means skip axis

Returns:
    dict, maps axes names to their lengths
"""

Used to get a tensor's shape as a dictionary keyed by axis name

>>> x = np.zeros([2, 3, 5, 7])
>>> parse_shape(x, 'batch _ h w')
{'batch': 2, 'h': 5, 'w': 7}

asnumpy

"""
Convert a tensor of an imperative framework (i.e. numpy/cupy/torch/jax/etc.) to `numpy.ndarray`

Parameters:
    tensor: tensor of any known imperative framework

Returns:
    `numpy.ndarray`, converted to numpy
"""

In the source this is simply return get_backend(tensor).to_numpy(tensor), which calls the to_numpy of the matching framework backend

einsum

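einops is a Python library for simplifying tensor operations in deep learning. Like NumPy's einsum, einops.einsum expresses and executes tensor operations using the Einstein summation convention, but it is designed to work on the tensors of deep learning frameworks (PyTorch, TensorFlow, JAX, and so on).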

"""
einops.einsum calls einsum operations with einops-style named
axes indexing, computing tensor products with an arbitrary
number of tensors. Unlike typical einsum syntax, here you must
pass tensors first, and then the pattern.

Also, note that rearrange operations such as `"(batch chan) out"`,
or singleton axes `()`, are not currently supported.

Examples:

For a given pattern such as:
```python
>>> x, y, z = np.random.randn(3, 20, 20, 20)
>>> output = einsum(x, y, z, "a b c, c b d, a g k -> a b k")

```
the following formula is computed:
```tex
output[a, b, k] =
    \sum_{c, d, g} x[a, b, c] * y[c, b, d] * z[a, g, k]
```
where the summation over `c`, `d`, and `g` is performed
because those axes names do not appear on the right-hand side.

Let's see some additional examples:
```python
# Filter a set of images:
>>> batched_images = np.random.randn(128, 16, 16)
>>> filters = np.random.randn(16, 16, 30)
>>> result = einsum(batched_images, filters,
...                 "batch h w, h w channel -> batch channel")
>>> result.shape
(128, 30)

# Matrix multiplication, with an unknown input shape:
>>> batch_shape = (50, 30)
>>> data = np.random.randn(*batch_shape, 20)
>>> weights = np.random.randn(10, 20)
>>> result = einsum(weights, data,
...                 "out_dim in_dim, ... in_dim -> ... out_dim")
>>> result.shape
(50, 30, 10)

# Matrix trace on a single tensor:
>>> matrix = np.random.randn(10, 10)
>>> result = einsum(matrix, "i i ->")
>>> result.shape
()

```

Parameters:
    tensors_and_pattern:
        tensors: tensors of any supported library (numpy, tensorflow, pytorch, jax).
        pattern: string, einsum pattern, with commas
            separating specifications for each tensor.
            pattern should be provided after all tensors.

Returns:
    Tensor of the same type as input, after processing with einsum.

"""
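  1. The example from the official docs: combining several tensors. Note that with the shapes used here, axis c has length 3 in x but length 1 in y; the output shown is consistent with the length-1 axis being broadcast.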
>>> x, y, z = np.random.randn(3, 1, 2, 3)
>>> x
array([[[ 0.16705873,  2.51210949,  0.52959118],
        [ 0.58353769, -0.6779666 , -0.13415447]]])
>>> y
array([[[-0.47765094,  0.00385458, -0.14655275],
        [-1.05469934, -0.15484965, -0.15026622]]])
>>> z
array([[[-0.47858963, -0.12515551, -0.18813591],
        [ 1.30117152,  0.02993121,  0.47870161]]])
>>> output = einsum(x, y, z, "a b c, c b d, a g k -> a b k")
>>> output
array([[[-1.63739124,  0.18954883, -0.57838586],
        [ 0.25568409, -0.02959868,  0.09031688]]])

$$output[a, b, k] = \sum_{c, d, g} x[a, b, c] * y[c, b, d] * z[a, g, k]$$
  2. I could not follow the official example, so let's run a small experiment (skip this part if you already know what einsum is)

  • First, prepare some data:
>>> from einops import einsum
>>> import numpy as np
>>> x = np.array([[1,0,0],[0,0,1],[0,1,0]])
>>> y = np.array([[3,4,5],[6,7,8],[3,2,1]])
>>> x
array([[1, 0, 0],
       [0, 0, 1],
       [0, 1, 0]])
>>> y
array([[3, 4, 5],
       [6, 7, 8],
       [3, 2, 1]])
  • Is einsum a matrix product? Let's try.
>>> x@y
array([[3, 4, 5],
       [3, 2, 1],
       [6, 7, 8]])
>>> einsum(x,y,'a b, c d -> a d')
array([[12, 13, 14],
       [12, 13, 14],
       [12, 13, 14]])
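  • They differ! So with this pattern einsum is not a matrix product (a matrix product needs an axis name shared by both inputs, as in 'a b, b d -> a d'). Looking again, [12 13 14] is repeated on three rows, which looks very regular, but it is not obvious where 12, 13, 14 come from. Let's keep experimenting: with 'a b' instead of 'a d', will it output x itself?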
>>> einsum(x,y,'a b, c d -> a b')
array([[39,  0,  0],
       [ 0,  0, 39],
       [ 0, 39,  0]])
  • It does not, but this array also looks regular. The positions of the three 39s carry the information of x, so the number 39 itself must come from the second array y.
>>> y.sum()
39
  • So the second array was summed and the total multiplied onto the first. Trying 'a b' and 'c d' in turn leads to the same conclusion.
>>> x
array([[1, 0, 0],
       [0, 0, 1],
       [0, 1, 0]])
>>> einsum(x,y,'a b, c d -> a b')
array([[39,  0,  0],
       [ 0,  0, 39],
       [ 0, 39,  0]])
>>> y.sum()
39
>>> y
array([[3, 4, 5],
       [6, 7, 8],
       [3, 2, 1]])
>>> einsum(x,y,'a b, c d -> c d')
array([[ 9, 12, 15],
       [18, 21, 24],
       [ 9,  6,  3]])
>>> x.sum()
3
  • To generalize: we supplied four axes 'a b c d'; the axes in the output, 'a b' (or 'c d'), are kept, and the remaining ones, 'c d' (or 'a b'), are summed over. What about 'a d' or 'b c', then?
>>> einsum(x,y,'a b, c d -> a d')
array([[12, 13, 14],
       [12, 13, 14],
       [12, 13, 14]])
>>> x.sum(axis=1)
array([1, 1, 1])
>>> y.sum(axis=0)
array([12, 13, 14])

>>> einsum(x,y,'a b, c d -> b c')
array([[12, 21,  6],
       [12, 21,  6],
       [12, 21,  6]])
>>> x.sum(axis=0)
array([1, 1, 1])
>>> y.sum(axis=1)
array([12, 21,  6])
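  • This solves the mystery of [12 13 14] above. Because x sums to [1 1 1] along both axis 0 and axis 1, the rule is easy to spot: the axes named in the output are kept, and all remaining axes are summed over.
  • Now look at how the order of the output axes affects the result: it acts as a transpose.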
>>> einsum(x,y,'a b, c d -> a d')
ab,cd->ad
array([[12, 13, 14],
       [12, 13, 14],
       [12, 13, 14]])
>>> einsum(x,y,'a b, c d -> d a')
ab,cd->da
array([[12, 12, 12],
       [13, 13, 13],
       [14, 14, 14]])
  • Finally, we can explain the example below
>>> x, y, z = np.random.randn(3, 1, 2, 3)
>>> x
array([[[ 0.51613364, -1.64413418,  0.02056939],
        [-0.38257828,  1.15543751, -0.10533596]]])
>>> y
array([[[-0.15940847,  0.00430798,  0.19768969],
        [-0.2401132 ,  1.64144003, -1.79326939]]])
>>> z
array([[[ 0.23757063, -0.64599319, -1.03294572],
        [-0.21582711,  1.29888799, -1.75626082]]])
>>> x_s = x.sum(axis=(1,2),keepdims=True)
>>> y_s = y.sum(axis=(0,2),keepdims=True)
>>> z_s = z.sum(axis=(0,1),keepdims=True)
>>> x_s
array([[[-0.43990787]]])
>>> y_s
array([[[ 0.04258921],
        [-0.39194256]]])
>>> z_s
array([[[ 0.02174352,  0.65289481, -2.78920655]]])
>>> x_s * y_s * z_s
array([[[-4.07371907e-04, -1.22321979e-02,  5.22566975e-02],
        [ 3.74898712e-03,  1.12571219e-01, -4.80911134e-01]]])
>>> einsum(x, y, z, "a b c, d e f, g h i -> a e i")
array([[[-4.07371907e-04, -1.22321979e-02,  5.22566975e-02],
        [ 3.74898712e-03,  1.12571219e-01, -4.80911134e-01]]])
  • A practical use of this (also from the official docs):
>>> batched_images = np.random.randn(128, 16, 16)
>>> filters = np.random.randn(16, 16, 30)
>>> result = einsum(batched_images, filters,
...                 "batch h w, h w channel -> batch channel")
>>> result.shape
(128, 30)

pack

"""
Packs several tensors into one.
See einops tutorial for introduction into packing (and how it replaces stack and concatenation).

Parameters:
    tensors: tensors to be packed, can be of different dimensionality
    pattern: pattern that is shared for all inputs and output, e.g. "i j * k" or "batch seq *"

Returns:
    (packed_tensor, packed_shapes aka PS)

Example:
```python
>>> from numpy import zeros as Z
>>> inputs = [Z([2, 3, 5]), Z([2, 3, 7, 5]), Z([2, 3, 7, 9, 5])]
>>> packed, ps = pack(inputs, 'i j * k')
>>> packed.shape, ps
((2, 3, 71, 5), [(), (7,), (7, 9)])
```

In this example, axes were matched to: i=2, j=3, k=5 based on order (first, second, and last).
All other axes were 'packed' and concatenated.
PS (packed shapes) contains information about axes that were matched to '*' in every input.
Resulting tensor has as many elements as all inputs in total.

Packing can be reversed with unpack, which additionally needs PS (packed shapes) to reconstruct order.

```python
>>> inputs_unpacked = unpack(packed, ps, 'i j * k')
>>> [x.shape for x in inputs_unpacked]
[(2, 3, 5), (2, 3, 7, 5), (2, 3, 7, 9, 5)]
```

Read the tutorial for introduction and application scenarios.
"""
  1. Application: stacking along the channel axis
import numpy as np
from einops import pack, unpack

h, w = 100, 200
# image_rgb is 3-dimensional (h, w, 3) and depth is 2-dimensional (h, w)
image_rgb = np.random.random([h, w, 3])
image_depth = np.random.random([h, w])
# but we can stack them
image_rgbd, ps = pack([image_rgb, image_depth], 'h w *')

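  2. In the example below the pattern is [i, j, *, k]. For the first input Z([2, 3, 5]), [2, 3, 5] is exactly [i, j, k]. The * marks where the extra dimensions of the other tensors are packed in. 71 = 1 + 7 + 7*9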

>>> from numpy import zeros as Z
>>> inputs = [Z([2, 3, 5]), Z([2, 3, 7, 5]), Z([2, 3, 7, 9, 5])]
>>> packed, ps = pack(inputs, 'i j * k')
>>> packed.shape, ps
((2, 3, 71, 5), [(), (7,), (7, 9)])

unpack

"""
Unpacks a single tensor into several by splitting over a selected axes.
See einops tutorial for introduction into packing (and how it replaces stack and concatenation).

Parameters:
    tensor: tensor to be unpacked
    packed_shapes: packed_shapes (aka PS) is a list of shapes that take place of '*' in each output.
        output will contain a single tensor for every provided shape
    pattern: pattern that is shared for input and all outputs, e.g. "i j * k" or "batch seq *",
        where * designates an axis to be unpacked

Returns:
    list of tensors

If framework supports views, results are views to the original tensor.

Example:
```python
>>> from numpy import zeros as Z
>>> inputs = [Z([2, 3, 5]), Z([2, 3, 7, 5]), Z([2, 3, 7, 9, 5])]
>>> packed, ps = pack(inputs, 'i j * k')
>>> packed.shape, ps
((2, 3, 71, 5), [(), (7,), (7, 9)])
```

In this example, axes were matched to: i=2, j=3, k=5 based on order (first, second, and last).
All other axes were 'packed' and concatenated.
PS (packed shapes) contains information about axes that were matched to '*' in every input.
Resulting tensor has as many elements as all inputs in total.

Packing can be reversed with unpack, which additionally needs PS (packed shapes) to reconstruct order.

```python
>>> inputs_unpacked = unpack(packed, ps, 'i j * k')
>>> [x.shape for x in inputs_unpacked]
[(2, 3, 5), (2, 3, 7, 5), (2, 3, 7, 9, 5)]
```

Read the tutorial for introduction and application scenarios.
"""

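pack stacks, and unpack splits apart again. Continuing the example above, we packed an RGB image and a single-channel depth image together; now we separate them.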

import numpy as np
from einops import pack, unpack

h, w = 100, 200
# image_rgb is 3-dimensional (h, w, 3) and depth is 2-dimensional (h, w)
image_rgb = np.random.random([h, w, 3])
image_depth = np.random.random([h, w])
# but we can stack them
image_rgbd, ps = pack([image_rgb, image_depth], 'h w *')
# then split
unpacked_rgb, unpacked_depth = unpack(image_rgbd, ps, 'h w *')