MindSpore and PyTorch API Mapping (Notes on Pitfalls from the Ascend AI Innovation Contest 2022, MindSpore Track)

This post is part of my study notes from the Ascend AI Innovation Contest 2022 (MindSpore track), where I reproduced a top-conference paper with MindSpore. It is the first installment, and mainly records how the operators commonly used during paper reproduction map from PyTorch to MindSpore.

MindSpore actually provides an API mapping document, and explanations can be found there for the vast majority of operator mappings. Where the documentation is already detailed, I take the lazy route and quote it directly.

Mapping between PyTorch APIs and MindSpore APIs

(Note: this article uses MindSpore 1.7 and PyTorch 1.7.)

1. nn.Conv2d

mindspore.nn.Conv2d

class mindspore.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, pad_mode="same", padding=0, dilation=1,
                          group=1, has_bias=False, weight_init="normal", bias_init="zeros", data_format="NCHW")

torch.nn.Conv2d

class torch.nn.Conv2d(in_channels: int, out_channels: int, kernel_size: Union[T, Tuple[T, T]], stride: Union[T, Tuple[T, T]] = 1,
                      padding: Union[T, Tuple[T, T]] = 0, dilation: Union[T, Tuple[T, T]] = 1, groups: int = 1, bias: bool = True,
                      padding_mode: str = 'zeros')

Main differences:
PyTorch: does not pad the input by default; bias defaults to True.

MindSpore: pads the input by default so that the output has the same spatial size as the input. If no padding is needed, set pad_mode to "valid"; for explicit padding, set pad_mode to "pad". has_bias defaults to False.

Example:

import numpy as np

# In MindSpore
import mindspore
from mindspore import Tensor, nn
net = nn.Conv2d(120, 240, 4, stride=2, has_bias=True)
x = Tensor(np.ones([1, 120, 1024, 640]), mindspore.float32)
output = net(x).shape
print(output)
# Out:
# (1, 240, 512, 320)

net = nn.Conv2d(120, 240, 4, stride=2, pad_mode='valid', has_bias=True)
x = Tensor(np.ones([1, 120, 1024, 640]), mindspore.float32)
output = net(x).shape
print(output)
# Out:
# (1, 240, 511, 319)

# In PyTorch
import torch
m = torch.nn.Conv2d(120, 240, 4, stride=2)
input = torch.rand(1, 120, 1024, 640)
output = m(input)
print(output.shape)
# Out:
# torch.Size([1, 240, 511, 319])
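
To reproduce PyTorch-style explicit zero padding (e.g. padding=1), pad_mode='pad' can be combined with the padding argument. A minimal sketch (the output size follows the standard convolution formula, so both frameworks produce the same shape here):

# In MindSpore, explicit padding equivalent to PyTorch's padding=1
net = nn.Conv2d(120, 240, 4, stride=2, pad_mode='pad', padding=1, has_bias=True)
x = Tensor(np.ones([1, 120, 1024, 640]), mindspore.float32)
print(net(x).shape)
# Out:
# (1, 240, 512, 320)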

2. nn.Dense & nn.Linear

mindspore.nn.Dense

class mindspore.nn.Dense(in_channels, out_channels, weight_init='normal', bias_init='zeros', has_bias=True, activation=None)

torch.nn.Linear

class torch.nn.Linear(in_features: int, out_features: int, bias: bool = True)

Example:

import numpy as np

# In MindSpore, default weight will be initialized through standard normal distribution.
# Default bias will be initialized by zero.
# Default none activation used.
import mindspore
from mindspore import Tensor, nn
input_net = Tensor(np.array([[180, 234, 154], [244, 48, 247]]), mindspore.float32)
net = nn.Dense(3, 4)
output = net(input_net)
print(output.shape)
# Out:
# (2, 4)

# In PyTorch, default weight and bias will be initialized through uniform distribution.
# No parameter to set the activation.
import torch
input_net = torch.Tensor(np.array([[180, 234, 154], [244, 48, 247]]))
net = torch.nn.Linear(3, 4)
output = net(input_net)
print(output.shape)
# Out:
# torch.Size([2, 4])
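
If the different default initializations matter (for example when aligning the two frameworks for a precision comparison), the MindSpore side can be initialized explicitly via weight_init. A hedged sketch: HeUniform with negative_slope=sqrt(5) approximates the default kaiming_uniform weight initialization of torch.nn.Linear (the bias initialization is not matched here):

import math
from mindspore.common.initializer import HeUniform
# Approximate PyTorch's default kaiming_uniform(a=sqrt(5)) weight initialization
input_net = Tensor(np.array([[180, 234, 154], [244, 48, 247]]), mindspore.float32)
net = nn.Dense(3, 4, weight_init=HeUniform(negative_slope=math.sqrt(5)))
output = net(input_net)
print(output.shape)
# Out:
# (2, 4)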

3. nn.Dropout

mindspore.nn.Dropout

class mindspore.nn.Dropout(keep_prob=0.5, dtype=mstype.float32)

torch.nn.Dropout

class torch.nn.Dropout(p: float = 0.5, inplace: bool = False)

Main differences:
MindSpore: the probability corresponds to the keep_prob attribute of the Dropout operator, i.e. the probability that an input element is kept; 1 - keep_prob is the probability that it is zeroed.

PyTorch: the probability corresponds to the p attribute, i.e. the probability that an input element is zeroed, the opposite of MindSpore.

Example:

drop_rate = 0.2

# In MindSpore
import mindspore
drop = mindspore.nn.Dropout(keep_prob=1.0-drop_rate)

# In PyTorch
import torch
drop = torch.nn.Dropout(p=drop_rate)
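
Also note that both layers only take effect in training mode. A minimal sketch (on the MindSpore side the mode is toggled with set_train, the counterpart of PyTorch's train()/eval()):

import numpy as np
from mindspore import Tensor

x = Tensor(np.ones([2, 4]), mindspore.float32)
drop = mindspore.nn.Dropout(keep_prob=1.0 - drop_rate)
drop.set_train(True)   # like m.train() in PyTorch: ~20% of entries zeroed, the rest scaled by 1/keep_prob
print(drop(x))
drop.set_train(False)  # like m.eval() in PyTorch: dropout becomes the identity
print(drop(x))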

4. nn.BatchNorm2d

mindspore.nn.BatchNorm2d

class mindspore.nn.BatchNorm2d(num_features, eps=1e-5, momentum=0.9, affine=True, gamma_init='ones', beta_init='zeros',
                               moving_mean_init='zeros', moving_var_init='ones', use_batch_statistics=None, data_format='NCHW')

torch.nn.BatchNorm2d

class torch.nn.BatchNorm2d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

Main differences:
PyTorch: the momentum parameter used to compute running_mean and running_var defaults to 0.1.

MindSpore: momentum defaults to 0.9, and relates to PyTorch's momentum as 1 - momentum. The gamma, beta, moving_mean and moving_variance parameters correspond to PyTorch's weight, bias, running_mean and running_var respectively.

Example:

import numpy as np

# In MindSpore.
from mindspore import Tensor, nn
net = nn.BatchNorm2d(num_features=2, momentum=0.8)
x = Tensor(np.array([[[[1, 2], [1, 2]], [[3, 4], [3, 4]]]]).astype(np.float32))
output = net(x)
print(output)
# Out:
# [[[[0.999995   1.99999]
#    [0.999995   1.99999]]
#
#   [[2.999985   3.99998]
#    [2.999985   3.99998]]]]


# In PyTorch.
import torch
input_x = torch.tensor(np.array([[[[1, 2], [1, 2]], [[3, 4], [3, 4]]]]).astype(np.float32))
m = torch.nn.BatchNorm2d(2, momentum=0.2)
output = m(input_x)
print(output)
# Out:
# tensor([[[[-1.0000,  1.0000],
#           [-1.0000,  1.0000]],
#
#          [[-1.0000,  1.0000],
#           [-1.0000,  1.0000]]]], grad_fn=<NativeBatchNormBackward>)
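
Note that the two outputs above differ: the MindSpore result is almost identical to the input, while the PyTorch result is normalized. This is because a MindSpore Cell runs in inference mode by default (using the moving statistics, initialized to mean 0 and variance 1), whereas a PyTorch module defaults to training mode; calling net.set_train() on the MindSpore side yields the normalized result as well. The momentum relationship can also be verified with a small numeric check (a sketch based on the update rules stated in the two frameworks' documentation):

# MindSpore: moving_mean  = momentum * moving_mean + (1 - momentum) * batch_mean
# PyTorch:   running_mean = (1 - momentum) * running_mean + momentum * batch_mean
ms_momentum = 0.8
pt_momentum = 1 - ms_momentum  # 0.2, as in the examples above

moving_mean, batch_mean = 0.0, 1.5
print(ms_momentum * moving_mean + (1 - ms_momentum) * batch_mean)  # 0.3
print((1 - pt_momentum) * moving_mean + pt_momentum * batch_mean)  # 0.3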

[!] nn.SyncBatchNorm & nn.BatchNorm2d

MindSpore's Batch Normalization implementation only normalizes the data within each device.

mindspore.nn.SyncBatchNorm is Batch Normalization with synchronization across devices.

class mindspore.nn.SyncBatchNorm(num_features, eps=1e-5, momentum=0.9, affine=True, gamma_init='ones', beta_init='zeros',
                                 moving_mean_init='zeros', moving_var_init='ones', use_batch_statistics=None, process_groups=None)

Example:

import os
from mindspore import nn

# Choose BatchNorm2d or SyncBatchNorm depending on the number of devices
# (default DEVICE_NUM to "1" so the check also works when the variable is unset)
if os.getenv("DEVICE_TARGET") == "Ascend" and int(os.getenv("DEVICE_NUM", "1")) > 1:
    BatchNorm2d = nn.SyncBatchNorm
else:
    BatchNorm2d = nn.BatchNorm2d
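
The selected alias is then a drop-in replacement for nn.BatchNorm2d; a usage sketch:

block = nn.SequentialCell([
    nn.Conv2d(3, 64, 3),
    BatchNorm2d(num_features=64),
    nn.ReLU(),
])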

5. Dimension manipulation

PyTorch's reshape, flatten, transpose and permute can all be replaced by MindSpore's reshape and transpose (a flatten sketch follows the examples below).

MindSpore's transpose must specify the order of all dimensions.

Example:

import numpy as np

# In PyTorch
import torch
x = torch.Tensor(np.ones((2, 3, 4, 5)))
x = x.reshape(6, 4, -1).transpose(1, 2).permute(2, 0, 1).contiguous()
print(x.shape)
# Out:
# torch.Size([4, 6, 5])

# In MindSpore
from mindspore import ops, Tensor
x = Tensor(np.ones((2, 3, 4, 5)))
x = x.reshape(6, 4, -1).transpose(0, 2, 1).transpose(2, 0, 1)
print(x.shape)
# Out:
# (4, 6, 5)

# Or equivalently, using the functional primitives
x = Tensor(np.ones((2, 3, 4, 5)))
x = ops.Reshape()(x, (6, 4, -1))
x = ops.Transpose()(x, (0, 2, 1))
x = ops.Transpose()(x, (2, 0, 1))
print(x.shape)
# Out:
# (4, 6, 5)
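
For flatten, reshape already covers it; mindspore.ops.Flatten additionally flattens everything after the batch dimension into 2-D. A minimal sketch:

# In PyTorch
x = torch.Tensor(np.ones((2, 3, 4, 5)))
print(x.flatten(1).shape)
# Out:
# torch.Size([2, 60])

# In MindSpore
x = Tensor(np.ones((2, 3, 4, 5)))
print(x.reshape(x.shape[0], -1).shape)  # (2, 60)
print(ops.Flatten()(x).shape)           # (2, 60)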

(To be continued...)
