In graph mode (static graph), adding print(...) before a few lines of code gives a different computed result than not adding it.
Environment: MindSpore 1.5.0 (CPU), Windows 11, Python 3.7.5
[Steps to reproduce & observed behavior]
This is a custom spectral_norm operator I wrote based on the PyTorch source. With print(weight_mat) added at line 54 of my file (the print inside construct below), the computed result differs from the result without it, and the result with the print matches the one obtained in PyNative (dynamic graph) mode.
import numpy as np
import mindspore
from mindspore import nn, ops, Tensor, Parameter, context, numpy as msnp
from mindspore.common.initializer import initializer, Normal
context.set_context(mode=context.GRAPH_MODE, device_target="CPU")
mindspore.set_seed(123)
np.random.seed(123)
class _SpectralNorm(nn.Cell):
    def __init__(
        self,
        weight,
        n_power_iterations: int = 1,
        dim: int = 0,
        eps: float = 1e-12
    ) -> None:
        super(_SpectralNorm, self).__init__()
        self.dim = dim
        self.n_power_iterations = n_power_iterations
        self.eps = eps
        self.expand_dims = ops.ExpandDims()
        self.l2_normalize = ops.L2Normalize(epsilon=self.eps)
        weight_mat = self._reshape_weight_to_matrix(weight)
        h, w = weight_mat.shape
        init_u = initializer(Normal(1.0, 0), [h], mindspore.float32).init_data()
        init_v = initializer(Normal(1.0, 0), [w], mindspore.float32).init_data()
        self.u = Parameter(self.l2_normalize(init_u), requires_grad=False)
        self.v = Parameter(self.l2_normalize(init_v), requires_grad=False)
        self._update_vectors(weight_mat, init=True)

    def _reshape_weight_to_matrix(self, weight):
        # flatten the weight into a 2-D matrix with self.dim as the leading axis
        weight_mat = weight
        if self.dim != 0:
            list_dim = list(range(1, self.dim + 1))
            list_dim.append(0)
            for i in range(self.dim + 1, weight_mat.ndim):
                list_dim.append(i)
            weight_mat = msnp.moveaxis(weight_mat, list(range(weight_mat.ndim)), list_dim)
        height = weight_mat.shape[0]
        return weight_mat.reshape((height, -1))

    def _update_vectors(self, weight_mat, init=False) -> None:
        # power iteration: updates the Parameters self.u and self.v in place
        for _ in range(self.n_power_iterations):
            self.u = self.l2_normalize(msnp.multi_dot([weight_mat, self.expand_dims(self.v, -1)]).flatten())
            self.v = self.l2_normalize(msnp.multi_dot([weight_mat.T, self.expand_dims(self.u, -1)]).flatten())
        return None

    def construct(self, weight):
        weight_mat = self._reshape_weight_to_matrix(weight)
        print(weight_mat)  # line 54 in my file: removing this print changes the result in graph mode
        self._update_vectors(weight_mat)
        sigma = ops.tensor_dot(self.u, msnp.multi_dot([weight_mat, self.expand_dims(self.v, -1)]), 1)
        return weight / sigma
spe = _SpectralNorm(Tensor([[1,2],[3,4]], mindspore.float32))
res = spe(Tensor([[1,2],[3,4]], mindspore.float32))
print("res: ", res)
[Screenshots]
Result in graph mode with the print at line 54:
Result in graph mode without the print at line 54:
Result in PyNative mode:
The PyNative result matches the graph-mode result with the print, so the result with the print should be the correct one. Is this a control-flow issue?
Answer:
The _update_vectors function returns None, so when the print is absent the call is mistakenly optimized away during the specialize phase, and none of the logic inside _update_vectors is executed. Fundamentally this has nothing to do with side effects; adding the print merely prevents _update_vectors from being eliminated by mistake. The suggested workaround is to not have _update_vectors return None.
When a function returns None but updates Parameters inside, the function gets incorrectly eliminated, which is not the expected behavior. A follow-up fix adds a condition to function specialization so that functions inside stop_gradient are not specialized, preventing this mis-optimization: https://gitee.com/mindspore/mindspore/pulls/27097
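A minimal sketch of that workaround, assuming any non-None return value is enough to keep the call from being eliminated (returning the updated vector here is an arbitrary choice, not something the answer prescribes):

def _update_vectors(self, weight_mat, init=False):
    for _ in range(self.n_power_iterations):
        self.u = self.l2_normalize(msnp.multi_dot([weight_mat, self.expand_dims(self.v, -1)]).flatten())
        self.v = self.l2_normalize(msnp.multi_dot([weight_mat.T, self.expand_dims(self.u, -1)]).flatten())
    # return the updated vector instead of a bare None so that the call site
    # is not optimized away during the specialize phase
    return self.u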