[OSPP Open Source Summer 2022] Adding 10+ Tensor Creation Methods to MindSpore

Preface

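In the summer after the second semester of my sophomore year, I took part in the OSPP Open Source Summer program run by the Chinese Academy of Sciences. My deliverable was a PR to the MindSpore open source repository that adds methods for creating Tensors. It was my first formal open source activity, and my enthusiasm for open source has kept growing ever since.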

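About two weeks after the initial project selection, I started working on the project. Until then, my knowledge of deep learning was limited to building and training networks and using deep learning libraries; I had rarely touched the low-level code of a deep learning framework. I ran into plenty of difficulties along the way, but they let me enjoy the open source world all the more. I got going right away: I wrote the first line of code that same evening and began developing the functions. Once development was roughly complete, I spent nearly two more weeks polishing the details, and then it was finally done! Here are a few thoughts from that time: first, keep a young and curious mind; second, understand deeply what open source means; third, build solid programming skills. It does not matter if you hit a problem you cannot solve yet; what matters most is to learn, and to innovate on the basis of what you have learned.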

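During those days of hard work, I watched my own code become clearer and more readable. I started with simple functions, but as the saying goes, "a tree that fills the arms' embrace grows from a tiny sprout," and I always believed I could do better. Step by step I learned about operator registration, the coupling between the low-level code and the different modules, the rules of MindSpore's code gate on Gitee, and the formatting conventions for documentation.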

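I was lucky to receive hands-on guidance from Wang Donghai and Shao Junsong of the MindSpore open source project. I often had lively discussions with them in our WeChat group about problems in the project and the knowledge behind them. Much of it is never covered in textbooks, and with their guidance things suddenly became clear. I also met many excellent open source developers and outstanding students from all over the country in the OSPP discussion group. We exchanged ideas, learned from each other, and made progress together, which made for an unforgettable summer.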

I. Project Background

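MindSpore is Huawei's self-developed AI computing framework for on-demand collaboration across device, edge, and cloud scenarios. It provides a unified API for all scenarios and end-to-end capabilities for model development, model execution, and model deployment. MindSpore adopts a device-edge-cloud on-demand collaborative distributed architecture, a new differentiable native programming paradigm, and a new AI Native execution mode to achieve better resource efficiency, security, and trustworthiness, while lowering the barrier to AI development in industry and unleashing the computing power of Ascend chips to help make AI inclusive. Since it was open sourced on March 28, 2020, the project has been widely praised and recognized, and it has ranked Top 1 on Gitee for two consecutive years.
In recent years, the popular deep learning frameworks have also developed rapidly. In January 2017, Facebook AI Research (FAIR) released PyTorch, built on Torch. It is a Python-based scientific computing package that provides two high-level features: 1. tensor computation (like NumPy) with strong GPU acceleration; 2. deep neural networks built on an automatic differentiation system. PyTorch is widely used in academia and in cutting-edge technology development. At the major top conferences in 2021, the number of papers using PyTorch was already at least 3 times the number using TensorFlow, and the gap keeps widening. By the end of 2021, the number of papers mentioning TensorFlow had risen slightly from 228 in 2018 to 266, and Keras from 42 to 56, while the number for PyTorch had risen from 87 to 252.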

Figure 1: Share of PyTorch papers among all PyTorch and TensorFlow papers

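The main reasons are PyTorch's three major advantages:
• Large computation graphs are easy to build
• Gradients are easy to compute in the computation graph
• It runs efficiently on GPUs (cuDNN, cuBLAS, etc.)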

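It is therefore of real practical value to absorb PyTorch's many strengths, write interfaces that conform to the specifications, and merge them into the community code. For example, the tensor is an important concept in artificial intelligence; it is used to model the relationships between objects or between objects and data. In data processing and model deployment, the input data must be a tensor. Tensors are especially critical in computer vision: an image rendered on a screen must be expressed in some tensor representation before a deep learning framework can interpret it. This requires the deep learning framework to provide methods for creating Tensors. This project takes MindSpore as the framework and fills in its missing Tensor creation methods.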

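II. Project Description

MindSpore already has a number of commonly used methods for creating Tensors, but the set is not complete; 10+ more creation methods need to be added.
References: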

  1. torch.from_numpy
  2. torch.bernoulli
  3. torch.multinomial
  4. torch.poisson
  5. torch.rand_like
  6. torch.randint_like
  7. torch.randn_like
  8. torch.sparse_coo_tensor
  9. torch.as_tensor
  10. torch.as_strided
  11. torch.frombuffer
  12. torch.empty_strided
  13. torch.randperm
    Implement the methods above on top of MindSpore.

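III. Project Plan

Step 1 Initial development: implement the tensor creation methods with reference to tensor creation in PyTorch.

Following the way PyTorch creates Tensors and the MindSpore code conventions, develop each Tensor creation routine, and write the docstrings and test cases alongside.
Reference: PyTorch 1.12 documentation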

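1. from_numpy

torch.from_numpy() converts an array into a tensor, and the two share memory: modifying the tensor, for example by reassigning elements, changes the original array as well. A read-only NumPy array cannot be passed in. In MindSpore it is implemented as follows:
Description:
Converts a NumPy array into a MindSpore Tensor.
Args:
array (numpy.array): the input array.
Returns:
Tensor, with the same data type as the input array.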

    @staticmethod
    def from_numpy(array):
        """
        Convert numpy array to Tensor.
        If the data is not C contiguous, the data will be copied to C contiguous to construct the tensor.
        Otherwise, the tensor will be constructed using this numpy array without copy.
 
        Args:
            array (numpy.array): The input array.
 
        Returns:
            Tensor, has the same data type as input array.
 
        Examples:
            >>> import numpy as np
            >>> from mindspore import Tensor
            >>> x = np.array([1, 2])
            >>> output = Tensor.from_numpy(x)
            >>> print(output)
            [1 2]
        """
        if isinstance(array, np.ndarray) and not array.flags['C_CONTIGUOUS']:
            array = np.ascontiguousarray(array)
 
        return Tensor(Tensor_.from_numpy(array))
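
The C-contiguity check above decides whether a copy is made. As a rough illustration of what "C contiguous" means, the sketch below computes the byte strides a row-major array would have and compares them with given strides. The helper names are hypothetical and not part of MindSpore or NumPy, and the check is simplified (it ignores NumPy's special handling of axes of length 1).

```python
def c_contiguous_strides(shape, itemsize):
    """Byte strides of a row-major (C-contiguous) array with the given shape."""
    strides = []
    step = itemsize
    # Walk the dimensions from last to first: the last axis is the densest one.
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def is_c_contiguous(shape, strides, itemsize):
    """True when `strides` match the row-major layout for `shape`."""
    return tuple(strides) == c_contiguous_strides(shape, itemsize)


# A 3x4 float32 array (4 bytes per element) stored row by row.
print(c_contiguous_strides((3, 4), 4))      # (16, 4)
# Its transpose has shape (4, 3) but keeps the old memory, so strides are (4, 16).
print(is_c_contiguous((4, 3), (4, 16), 4))  # False
```

A transposed array is the typical case where from_numpy has to copy: the data is still there, but it is no longer laid out row by row.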

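2. bernoulli

Draws binary random numbers (0 or 1) from a Bernoulli distribution. The input tensor should contain the probabilities used to draw the binary random numbers, so every value in the input must lie in the range [0, 1]. A Bernoulli trial is a single random experiment with only two outcomes, "success" (value 1) or "failure" (value 0); it was introduced by the Swiss scientist Jacob Bernoulli (1654 - 1705). The i-th element of the output tensor takes the value 1 with the probability given by the i-th element of the input tensor. In MindSpore it is implemented with the following function:
Args:
input_p: the probabilities, one per output element.
seed: the seed value for random generation.
out: the output tensor.
Returns:
out (Tensor), with the same shape as the input.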

import numpy as np
from mindspore import Tensor


def bernoulli_fun(input_p, seed=None, out=None) -> Tensor:
    r"""
    Draws binary random numbers (0 or 1) from a Bernoulli distribution.

    The `input_p` tensor should contain the probabilities used for drawing the binary random numbers.
    Hence, all values in `input_p` have to be in the range :math:`0 \leq \text{input\_p}_i \leq 1`.
    The :math:`i^{th}` element of the output tensor is 1 with the probability given by the
    :math:`i^{th}` value in `input_p`.

    Args:
        input_p (Union[Tensor, float, int]): The probabilities. Values are not validated: a value
            of 1 or more always yields 1, and a value of 0 or less always yields 0.
        seed (int, optional): The seed value for random generating. Values outside
            [0, 2**32 - 1] are ignored. Default: None.
        out (Tensor, optional): Kept for signature compatibility with torch.bernoulli;
            this implementation does not write into it. Default: None.

    Returns:
        Tensor of 0s and 1s (int64), with the same shape as `input_p`.

    Supported Platforms:
        ``Ascend`` ``GPU`` ``CPU``

    Examples:
        >>> import mindspore
        >>> import numpy as np
        >>> from mindspore import Tensor
        >>> from mindspore.tensor_fun import bernoulli_fun
        >>> input_p = Tensor(np.array([0.0, 1.0, 1.0]), mindspore.float32)
        >>> output = bernoulli_fun(input_p, seed=1)
        >>> print(output)
        [0 1 1]
    """
    # Accept Tensors as well as plain Python or NumPy inputs.
    probs = input_p.asnumpy() if isinstance(input_p, Tensor) else np.array(input_p)
    # Only seed the global generator when the seed is in NumPy's valid range.
    valid_seed = seed is not None and 0 <= seed <= 2**32 - 1
    np.random.seed(seed if valid_seed else None)
    # One uniform draw in [0, 1) per element; the result is 1 where the draw falls below p.
    draws = np.random.rand(probs.size).reshape(probs.shape)
    return Tensor((draws < probs).astype(np.int64))
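
The sampling rule itself does not depend on NumPy or MindSpore. The following is a minimal standard-library sketch of the same idea: compare one uniform draw with each probability. The function name `bernoulli_sample` is made up for illustration and is not part of the PR.

```python
import random


def bernoulli_sample(probs, seed=None):
    """Draw one 0/1 value per probability in `probs` (a flat list of floats)."""
    rng = random.Random(seed)  # local generator; leaves the global state alone
    # rng.random() lies in [0.0, 1.0), so p = 0 never fires and p = 1 always does.
    return [1 if rng.random() < p else 0 for p in probs]


print(bernoulli_sample([0.0, 1.0, 1.0], seed=1))  # [0, 1, 1]

# The empirical mean of many draws should approach p.
draws = bernoulli_sample([0.3] * 10000, seed=0)
print(sum(draws) / len(draws))
```

Using a local `random.Random` instance instead of reseeding a global generator is a design choice of this sketch; it keeps repeated calls from interfering with other random code.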

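3. multinomial

Modeled on torch.multinomial(input, num_samples, replacement=False, out=None) → LongTensor: returns a tensor in which each row contains num_samples indices sampled from the multinomial distribution defined by the corresponding row of input.
The multinomial distribution extends the binomial distribution: a trial no longer has two possible outcomes but several (A1, A2, …, Ak), each with its own probability (P1, P2, …, Pk), and the probabilities sum to 1 (P1+P2+…+Pk=1). In multinomial, the values in each input row do not need to sum to 1, but they must be non-negative and their sum must not be 0. Sampled indices are ordered from left to right in the order they were drawn (the first sample goes in the first column).
If input is a vector, out is a vector of length num_samples. If input is a matrix with m rows, out is a matrix of shape m×num_samples.
If replacement is True, samples are drawn with replacement. Otherwise, an index cannot be drawn more than once within a row.
Without replacement, num_samples must not exceed the number of non-zero elements in input (in each row, if input is a matrix). In MindSpore it is implemented as follows:
Args:
input_tensor: the input tensor containing the probabilities.
num_samples: the number of samples to draw.
replacement: whether to draw with replacement.
out: the output tensor.
seed: the random seed (0 to 2**32 - 1).
Returns:
out (Tensor) of sampled indices, with num_samples entries per input row.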

def multinomial(
     input_tensor, num_samples, replacement=False, out=None, seed = None
):
    r"""
    Indices are ordered from left to right according to when each was sampled (first samples
    are placed in first column), input_tensor is the probability weight to be processed.
    If input is a vector, out is a vector of size num_samples. If input is a matrix with m rows,
    out is a matrix of shape :math:`(m \times \text{num\_samples})`. If replacement is
    True, samples are drawn with replacement. If not, they are drawn without replacement,
    which means that when a sample index is drawn for a row, it cannot be drawn again for that row.
 
    Args:
        input_tensor (Union[Tensor, int, float]): the input tensor containing probabilities.
        num_samples (int): number of samples to draw.
        replacement (bool, optional): whether to draw with replacement or not.
        out (Tensor, optional): the output tensor.
        seed (int, optional): set the random seed (0 to 2**32 - 1). Default: None.

    Returns:
        out (Tensor), the sampled indices, with `num_samples` entries per row of input_tensor.

    Raises:
        RuntimeError: If `input_tensor` has more than 2 dimensions, or if, without replacement,
            `num_samples` is greater than the number of non-zero weights in a row.
        TypeError: If dtype of the input_tensor is not int or float.
 
 
 
    Supported Platforms:
        ``Ascend`` ``GPU`` ``CPU``
 
    Examples:
        >>> import mindspore
        >>> import numpy as np
        >>> from mindspore import Tensor
        >>> from mindspore.tensor_fun import multinomial
        >>> input_x = Tensor(np.array([1, 2, 1, 4, 5, 6, 7, 8]), mindspore.int8)
        >>> output = multinomial(input_x, 4, seed = 0)
        >>> print(output)
        [5 7 3 6]
        >>> input_p = Tensor(np.array([0.0, 1.0, 1.0]), mindspore.float32)
        >>> output = multinomial(input_p, 5, replacement = True, seed = 1)
        >>> print(output)
        [1 2 1 1 1]
 
    """
    np.random.seed(seed)
 
    input_tensor = np.array(input_tensor, dtype=numpy.float32)
 
    if input_tensor.ndim > 2:
        raise RuntimeError("prob_dist must be 1 or 2 dim")
    out0 = []
 
    def _multi(_input_tensor, num_samples, replacement = replacement):
        out = []
        if replacement:
            Boat = np.arange(len(_input_tensor))
            Prob = np.array(_input_tensor)
            if sum(Prob) != 1:
                Prob = Prob / sum(Prob)
            n = len(Prob)
            Qu = np.zeros(n + 1)
            Qu[0] = 0
 
            for i in range(n):
                Qu[i + 1] = Qu[i] + Prob[i]  # generate probability interval
            # The last bound exceeds 1 so that a draw r close to 1 still falls in the final interval
            Qu[n] = 1.01
 
            for k in range(num_samples):
                r = np.random.rand(1)  # generate a [0,1] random variable
                for i in range(n):
                    if r >= Qu[i] and r < Qu[i + 1]:
                        X = Boat[i]
                        out.append(X % len(_input_tensor))
        else:
            if len(_input_tensor) - int((_input_tensor == 0).sum()) < num_samples:
                raise RuntimeError(
                    "cannot draw more samples than there are non-zero weights without replacement"
                )
            for k in range(num_samples):
                Boat = np.arange(len(_input_tensor))
                # probability distribution
                Prob = np.array(_input_tensor)
                if sum(Prob) != 1:
                    Prob = Prob / sum(Prob)
                n = len(Prob)
                Qu = np.zeros(n + 1)
                Qu[0] = 0
 
                for i in range(n):
                    Qu[i + 1] = Qu[i] + Prob[i]
                Qu[n] = 1.01
                r = np.random.rand(1)
                for i in range(n):
                    if r >= Qu[i] and r < Qu[i + 1]:
                        X = Boat[i]
                        out.append(X)
                        _input_tensor[i] = 0
        return out
 
    if input_tensor.ndim == 1:
        return mindspore.Tensor(_multi(input_tensor, num_samples, replacement = replacement))
    elif input_tensor.ndim == 2:
        for item in input_tensor:
            out0.append(_multi(item, num_samples, replacement = replacement))
        return mindspore.Tensor(out0)
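
The interval lookup in `_multi` is inverse-CDF sampling: build the cumulative weights, draw a uniform number, and find the interval it falls into. Below is a compact standard-library sketch of the same technique using a binary search instead of a linear scan. The names `draw_index` and `multinomial_sample` are illustrative only, and the sketch raises ValueError where the function above raises RuntimeError.

```python
import bisect
import itertools
import random


def draw_index(weights, rng):
    """Pick one index with probability proportional to its weight."""
    cumulative = list(itertools.accumulate(weights))
    r = rng.random() * cumulative[-1]  # uniform in [0, total)
    # First position whose cumulative weight is strictly greater than r;
    # zero-weight entries own an empty interval and are never selected.
    return bisect.bisect_right(cumulative, r)


def multinomial_sample(weights, num_samples, replacement=False, seed=None):
    """Sample `num_samples` indices from one row of non-negative weights."""
    rng = random.Random(seed)
    weights = list(weights)  # work on a copy so the caller's data is untouched
    if not replacement and sum(w > 0 for w in weights) < num_samples:
        raise ValueError("not enough non-zero weights to sample without replacement")
    out = []
    for _ in range(num_samples):
        idx = draw_index(weights, rng)
        out.append(idx)
        if not replacement:
            weights[idx] = 0  # an index can be drawn only once
    return out


print(multinomial_sample([1, 2, 1, 4, 5, 6, 7, 8], 4, seed=0))
print(multinomial_sample([0.0, 1.0, 1.0], 5, replacement=True, seed=1))
```

Scaling the uniform draw by the total weight removes the need to normalize the weights first, which is why unnormalized inputs are acceptable.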

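4. poisson

Returns a tensor of the same size as the input, with each element sampled from a Poisson distribution.
The Poisson distribution describes the number of times a random event occurs per unit of time. Examples include the number of service requests a facility receives in a given period, the number of calls arriving at a telephone exchange, the number of passengers waiting at a bus stop, the number of machine failures, the number of natural disasters, the number of mutations in a DNA sequence, the number of decays of radioactive nuclei, and the photon-count distribution of a laser.
The probability mass function of the Poisson distribution is:
P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}, k = 0, 1, 2, …
In MindSpore it is implemented as follows:
Args:
input_tensor: the input tensor, containing the rate parameter lambda of the Poisson distribution.
seed: the random seed (0 to 2**32 - 1).
Returns:
out: a tensor with the same shape as input_tensor.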

def poisson( input_tensor, seed = None):
    r"""
    Returns a tensor of the same size as `input` with each element sampled from a Poisson
    distribution with rate parameter given by the corresponding element in `input` i.e.,
    :math:`\text{out}_i \sim \text{Poisson}(\text{input}_i)`

    Args:
        input_tensor (Union[Tensor, int, float]): the input tensor containing the variable lambda in poisson distribution.
        seed (int, optional): set the random seed (0 to 2**32 - 1).

    Returns:
        out (Tensor), with the same shape as input_tensor.
 
    Raises:
        TypeError: If dtype of the input_tensor is not int or float.
 
    Supported Platforms:
        ``Ascend`` ``GPU`` ``CPU``
 
    Examples:
        >>> import mindspore
        >>> import numpy as np
        >>> from mindspore import Tensor
        >>> from mindspore.tensor_fun import poisson
        >>> input_x = Tensor(np.array([[1, 2, 3], [4, 5, 6]]), mindspore.int8)
        >>> output = poisson(input_x, seed = 0)
        >>> print(output)
            [[2 3 7]
            [1 9 7]]
        >>> input_p = Tensor(np.array([1.0, 2.0, 3.0]), mindspore.float32)
        >>> output = poisson(input_p, seed = 0)
        >>> print(output)
            [2 3 7]
    """
    # Convert Tensor inputs to NumPy; plain scalars, lists and arrays pass through np.asarray.
    if isinstance(input_tensor, Tensor):
        lam = input_tensor.asnumpy()
    else:
        lam = np.asarray(input_tensor)
    np.random.seed(seed)
    # np.random.poisson samples element-wise, so the result keeps the shape of lam.
    return Tensor(np.asarray(np.random.poisson(lam)))


Figure 2: Poisson distribution
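
To make the probability mass function concrete, here is a small standard-library sketch that evaluates it and draws samples with Knuth's multiplication method. This is a different technique from the np.random.poisson call used above, shown only for illustration, and it is practical only for small rates. The function names are hypothetical.

```python
import math
import random


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with rate `lam`."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def poisson_sample(lam, rng):
    """Knuth's multiplication method; suitable for small `lam` only."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    # Count how many uniform draws can be multiplied before the product drops to e^-lam.
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


rng = random.Random(0)
samples = [poisson_sample(4.0, rng) for _ in range(20000)]
print(sum(samples) / len(samples))                  # close to the rate 4.0
print(sum(poisson_pmf(k, 3.0) for k in range(50)))  # close to 1.0
```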

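5. rand_like

Returns a tensor of the same size as the input tensor, filled with random numbers from the interval [0, 1).
In MindSpore it is implemented as follows:
Args:
input_tensor: the input tensor.
seed: the random seed (0 to 2**32 - 1).
Returns:
out: a tensor with the same shape as input_tensor.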

def rand_like( input_tensor, seed = None):
    r"""
    Returns a tensor with the same size as input that is filled with 
    random numbers from a uniform distribution on the interval [0, 1)
 
    Args:
        input_tensor (Union[Tensor, int, float]): the input tensor.
        seed (int, option): set the random seed (0 to 2**32).
 
    Returns:
        out (Union[Tensor, float]), with the same shape as input_tensor.
 
    Raises:
        TypeError: If dtype of the input_tensor is not int or float.
 
    Supported Platforms:
        ``Ascend`` ``GPU`` ``CPU``
 
    Examples:
        >>> import mindspore
        >>> import numpy as np
        >>> from mindspore import Tensor
        >>> from mindspore.tensor_fun import rand_like
        >>> input_x = Tensor(np.array([[1, 2, 3, 9],[1, 2, 3, 9]]), mindspore.int8)
        >>> output = rand_like(input_x, seed = 0)
        >>> print(output)
            [[0.5488135  0.71518937 0.60276338 0.54488318]
             [0.4236548  0.64589411 0.43758721 0.891773  ]]
        >>> input_p = Tensor(np.array([1.0, 2.0, 3.0]), mindspore.float32)
        >>> output = rand_like(input_p, seed = 0)
        >>> print(output)
            [0.5488135  0.71518937 0.60276338]
    """
    input_tensor = np.array(input_tensor)
    np.random.seed(seed)
    # Draw one uniform sample in [0, 1) per input element, in the input's shape.
    return Tensor(np.random.rand(input_tensor.size).reshape(input_tensor.shape))

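6. randint_like

Returns a tensor of the same size as the input tensor, filled with random integers from the interval [low, high).
If only one int is passed, it is taken as high and low defaults to 0. If two are passed, they are the two bounds of the interval; the implementation swaps them when low is greater than high. In MindSpore it is implemented as follows:
Args:
input_tensor: its size determines the size of the output tensor.
low: the lowest integer that can be drawn. Default: 0.
high: one above the highest integer that can be drawn.
seed: the random seed (0 to 2**32 - 1).
Returns:
out: a tensor with the same shape as input_tensor.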

def randint_like( input_tensor, high, low=0, seed = None):
    r"""
    Returns a tensor with the same size as the input tensor, filled with
    random integers from the half-open interval [low, high).
    If only one int is passed, it is used as high and low defaults to 0.
    If two ints are passed, they are the two bounds; they are swapped when low > high.

    Args:
        input_tensor (Union[Tensor, int, float]): the size of input will determine size of the output tensor.
        high (int): One above the highest integer to be drawn from the distribution.
        low (int, optional): Lowest integer to be drawn from the distribution. Default: 0.
        seed (int, optional): set the random seed (0 to 2**32 - 1).
 
    Returns:
        out (Union[Tensor, int]), with the same shape as input_tensor.
 
    Raises:
        TypeError: If dtype of the input_tensor is not int or float.
 
    Supported Platforms:
        ``Ascend`` ``GPU`` ``CPU``
 
    Examples:
        >>> import mindspore
        >>> import numpy as np
        >>> from mindspore import Tensor
        >>> from mindspore.tensor_fun import randint_like
        >>> input_x = Tensor(np.array([1., 2., 3., 4., 5.]), mindspore.float32)
        >>> output = randint_like(input_x, 20, seed = 0)
        >>> print(output)
            [12 15  0  3  3]
        >>> output = randint_like(input_x, 20, 100, seed = 0)
        >>> print(output)
            [64 67 84 87 87]
    """
    input_tensor = np.array(input_tensor)
    shape_ = input_tensor.shape
    input_tensor = input_tensor.reshape(-1)
    if low > high:
        high,low = low,high
    x = len(input_tensor)
    np.random.seed(seed)
    return Tensor(np.array([np.random.randint(low,high) for i in range(x)]).reshape(shape_))

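7. randn_like

Takes a tensor as input and returns a tensor of the same size, filled with random numbers drawn from a normal distribution with mean 0 and variance 1. In MindSpore it is implemented as follows:
Args:
input_tensor: its size determines the size of the output tensor.
seed: the random seed (0 to 2**32 - 1).
Returns:
out: a tensor with the same shape as input_tensor.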

def randn_like( input_tensor, seed = None):
    r"""
    Returns a tensor with the same size as input that is filled with random 
    numbers from a normal distribution with mean 0 and variance 1. 
 
    Args:
        input_tensor (Union[Tensor, int, float]): the size of input will determine size of the output tensor.
        seed (int, optional): set the random seed (0 to 2**32).
 
    Returns:
        out (Tensor), with the same shape as input_tensor, filled with floating-point samples.
 
    Raises:
        TypeError: If dtype of the input_tensor is not int or float.
 
    Supported Platforms:
        ``Ascend`` ``GPU`` ``CPU``
 
    Examples:
        >>> import mindspore
        >>> import numpy as np
        >>> from mindspore import Tensor
        >>> from mindspore.tensor_fun import randn_like
        >>> input_x = Tensor(np.array([1., 2., 3., 4., 5.]), mindspore.float32)
        >>> output = randn_like(input_x, seed = 0)
        >>> print(output)
            [1.7640524 0.4001572 0.978738  2.2408931 1.867558 ]
        >>> input_p = Tensor(np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]), mindspore.int32)
        >>> output = randn_like(input_p, seed = 0)
        >>> print(output)
            [[ 1.7640524   0.4001572   0.978738    2.2408931   1.867558  ]
             [-0.9772779   0.95008844 -0.1513572  -0.10321885  0.41059852]]
    """
    input_tensor = np.array(input_tensor)
    shape_ = input_tensor.shape
    input_tensor = input_tensor.reshape(-1)
    x = len(input_tensor)
    np.random.seed(seed)
    return Tensor([np.random.randn() for i in range(x)]).reshape(shape_)
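
rand_like, randint_like and randn_like share one pattern: keep the shape of the input and replace every element with a fresh sample. The sketch below shows that pattern with nested Python lists and the standard library only; `fill_like` is a hypothetical helper written for this illustration, not something from the PR.

```python
import random


def fill_like(template, sampler):
    """Return a nested list shaped like `template`, each leaf replaced by sampler()."""
    if isinstance(template, (list, tuple)):
        return [fill_like(item, sampler) for item in template]
    return sampler()


rng = random.Random(0)
template = [[1, 2, 3], [4, 5, 6]]

uniform = fill_like(template, rng.random)                       # values in [0, 1)
integers = fill_like(template, lambda: rng.randrange(20, 100))  # values in [20, 100)
normal = fill_like(template, lambda: rng.gauss(0.0, 1.0))       # mean 0, variance 1

print(uniform)
print(integers)
print(normal)
```

Only the sampler changes between the three variants, which is also why the three MindSpore functions above have nearly identical bodies.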

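8. sparse_coo_tensor

In torch.sparse_coo_tensor, indices (array_like) is the initial data for the tensor; it can be a list, tuple, NumPy array, scalar, or another type. The indices are the coordinates of the non-zero values in the matrix, so they should be two-dimensional, where the first dimension is the number of tensor dimensions and the second is the number of non-zero values. MindSpore's COOTensor uses the transposed layout, [N, ndims], as listed below.
In MindSpore it is implemented as follows:
Args:
indices (Tensor): a 2-D integer tensor of shape [N, ndims], where N and ndims are the number of values and the number of dimensions in the COOTensor, respectively.
values (Tensor): a 1-D tensor of any type with shape [N], which supplies the value for each element in indices.
shape (tuple(int)): an integer tuple of size ndims that specifies the dense_shape of the sparse tensor.
coo_tensor (COOTensor): a COOTensor object.
Returns:
COOTensor, composed of indices, values, and shape.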

def sparse_coo_tensor( indices=None, values=None, shape=None, coo_tensor=None):
    r"""
    A sparse representation of a set of nonzero elements from a tensor at given indices.
    For a tensor dense, its COOTensor(indices, values, shape) has
    `dense[indices[i]] = values[i]`.
 
    Args:
        indices (Tensor): A 2-D integer Tensor of shape `[N, ndims]`,
            where N and ndims are the number of `values` and number of dimensions in
            the COOTensor, respectively. Currently, `ndims` must be 2.
            Please make sure that the indices are in range of the given shape.
        values (Tensor): A 1-D tensor of any type and shape `[N]`, which
            supplies the values for each element in `indices`.
        shape (tuple(int)): A integer tuple of size `ndims`,
            which specifies the dense_shape of the sparse tensor.
        coo_tensor (COOTensor): A COOTensor object.
 
    Returns:
        COOTensor, composed of `indices`, `values`, and `shape`.
 
    Supported Platforms:
        ``Ascend`` ``GPU`` ``CPU``
 
    Examples:
        >>> import mindspore as ms
        >>> import mindspore.nn as nn
        >>> from mindspore import Tensor
        >>> from mindspore.tensor_fun import sparse_coo_tensor
        >>> indices = Tensor([[0, 1], [1, 2]], dtype=ms.int32)
        >>> values = Tensor([1, 2], dtype=ms.float32)
        >>> shape = (3, 4)
        >>> x = sparse_coo_tensor(indices, values, shape)
        >>> print(x.values)
        [1. 2.]
        >>> print(x.indices)
        [[0 1]
        [1 2]]
        >>> print(x.shape)
        (3, 4)
 
    """
    return mindspore.COOTensor(indices, values, shape, coo_tensor)
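
The relation `dense[indices[i]] = values[i]` can be checked by expanding the COO triple into a dense matrix. Here is a minimal pure-Python sketch for the 2-D case, using the same data as the docstring example; `coo_to_dense` is an illustrative name, not a MindSpore API.

```python
def coo_to_dense(indices, values, shape):
    """Expand a 2-D COO description into a dense list of rows."""
    rows, cols = shape
    dense = [[0.0] * cols for _ in range(rows)]
    # Each (row, col) pair in `indices` names the position of one stored value.
    for (row, col), value in zip(indices, values):
        dense[row][col] = value
    return dense


dense = coo_to_dense([[0, 1], [1, 2]], [1.0, 2.0], (3, 4))
for row in dense:
    print(row)
# [0.0, 1.0, 0.0, 0.0]
# [0.0, 0.0, 2.0, 0.0]
# [0.0, 0.0, 0.0, 0.0]
```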

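9. as_tensor

Converts data into a MindSpore tensor.
Args:
data (array_like): the initial data for the tensor. It can be a list, tuple, NumPy ndarray, scalar, or another type.
dtype (mindspore.dtype, optional): the desired data type of the returned tensor.
Returns:
A tensor containing the data, with a MindSpore data type.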

def as_tensor( data, dtype=None):
    r"""
    convert data to tensor in mindspore.
 
    Args:
        data (array_like): Initial data for the tensor. Can be a list, tuple, 
                           NumPy ndarray, scalar, and other types.
        dtype (mindspore.dtype, optional): the desired data type of returned tensor. 
                                       Default: if None, infers data type from data.
 
    Returns:
        Tensor contains the data and the dtype is in mindspore.
 
    Supported Platforms:
        ``Ascend`` ``GPU`` ``CPU``
 
    Examples:
        >>> import numpy as np
        >>> import mindspore as ms
        >>> from mindspore import Tensor
        >>> from mindspore.tensor_fun import as_tensor
        >>> input_data = np.array([1, 2, 3])
        >>> ms_tensor = as_tensor(input_data)
        >>> ms_tensor
        Tensor(shape=[3], dtype=Int64, value= [1, 2, 3])
    """
    return mindspore.Tensor(data,dtype=dtype)

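10. as_strided

Creates a view of a tensor. Several elements of the created tensor may refer to the same memory location, so memory is shared. In PyTorch, many functions that return a tensor view are implemented internally through this function, which takes:
stride (tuple or ints): the strides of the output tensor
storage_offset (int, optional): the offset in the underlying storage of the output tensor
In MindSpore it is implemented as follows:
Args:
x: the input tensor.
shape: the shape of the output tensor.
strides: the strides of the output tensor, counted in elements.
subok: forwarded to numpy.lib.stride_tricks.as_strided, where it controls whether array subclasses are preserved; it is not a storage offset.
writeable: forwarded to numpy.lib.stride_tricks.as_strided.
Returns:
A tensor laid out with the given shape and strides.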

def as_strided(x, shape=None, strides=None, subok=False, writeable=True):
    r"""
    Builds a tensor from `x` with the specified :attr:`shape` and :attr:`strides`,
    modeled on torch.as_strided(input, size, stride).

    Args:
        x (Union[Tensor, numpy.ndarray]): the input tensor.
        shape (tuple or ints): the shape of the output tensor.
        strides (tuple or ints): the strides of the output tensor, in elements.
        subok (bool, optional): passed through to numpy.lib.stride_tricks.as_strided. Default: False.
        writeable (bool, optional): passed through to numpy.lib.stride_tricks.as_strided. Default: True.

    Returns:
        Tensor with the requested shape and strides applied.

    Supported Platforms:
        ``Ascend`` ``GPU`` ``CPU``

    Examples:
        >>> import numpy as np
        >>> from mindspore import Tensor
        >>> from mindspore.tensor_fun import as_strided
        >>> X = np.arange(9, dtype=np.int32).reshape(3, 3)
        >>> output = as_strided(X, (2, 2), (1, 1))
        >>> print(output)
        [[0 1]
         [1 2]]
    """
    dtype_ = mindspore.Tensor(x).dtype
    arr = x.asnumpy() if isinstance(x, mindspore.Tensor) else np.array(x)
    arr = np.ascontiguousarray(arr)
    # NumPy expects strides in bytes, so scale the element strides by the item size.
    if strides is not None:
        strides = tuple(int(s) * arr.itemsize for s in np.atleast_1d(strides))
    view = np.lib.stride_tricks.as_strided(arr, shape, strides, subok, writeable)
    # Lay the strided view out contiguously before handing it to MindSpore.
    return mindspore.Tensor(np.ascontiguousarray(view), dtype=dtype_)
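
What a stride means is easiest to see on a flat list: element (i, j) of the result is read from position offset + i * row_stride + j * col_stride of the storage. The following pure-Python sketch reproduces the docstring example this way. The helper `strided_view` is hypothetical, works in element strides rather than bytes, and builds a new nested list instead of a true view.

```python
def strided_view(flat, shape, strides, offset=0):
    """Materialise a 2-D strided view of `flat`; strides are in elements, not bytes."""
    rows, cols = shape
    row_stride, col_stride = strides
    return [
        [flat[offset + i * row_stride + j * col_stride] for j in range(cols)]
        for i in range(rows)
    ]


storage = list(range(9))  # the 3x3 example, flattened row by row
print(strided_view(storage, (2, 2), (1, 1)))  # [[0, 1], [1, 2]]
# Swapping the usual strides (3, 1) to (1, 3) reads the matrix transposed.
print(strided_view(storage, (3, 3), (1, 3)))  # [[0, 3, 6], [1, 4, 7], [2, 5, 8]]
```

With strides (1, 1), the positions 1 appears twice in the output because two index pairs map to the same storage slot, which is the sharing the section describes.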

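11. frombuffer

Creates a one-dimensional tensor from a Python buffer.
Skips the first offset bytes of the buffer and interprets the remaining raw bytes as a one-dimensional tensor of type dtype containing count elements. When count is -1, all elements are taken.
In torch.frombuffer, the returned tensor and the buffer share the same memory, so modifications to the tensor are reflected in the buffer and vice versa. The implementation below reads the buffer as float64 and then converts the result to dtype. In MindSpore it is implemented as follows:
Args:
buffer: a Python object that exposes the buffer interface.
dtype: the desired data type of the returned tensor.
count: the number of elements to read. If negative, all elements (until the end of the buffer) are read. Default: -1.
offset: the number of bytes to skip at the start of the buffer. Default: 0.
Returns:
A one-dimensional tensor obtained from the buffer.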

def frombuffer( buffer, dtype=mindspore.float64, count=-1, offset=0):
    r"""
    Creates a 1-dimensional :class:`Tensor` from an object that implements
    the Python buffer protocol.
    Skips the first :attr:`offset` bytes in the buffer, reads :attr:`count` elements
    from the remaining raw bytes as float64, and converts them to :attr:`dtype`.

    Args:
        buffer (object): a Python object that exposes the buffer interface and holds float64 data.
        dtype (mindspore.dtype): the desired data type of returned tensor. Default: mindspore.float64.
        count (int, optional): the number of desired elements to be read. If negative,
                                all the elements (until the end of the buffer) will be read. Default: -1.
        offset (int, optional): the number of bytes to skip at the start of the buffer. Default: 0.

    Returns:
        a 1-dimensional Tensor built from the contents of the buffer.
 
    Supported Platforms:
        ``Ascend`` ``GPU`` ``CPU``
 
    Examples:
        >>> import numpy as np
        >>> from mindspore import Tensor
        >>> from mindspore.tensor_fun import frombuffer
        >>> from array import array
        >>> input_array = array("d", [1, 2, 3, 4]) 
        >>> input_array
        array('d', [1.0, 2.0, 3.0, 4.0])
        >>> output = frombuffer(input_array, mindspore.int32)
        >>> print(output)
        [1 2 3 4]
    """
    res = numpy.frombuffer(buffer = buffer, dtype = np.float64, count = count, offset = offset)
    result = mindspore.Tensor(res,dtype=dtype)
    return result
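
The buffer semantics described in this section (byte offset, element count, shared memory) can be tried out with the standard library alone. The sketch below uses `memoryview` over an `array.array`; `view_from_buffer` is an illustrative helper, not the MindSpore function, and it returns a memoryview rather than a Tensor.

```python
from array import array


def view_from_buffer(buffer, typecode="d", count=-1, offset=0):
    """Reinterpret raw bytes as items of `typecode` without copying."""
    raw = memoryview(buffer).cast("B")   # flat view of the underlying bytes
    itemsize = array(typecode).itemsize  # bytes per element for this type code
    end = len(raw) if count < 0 else offset + count * itemsize
    return raw[offset:end].cast(typecode)


source = array("d", [1.0, 2.0, 3.0, 4.0])
# Skip one 8-byte double, then read two elements.
view = view_from_buffer(source, "d", count=2, offset=8)
print(view.tolist())  # [2.0, 3.0]

view[0] = 20.0        # writes through to the original buffer
print(source[1])      # 20.0
```

Because nothing is copied, a write through the view is visible in the source array, which is the sharing behaviour torch.frombuffer documents.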

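12. empty_strided

In MindSpore it is implemented as follows:
Args:
size: the shape of the output tensor
stride: the strides of the output tensor
dtype: the data type of the returned tensor
Returns:
A tensor filled with uninitialized data.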

def empty_strided( size, stride, dtype = mindspore.float64):
    r"""
    Creates a tensor with the specified :attr:`size` and :attr:`stride` and filled with undefined data.
 
    Args:
        size (tuple of python:ints): the shape of the output tensor.
        stride (tuple of python:ints): the strides of the output tensor.
        dtype (mindspore.dtype, optional): the desired data type of returned tensor.
 
    Returns:
         a tensor with the specified size and stride and filled with undefined data.
 
    Supported Platforms:
        ``Ascend`` ``GPU`` ``CPU``
 
    Examples:
        >>> from mindspore import Tensor
        >>> from mindspore.tensor_fun import empty_strided
        >>> size = (3, 3)
        >>> stride = (1, 3)
        >>> output = empty_strided(size, stride)
        >>> print(output)
        [[-0.01712059 -0.00691067  0.01395389]
         [-0.00099344 -0.0125051  -0.0113176 ]
         [ 0.00223543  0.00039709  0.        ]]
    """
    tensor_ = mindspore.Tensor(shape=size, dtype=mindspore.float32, init=Normal())
    tensor1_ = tensor_.resize(-1)
    item_ls = as_strided(tensor_, shape=size, strides = stride).resize(-1)
    for i in range(size[0]):
        for j in range(size[1]):
            if tensor_[i][j] not in item_ls:
                tensor_[i][j] = 0.0
    return mindspore.Tensor(tensor_,dtype = dtype)
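
A size/stride pair determines both where each element lives in the flat storage and how much storage is needed. The pure-Python sketch below works this out for the (3, 3) / (1, 3) example; the helper names are assumptions made for this illustration and strides are counted in elements.

```python
def storage_size(size, stride):
    """Smallest number of elements a buffer needs to back the given size/stride pair."""
    if any(n == 0 for n in size):
        return 0
    # The farthest element sits at sum((n - 1) * s); add one to turn an index into a length.
    return 1 + sum((n - 1) * s for n, s in zip(size, stride))


def element_offset(index, stride):
    """Position of a multi-dimensional index inside the flat storage."""
    return sum(i * s for i, s in zip(index, stride))


print(storage_size((3, 3), (1, 3)))    # 9
print(element_offset((2, 1), (1, 3)))  # 5
```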

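13. randperm

Returns a random permutation of the integers from 0 to n-1.
In MindSpore it is implemented as follows:
Args:
n: the upper bound (exclusive).
out: the output Tensor.
Returns:
A random permutation of the integers from 0 to n-1.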

def randpermn( n, out=None):
    r"""
    Returns a random permutation of integers from ``0`` to ``n - 1``.
 
    Args:
        n (int): the upper bound (exclusive).
        out (Tensor): the output Tensor.
 
    Returns:
        a random permutation of integers from ``0`` to ``n - 1``.
 
    Supported Platforms:
        ``Ascend`` ``GPU`` ``CPU``
 
    Examples:
        >>> from mindspore import Tensor
        >>> from mindspore.tensor_fun import randpermn
        >>> n = 6
        >>> output = randpermn(n)
        >>> print(output)
        [3, 2, 1, 4, 5, 0]
    """
    out = [i for i in range(n)]
    np.random.shuffle(out)
    return Tensor(out)
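
np.random.shuffle hides the shuffling step. For reference, here is a standard-library sketch of the Fisher-Yates shuffle, a common way to produce a uniformly random permutation. It is not claimed to be NumPy's exact procedure, and `random_permutation` is a hypothetical name.

```python
import random


def random_permutation(n, seed=None):
    """Fisher-Yates shuffle of the integers 0..n-1."""
    rng = random.Random(seed)
    perm = list(range(n))
    # Walk from the end, swapping each position with a random earlier (or the same) one.
    for i in range(n - 1, 0, -1):
        j = rng.randint(0, i)  # inclusive on both ends
        perm[i], perm[j] = perm[j], perm[i]
    return perm


perm = random_permutation(6, seed=0)
print(perm)
print(sorted(perm))  # [0, 1, 2, 3, 4, 5]
```

Sorting the result and comparing it with range(n) is a cheap way to test a permutation function without depending on the random order.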

Step 2 Write the UT and ST tests and the test scripts

The code is as follows:

from tensor_fun import *
import tensor_fun
import mindspore
from mindspore import Tensor
import numpy as np
import pytest
import logging
from array import array
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
 
def test_bernoulli_fun():
    """
    test the function of bernoulli_fun.
    """
    logger.info("test_bernoulli_fun")
    input_x = Tensor(np.array([1, 2, 3]), mindspore.int64)
    output = tensor_fun.bernoulli_fun(input_x, seed = 1)
    assert output.shape == (3,)
    assert output.dtype == mindspore.int64
    assert isinstance(output, Tensor)
    
def test_multinomial():
    """
    test the function of multinomial.
    """
    logger.info("test_multinomial")
    input_x = Tensor(np.array([1, 2, 1, 4, 5, 6, 7, 8]), mindspore.int64)
    output = multinomial(input_x, 4, seed = 0)
    assert output.shape == (4,)
    assert output.dtype == mindspore.int64
    assert isinstance(output, Tensor)
    
def test_poisson():
    """
    test the function of poisson.
    """
    logger.info("test_poisson")
    input_x = Tensor(np.array([[1, 2, 3], [4, 5, 6]]), mindspore.int64)
    output = poisson(input_x, seed = 0)
    assert output.shape == (2, 3)
    assert output.dtype == mindspore.int64
    assert isinstance(output, Tensor)
    
def test_rand_like():
    """
    test the function of rand_like.
    """
    logger.info("test_rand_like")
    input_x = Tensor(np.array([[1, 2, 3, 9],[1, 2, 3, 9]]), mindspore.int8)
    output = rand_like(input_x, seed = 0)
    assert output.shape == (2, 4)
    assert output.dtype == mindspore.float64
    assert isinstance(output, Tensor)
    
def test_randint_like():
    """
    test the function of randint_like.
    """
    logger.info("test_randint_like")
    input_x = Tensor(np.array([1., 2., 3., 4., 5.]), mindspore.float32)
    output = randint_like(input_x, 20, 100, seed = 0)
    assert output.shape == (5, )
    assert output.dtype == mindspore.int64
    assert isinstance(output, Tensor)
    assert output[output.argmax()] <= 100
    assert output[output.argmin()] >= 20
    
def test_randn_like():
    """
    test the function of randn_like.
    """
    logger.info("test_randn_like")
    input_p = Tensor(np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]), mindspore.int32)
    output = randn_like(input_p, seed = 0)
    assert output.shape == (2, 5)
    assert output.dtype == mindspore.float32
    assert isinstance(output, Tensor)
    
def test_sparse_coo_tensor():
    """
    test the function of sparse_coo_tensor.
    """
    logger.info("test_sparse_coo_tensor")
    indices = Tensor([[0, 1], [1, 2]], dtype=mindspore.int32)
    values = Tensor([1, 2], dtype=mindspore.float32)
    shape = (3, 4)
    output = sparse_coo_tensor(indices, values, shape)
    cmp = Tensor([1,2])
    assert output.shape == (3, 4)
    assert output.dtype == mindspore.float32
    assert (output.values==cmp).sum()==Tensor(2)
    
def test_as_tensor():
    """
    test the function of as_tensor.
    """
    logger.info("test_as_tensor")
    input_data = np.array([1, 2, 3])
    ms_tensor = as_tensor(input_data)
    assert ms_tensor.shape == (3, )
    assert ms_tensor.dtype == mindspore.int64
    assert isinstance(ms_tensor, Tensor)
    
def test_as_strided():
    """
    test the function of as_strided.
    """
    logger.info("test_as_strided")
    X = np.arange(9, dtype=np.int32).reshape(3,3)
    output = as_strided(X, (2, 2), (1, 1))
    assert output.shape == (2, 2)
    assert output.dtype == mindspore.int32
    assert isinstance(output, Tensor)
    
def test_frombuffer():
    """
    test the function of frombuffer.
    """
    logger.info("test_frombuffer")
    input_array = array("d", [1, 2, 3, 4]) 
    output = frombuffer(input_array, mindspore.int32)
    assert output.shape == (4, )
    assert output.dtype == mindspore.int32
    assert sum(output == Tensor([1, 2, 3, 4]))==Tensor(4)
    assert isinstance(output, Tensor)
 
def test_empty_strided():
    """
    test the function of empty_strided.
    """
    size = (3, 3)
    stride = (1, 3)
    output = empty_strided(size, stride)
    assert output.shape == (3, 3)
    assert output.dtype == mindspore.float64
    
def test_randpermn():
    """
    test the function of randpermn.
    """
    n = 6
    output = randpermn(n)
    for i in [0, 1, 2, 3, 4, 5]:
        assert i in output
    assert output.shape == (6, )
    assert isinstance(output, Tensor)
    
if __name__ == "__main__":
    test_as_strided()
    test_as_tensor()
    test_bernoulli_fun()
    test_empty_strided()
    test_frombuffer()
    test_multinomial()
    test_poisson()
    test_rand_like()
    test_randint_like()
    test_randn_like()
    test_randpermn()
    test_sparse_coo_tensor()
    logger.info("ut test end! all test passed!")

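Run the test script directly:
Figure 3: Running the test script
Or use pytest:
Figure 4: Testing with pytest

For local testing, you can also write an ST test and a matching sh file (the PR submitted this time does not need the sh script).
run_tensor_func.py: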

# Copyright 2022 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""Test Tensor create functions."""
import logging
from array import array

import numpy as np
import mindspore
from mindspore import Tensor
from mindspore.common.tensor_func import *

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
 
 
def run_rand_like():
    """
    run the function of rand_like.
    """
    logger.info("run_rand_like")
    input_x = Tensor(np.array([[1, 2, 3, 9],[1, 2, 3, 9]]), mindspore.int8)
    output = rand_like(input_x, seed = 0)
    expect_res = np.array([[5.48813504e-01, 7.15189366e-01, 6.02763376e-01, 5.44883183e-01],[4.23654799e-01, 6.45894113e-01, 4.37587211e-01, 8.91773001e-01]]).astype(np.float64)
    assert np.allclose(output.asnumpy(), expect_res)
 
 
def run_randint_like():
    """
    run the function of randint_like.
    """
    logger.info("run_randint_like")
    input_x = Tensor(np.array([1., 2., 3., 4., 5.]), mindspore.float32)
    output = randint_like(input_x, 20, 100, seed = 0)
    expect_res = np.array([64, 67, 84, 87, 87])
    assert np.allclose(output.asnumpy(), expect_res)
 
 
def run_randn_like():
    """
    run the function of randn_like.
    """
    logger.info("run_randn_like")
    input_p = Tensor(np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]), mindspore.int32)
    output = randn_like(input_p, seed = 0)
    expect_res = np.array([[1.76405239e+00, 4.00157213e-01, 9.78738010e-01, 2.24089313e+00, 1.86755800e+00],[-9.77277875e-01, 9.50088441e-01, -1.51357204e-01, -1.03218853e-01, 4.10598516e-01]])
    assert np.allclose(output.asnumpy(), expect_res)
 
 
def run_sparse_coo_tensor():
    """
    run the function of sparse_coo_tensor.
    """
    logger.info("run_sparse_coo_tensor")
    indices = Tensor([[0, 1], [1, 2]], dtype=mindspore.int32)
    values = Tensor([1, 2], dtype=mindspore.float32)
    shape = (3, 4)
    output = sparse_coo_tensor(indices, values, shape)
    out = output.values
    expect_res = np.array([1, 2])
    assert np.allclose(out.asnumpy(), expect_res)
 
 
def run_as_tensor():
    """
    run the function of as_tensor.
    """
    logger.info("run_as_tensor")
    input_data = np.array([1, 2, 3])
    ms_tensor = as_tensor(input_data)
    expect_res = np.array([1, 2, 3])
    assert np.allclose(ms_tensor.asnumpy(), expect_res)
 
 
def run_as_strided():
    """
    run the function of as_strided.
    """
    logger.info("run_as_strided")
    X = np.arange(9, dtype=np.int32).reshape(3,3)
    output = as_strided(X, (2, 2), (1, 1))
    expect_res = np.array([[0, 1],[1, 2]]).astype(np.int32)
    assert np.allclose(output.asnumpy(), expect_res)
 
 
def run_frombuffer():
    """
    run the function of frombuffer.
    """
    logger.info("run_frombuffer")
    input_array = array("d", [1, 2, 3, 4]) 
    output = frombuffer(input_array, mindspore.int32)
    expect_res = np.array([1, 2, 3, 4]).astype(np.int32)
    assert np.allclose(output.asnumpy(), expect_res)
 
 
def run_empty_strided():
    """
    run the function of empty_strided.
    """
    size = (3, 3)
    stride = (1, 3)
    output = empty_strided(size, stride)
    expect_res = np.zeros(3).astype(np.float64)
    assert np.allclose(output.asnumpy(), expect_res)
 
 
def run_randpermn():
    """
    run the function of randpermn.
    """
    n = 6
    output = randpermn(n)
    for i in [0, 1, 2, 3, 4, 5]:
        assert i in output
    assert output.shape == (6, )
    assert isinstance(output, Tensor)
 
 
if __name__ == "__main__":
    run_as_strided()
    run_as_tensor()
    run_empty_strided()
    run_frombuffer()
    run_rand_like()
    run_randint_like()
    run_randn_like()
    run_randpermn()
    run_sparse_coo_tensor()

run_tensor_func.sh:

#!/bin/bash
# Copyright 2022 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
set -e
BASE_PATH=$(cd "$(dirname $0)"; pwd)
rm -rf ${BASE_PATH}/tensor_func
mkdir ${BASE_PATH}/tensor_func
unset SLOG_PRINT_TO_STDOUT
cd ${BASE_PATH}/tensor_func
echo "start test tensor functions"
env > env.log
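python ../run_tensor_func.py > test_tensor_func.log 2>&1 &
process_pid=$!
# With `set -e`, a failing `wait` would abort the script before the status check,
# so capture the exit status explicitly.
status=0
wait ${process_pid} || status=$?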
if [ "${status}" != "0" ]; then
    echo "[ERROR] test failed. status: ${status}"
    exit 1
else
    echo "[INFO] test success."
fi
 
exit 0

Step 3 Submit the PR and refine it

Final solution:
https://gitee.com/mindspore/mindspore/pulls/42845
With this, the development work is complete.

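Step 4 Project results

This project added more than ten Tensor-creation methods to MindSpore. Tensors are the dominant data representation in deep learning: sound, position, images, and language are all stored as tensors, and in a neural network it is the chaining of tensor operations that carries a computation through to its final answer, whether that is a keyword or a pattern-recognition result. The project covers probabilistic creation methods such as bernoulli, poisson, and randperm, the sparse creation method sparse_coo_tensor, and several others, and wraps them as ready-to-use interfaces so that developers can create and manipulate tensors easily when building networks.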

Step 5 Appendix: function-by-function comparison

Reference: differences between MindSpore and PyTorch

  1. torch.from_numpy
  2. torch.bernoulli
  3. torch.multinomial
  4. torch.poisson
  5. torch.rand_like
  6. torch.randint_like
  7. torch.randn_like
  8. torch.sparse_coo_tensor
  9. torch.as_tensor
  10. torch.as_strided
  11. torch.frombuffer
  12. torch.empty_strided
  13. torch.randperm

(1) Differences from torch.from_numpy

torch.from_numpy

torch.from_numpy(ndarray)

For more details, see torch.from_numpy.

Example:
    >>> a = numpy.array([1, 2, 3])
    >>> t = torch.from_numpy(a)
    >>> t
    tensor([ 1,  2,  3])
    >>> t[0] = -1
    >>> a
    array([-1,  2,  3])

mindspore.Tensor.from_numpy

mindspore.Tensor.from_numpy(ndarray)

The two interfaces do not differ.

(2) Differences from torch.bernoulli

torch.bernoulli

torch.bernoulli(input, *, generator=None, out=None)

For more details, see torch.bernoulli.

Example:
    >>> a = torch.empty(3, 3).uniform_(0, 1)  # generate a uniform random matrix with range [0, 1]
    >>> a
    tensor([[ 0.1737,  0.0950,  0.3609],
            [ 0.7148,  0.0289,  0.2676],
            [ 0.9456,  0.8937,  0.7202]])
    >>> torch.bernoulli(a)
    tensor([[ 1.,  0.,  0.],
            [ 0.,  0.,  0.],
            [ 1.,  1.,  1.]])

    >>> a = torch.ones(3, 3) # probability of drawing "1" is 1
    >>> torch.bernoulli(a)
    tensor([[ 1.,  1.,  1.],
            [ 1.,  1.,  1.],
            [ 1.,  1.,  1.]])
    >>> a = torch.zeros(3, 3) # probability of drawing "1" is 0
    >>> torch.bernoulli(a)
    tensor([[ 0.,  0.,  0.],
            [ 0.,  0.,  0.],
            [ 0.,  0.,  0.]])

mindspore.Tensor.bernoulli

mindspore.Tensor.bernoulli(self, p=0.5, seed=-1)
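
Judging only by the two signatures, torch.bernoulli reads a separate probability from each element of `input`, while the MindSpore method takes one scalar `p` and a `seed`. The sketch below illustrates the scalar-`p` variant with the standard library alone. `bernoulli_like` is a hypothetical helper written for this note, not the code in the PR.

```python
import random


def bernoulli_like(shape, p=0.5, seed=None):
    """Return a rows x cols nested list of 0/1 draws; each entry is 1 with probability p.

    Hypothetical helper for illustration only, not the MindSpore implementation.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    rng = random.Random(seed)  # a private generator keeps the draw reproducible
    rows, cols = shape
    return [[1 if rng.random() < p else 0 for _ in range(cols)] for _ in range(rows)]


print(bernoulli_like((2, 3), p=1.0))  # every entry is 1
print(bernoulli_like((2, 3), p=0.0))  # every entry is 0
```

Because `rng.random()` lies in [0, 1), the two boundary cases `p=1.0` and `p=0.0` are deterministic, which mirrors the all-ones and all-zeros cases in the PyTorch example.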

(3) Differences from torch.multinomial

torch.multinomial

torch.multinomial(input, num_samples, replacement=False, *, generator=None, out=None)

For more details, see torch.multinomial.

Example:
    >>> weights = torch.tensor([0, 10, 3, 0], dtype=torch.float) # create a tensor of weights
    >>> torch.multinomial(weights, 2)
    tensor([1, 2])
    >>> torch.multinomial(weights, 4) # ERROR!
    RuntimeError: invalid argument 2: invalid multinomial distribution (with replacement=False,
    not enough non-negative category to sample) at ../aten/src/TH/generic/THTensorRandom.cpp:320
    >>> torch.multinomial(weights, 4, replacement=True)
    tensor([ 2,  1,  1,  1])

mindspore.Tensor.multinomial

mindspore.Tensor.multinomial(self, num_samples, seed=0, seed2=0)

Examples:
    >>> from mindspore import Tensor
    >>> import mindspore
    >>> x = Tensor([0., 9., 4., 0.], mindspore.float32)
    >>> output = x.multinomial(num_samples=2,seed=10)
    >>> print(output)
    [2 1]
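
Both examples draw category indices in proportion to a weight vector, and the PyTorch error above shows what happens when more distinct samples are requested than there are categories with positive weight. A minimal standard-library sketch of sampling without replacement follows. `multinomial_no_replacement` is a hypothetical name, and the code does not claim to reproduce either framework's random stream.

```python
import random


def multinomial_no_replacement(weights, num_samples, seed=None):
    """Draw distinct indices with probability proportional to their weights."""
    remaining = list(weights)
    if num_samples > sum(1 for w in remaining if w > 0):
        raise ValueError("not enough categories with non-zero weight")
    rng = random.Random(seed)
    picked = []
    for _ in range(num_samples):
        idx = rng.choices(range(len(remaining)), weights=remaining, k=1)[0]
        picked.append(idx)
        remaining[idx] = 0.0  # a drawn category cannot be drawn again
    return picked


print(multinomial_no_replacement([0.0, 9.0, 4.0, 0.0], 2, seed=10))
```

With the weights `[0, 9, 4, 0]` only indices 1 and 2 can ever appear, so any two-sample draw is some ordering of those two indices.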

(4) Differences from torch.poisson

torch.poisson

torch.poisson(input, generator=None)

For more details, see torch.poisson.

Example:
    >>> rates = torch.rand(4, 4) * 5  # rate parameter between 0 and 5
    >>> torch.poisson(rates)
    tensor([[9., 1., 3., 5.],
            [8., 6., 6., 0.],
            [0., 4., 5., 3.],
            [2., 1., 4., 2.]])

mindspore.Tensor.poisson

mindspore.Tensor.poisson(self, shape, seed=0, seed2=0)

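Examples:
    >>> import numpy as np
    >>> from mindspore import Tensor
    >>> from mindspore.common import dtype as mstype
    >>> shape = (4, 1)
    >>> mean = Tensor(np.array([5.0, 10.0]), mstype.float32)
    >>> output = mean.poisson(shape, seed=5)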
    >>> result = output.shape
    >>> print(result)
    (4, 2)
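
The example returns shape (4, 2) from `shape = (4, 1)` and a mean of shape (2,), which is what NumPy-style broadcasting of the two shapes gives. The sketch below spells that rule out, assuming this is how the output shape is derived. `broadcast_shape` is a hypothetical helper, not a MindSpore function.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Broadcast two shapes under NumPy-style rules, aligning dimensions from the right."""
    out = []
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != y and 1 not in (x, y):
            raise ValueError(f"shapes {a} and {b} cannot be broadcast")
        out.append(y if x == 1 else x)  # a dimension of size 1 stretches to match
    return tuple(reversed(out))


print(broadcast_shape((4, 1), (2,)))  # (4, 2)
```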

(5) Differences from torch.rand_like

torch.rand_like

torch.rand_like(input, *, dtype=None, layout=None, device=None, requires_grad=False, memory_format=torch.preserve_format)

For more details, see torch.rand_like.

mindspore.Tensor.rand_like

mindspore.Tensor.rand_like(self, seed=None)
Examples:
    >>> import mindspore
    >>> import numpy as np
    >>> from mindspore import Tensor
    >>> input_x = Tensor(np.array([[1, 2, 3, 9], [1, 2, 3, 9]]), mindspore.int8)
    >>> output = input_x.rand_like(seed = 0)
    >>> print(output)
    [[0.5488135  0.71518937 0.60276338 0.54488318]
     [0.4236548  0.64589411 0.43758721 0.891773  ]]
    >>> input_p = Tensor(np.array([1.0, 2.0, 3.0]), mindspore.float32)
    >>> output = input_p.rand_like(seed = 0)
    >>> print(output)
    [0.5488135  0.71518937 0.60276338]

(6) Differences from torch.randint_like

torch.randint_like

torch.randint_like(input, low=0, high, *, dtype=None, layout=torch.strided, device=None, requires_grad=False, memory_format=torch.preserve_format)

For more details, see torch.randint_like.

mindspore.Tensor.randint_like

mindspore.Tensor.randint_like(self, high, low=0, seed=None)
Examples:
    >>> import mindspore
    >>> import numpy as np
    >>> from mindspore import Tensor
    >>> input_x = Tensor(np.array([1., 2., 3., 4., 5.]), mindspore.float32)
    >>> output = input_x.randint_like(20, seed = 0)
    >>> print(output)
    [12 15  0  3  3]
    >>> output = input_x.randint_like(20, 100, seed = 0)
    >>> print(output)
    [64 67 84 87 87]

(7) Differences from torch.randn_like

torch.randn_like

torch.randn_like(input, *, dtype=None, layout=None, device=None, requires_grad=False, memory_format=torch.preserve_format)

For more details, see torch.randn_like.

mindspore.Tensor.randn_like

mindspore.Tensor.randn_like(self, seed=None)
Examples:
    >>> import mindspore
    >>> import numpy as np
    >>> from mindspore import Tensor
    >>> input_x = Tensor(np.array([1., 2., 3., 4., 5.]), mindspore.float32)
    >>> output = input_x.randn_like(seed = 0)
    >>> print(output)
    [1.7640524 0.4001572 0.978738  2.2408931 1.867558 ]
    >>> input_p = Tensor(np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]), mindspore.int32)
    >>> output = input_p.randn_like(seed = 0)
    >>> print(output)
    [[ 1.7640524   0.4001572   0.978738    2.2408931   1.867558  ]
     [-0.9772779   0.95008844 -0.1513572  -0.10321885  0.41059852]]
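
The seeded sample values printed for rand_like and randn_like coincide with what NumPy's legacy global generator returns after `np.random.seed(0)`. That is an observation about the numbers, not a statement about how the PR implements these functions. The snippet below reproduces the values with plain NumPy so they can be checked without a MindSpore build.

```python
import numpy as np

np.random.seed(0)
uniform = np.random.rand(2, 4)   # uniform samples in [0, 1), same shape as the rand_like input

np.random.seed(0)
normal = np.random.randn(2, 5)   # standard normal samples, same shape as the randn_like input

print(uniform)
print(normal)
```

In both cases only the shape is taken from the input tensor; the input's values and integer dtype play no part in the result.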

(8) Differences from torch.sparse_coo_tensor

torch.sparse_coo_tensor

torch.sparse_coo_tensor(indices, values, size=None, *, dtype=None, device=None, requires_grad=False)

For more details, see torch.sparse_coo_tensor.

Example:
    >>> i = torch.tensor([[0, 1, 1],
    ...                   [2, 0, 2]])
    >>> v = torch.tensor([3, 4, 5], dtype=torch.float32)
    >>> torch.sparse_coo_tensor(i, v, [2, 4])
    tensor(indices=tensor([[0, 1, 1],
                           [2, 0, 2]]),
           values=tensor([3., 4., 5.]),
           size=(2, 4), nnz=3, layout=torch.sparse_coo)
 
    >>> torch.sparse_coo_tensor(i, v)  # Shape inference
    tensor(indices=tensor([[0, 1, 1],
                           [2, 0, 2]]),
           values=tensor([3., 4., 5.]),
           size=(2, 3), nnz=3, layout=torch.sparse_coo)
 
    >>> torch.sparse_coo_tensor(i, v, [2, 4],
    ...                         dtype=torch.float64,
    ...                         device=torch.device('cuda:0'))
    tensor(indices=tensor([[0, 1, 1],
                           [2, 0, 2]]),
           values=tensor([3., 4., 5.]),
           device='cuda:0', size=(2, 4), nnz=3, dtype=torch.float64,
           layout=torch.sparse_coo)
 
    # Create an empty sparse tensor with the following invariants:
    #   1. sparse_dim + dense_dim = len(SparseTensor.shape)
    #   2. SparseTensor._indices().shape = (sparse_dim, nnz)
    #   3. SparseTensor._values().shape = (nnz, SparseTensor.shape[sparse_dim:])
    #
    # For instance, to create an empty sparse tensor with nnz = 0, dense_dim = 0 and
    # sparse_dim = 1 (hence indices is a 2D tensor of shape = (1, 0))
    >>> S = torch.sparse_coo_tensor(torch.empty([1, 0]), [], [1])
    tensor(indices=tensor([], size=(1, 0)),
           values=tensor([], size=(0,)),
           size=(1,), nnz=0, layout=torch.sparse_coo)
 
    # and to create an empty sparse tensor with nnz = 0, dense_dim = 1 and
    # sparse_dim = 1
    >>> S = torch.sparse_coo_tensor(torch.empty([1, 0]), torch.empty([0, 2]), [1, 2])
    tensor(indices=tensor([], size=(1, 0)),
           values=tensor([], size=(0, 2)),
           size=(1, 2), nnz=0, layout=torch.sparse_coo)

mindspore.COOTensor.sparse_coo_tensor

mindspore.COOTensor.sparse_coo_tensor(
    indices=None,
    values=None,
    shape=None,
    coo_tensor=None,
)
Examples:
    >>> import mindspore as ms
    >>> import mindspore.nn as nn
    >>> from mindspore import COOTensor, Tensor
    >>> indices = Tensor([[0, 1], [1, 2]], dtype=ms.int32)
    >>> values = Tensor([1, 2], dtype=ms.float32)
    >>> shape = (3, 4)
    >>> x = COOTensor.sparse_coo_tensor(indices, values, shape)
    >>> print(x.values)
    [1. 2.]
    >>> print(x.indices)
    [[0 1]
     [1 2]]
    >>> print(x.shape)
    (3, 4)
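
One difference that is easy to miss is the layout of `indices`. The PyTorch example passes them with shape (sparse_dim, nnz), one list per dimension, while the MindSpore example appears to pass one [row, col] pair per value, which matches my reading of the COOTensor documentation. The sketch below expands both layouts to a dense nested list. `coo_to_dense` and its `coords_first` flag are hypothetical and belong to neither library.

```python
def coo_to_dense(indices, values, shape, coords_first=False):
    """Expand a 2-D COO description into a dense nested list.

    coords_first=False: indices has shape (nnz, 2), one [row, col] pair per value.
    coords_first=True:  indices has shape (2, nnz), a row list and a column list.
    """
    if coords_first:
        indices = list(zip(*indices))  # transpose to one pair per value
    rows, cols = shape
    dense = [[0.0] * cols for _ in range(rows)]
    for (r, c), v in zip(indices, values):
        dense[r][c] += v  # repeated coordinates accumulate
    return dense


# Layout used in the MindSpore example above
print(coo_to_dense([[0, 1], [1, 2]], [1.0, 2.0], (3, 4)))
# Layout used in the PyTorch example above
print(coo_to_dense([[0, 1, 1], [2, 0, 2]], [3.0, 4.0, 5.0], (2, 4), coords_first=True))
```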

(9) Differences from torch.as_tensor

torch.as_tensor

torch.as_tensor(data, dtype=None, device=None)

For more details, see torch.as_tensor.

Example:
    >>> a = numpy.array([1, 2, 3])
    >>> t = torch.as_tensor(a)
    >>> t
    tensor([ 1,  2,  3])
    >>> t[0] = -1
    >>> a
    array([-1,  2,  3])
 
    >>> a = numpy.array([1, 2, 3])
    >>> t = torch.as_tensor(a, device=torch.device('cuda'))
    >>> t
    tensor([ 1,  2,  3])
    >>> t[0] = -1
    >>> a
    array([1,  2,  3])

mindspore.Tensor.as_tensor

mindspore.Tensor.as_tensor(data, dtype=None)
Examples:
    >>> import numpy as np
    >>> import mindspore as ms
    >>> import mindspore.nn as nn
    >>> from mindspore import Tensor
    >>> input_data = np.array([1, 2, 3])
    >>> ms_tensor = Tensor.as_tensor(input_data)
    >>> ms_tensor
    Tensor(shape=[3], dtype=Int64, value= [1, 2, 3])

(10) Differences from torch.as_strided

torch.as_strided

torch.as_strided(input, size, stride, storage_offset=0)

For more details, see torch.as_strided.

Example:
    >>> x = torch.randn(3, 3)
    >>> x
    tensor([[ 0.9039,  0.6291,  1.0795],
            [ 0.1586,  2.1939, -0.4900],
            [-0.1909, -0.7503,  1.9355]])
    >>> t = torch.as_strided(x, (2, 2), (1, 2))
    >>> t
    tensor([[0.9039, 1.0795],
            [0.6291, 0.1586]])
    >>> t = torch.as_strided(x, (2, 2), (1, 2), 1)
    tensor([[0.6291, 0.1586],
            [1.0795, 2.1939]])

mindspore.Tensor.as_strided

mindspore.Tensor.as_strided(
    self,
    shape=None,
    strides=None,
    subok=False,
    writeable=True,
)
Examples:
    >>> import numpy as np
    >>> from mindspore import Tensor
    >>> X = np.arange(9, dtype=np.int32).reshape(3,3)
    >>> output = Tensor(X).as_strided((2, 2), (1, 1))
    >>> print(output)
    [[0 1]
     [1 2]]
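
Both examples above are consistent with strides counted in elements rather than bytes: element (i, j) of the result is read from flat position `storage_offset + i * stride[0] + j * stride[1]`. A minimal pure-Python sketch of that indexing rule follows, assuming element strides. The function is a stand-in that copies values and does not create a real memory view.

```python
def as_strided(flat, size, stride, storage_offset=0):
    """Build a 2-D nested list from a flat sequence, with strides counted in elements."""
    rows, cols = size
    row_step, col_step = stride
    return [
        [flat[storage_offset + i * row_step + j * col_step] for j in range(cols)]
        for i in range(rows)
    ]


flat = list(range(9))                        # the 3x3 matrix 0..8 in row-major order
print(as_strided(flat, (2, 2), (1, 1)))      # [[0, 1], [1, 2]]
print(as_strided(flat, (2, 2), (1, 2)))      # [[0, 2], [1, 3]]
print(as_strided(flat, (2, 2), (1, 2), 1))   # [[1, 3], [2, 4]]
```

The first call reproduces the MindSpore example; the other two pick the same flat positions as the two PyTorch calls.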

(11) Differences from torch.frombuffer

torch.frombuffer

torch.frombuffer(buffer, *, dtype, count=-1, offset=0, requires_grad=False)

For more details, see torch.frombuffer.

Example:
    >>> import array
    >>> a = array.array('i', [1, 2, 3])
    >>> t = torch.frombuffer(a, dtype=torch.int32)
    >>> t
    tensor([ 1,  2,  3])
    >>> t[0] = -1
    >>> a
    array([-1,  2,  3])
 
    >>> # Interprets the signed char bytes as 32-bit integers.
    >>> # Each 4 signed char elements will be interpreted as
    >>> # 1 signed 32-bit integer.
    >>> import array
    >>> a = array.array('b', [-1, 0, 0, 0])
    >>> torch.frombuffer(a, dtype=torch.int32)
    tensor([255], dtype=torch.int32)

mindspore.Tensor.frombuffer

mindspore.Tensor.frombuffer(
    buffer,
    dtype=mindspore.float64,
    count=-1,
    offset=0,
)
Examples:
    >>> from array import array
    >>> import numpy as np
    >>> import mindspore
    >>> from mindspore import Tensor
    >>> input_array = array("d", [1, 2, 3, 4])
    >>> input_array
    array('d', [1.0, 2.0, 3.0, 4.0])
    >>> output = Tensor.frombuffer(input_array, mindspore.int32)
    >>> print(output)
    [1 2 3 4]
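
The two examples behave differently when the buffer's element type does not match the requested dtype. torch.frombuffer reinterprets the raw bytes (four signed chars become one int32), whereas the MindSpore example turns a buffer of doubles into `[1 2 3 4]`, which suggests the values are converted rather than reinterpreted. The standard-library sketch below contrasts the two readings; little-endian byte order is an assumption made explicit in the format strings.

```python
import struct
from array import array

# Reinterpretation: four signed chars read as one little-endian int32
raw = array("b", [-1, 0, 0, 0])
reinterpreted = struct.unpack("<i", raw.tobytes())[0]
print(reinterpreted)  # 255

# Value conversion: each double becomes the integer with the same value
doubles = array("d", [1, 2, 3, 4])
converted = [int(v) for v in doubles]
print(converted)  # [1, 2, 3, 4]

# Reinterpreting the eight bytes of the double 1.0 as two int32 values
# does not give 1 at all
halves = struct.unpack("<2i", struct.pack("<d", 1.0))
print(halves)  # (0, 1072693248)
```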

(12) Differences from torch.empty_strided

torch.empty_strided

torch.empty_strided(size, stride, *, dtype=None, layout=None, device=None, requires_grad=False, pin_memory=False)

For more details, see torch.empty_strided.

Example:
    >>> a = torch.empty_strided((2, 3), (1, 2))
    >>> a
    tensor([[8.9683e-44, 4.4842e-44, 5.1239e+07],
            [0.0000e+00, 0.0000e+00, 3.0705e-41]])
    >>> a.stride()
    (1, 2)
    >>> a.size()
    torch.Size([2, 3])

mindspore.Tensor.empty_strided

mindspore.Tensor.empty_strided(
    size,
    stride,
    dtype=mindspore.float64,
    seed=None,
)
Examples:
    >>> from mindspore import Tensor
    >>> size = (3, 3)
    >>> stride = (1, 3)
    >>> output = Tensor.empty_strided(size, stride, seed = 0)
    >>> print(output)
    [[5.48813504e+10 7.15189366e+10 6.02763376e+10]
     [5.44883183e+10 4.23654799e+10 6.45894113e+10]
     [4.37587211e+10 8.91773001e+10 0.00000000e+00]]

(13) Differences from torch.randperm

torch.randperm

torch.randperm(n, *, generator=None, out=None, dtype=torch.int64, layout=torch.strided, device=None, requires_grad=False, pin_memory=False)

For more details, see torch.randperm.

Example:
    >>> torch.randperm(4)
    tensor([2, 1, 0, 3])

mindspore.Tensor.randperm

mindspore.Tensor.randperm(self, max_length=1, pad=-1)
Examples:
    >>> # The result of every execution is different because this operator will generate n random samples.
    >>> from mindspore import Tensor
    >>> import mindspore
    >>> n = Tensor([20], dtype=mindspore.int32)
    >>> output = n.randperm(max_length=30, pad=-1)
    >>> print(output)
    [15 6 11 19 14 16 9 5 13 18 4 10 8 0 17 2 1 12 3 7
        -1 -1 -1 -1 -1 -1 -1 -1 -1 -1]
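
In the example above the MindSpore result always has `max_length` entries: a permutation of 0..n-1 followed by `pad` values, whereas torch.randperm(n) returns exactly n entries. A minimal standard-library sketch of that padded behaviour is shown below. `randperm_padded` and its `seed` argument are hypothetical and not part of either API.

```python
import random


def randperm_padded(n, max_length=1, pad=-1, seed=None):
    """Return a random permutation of 0..n-1, right-padded with `pad` up to max_length."""
    if n > max_length:
        raise ValueError("n must not exceed max_length")
    perm = list(range(n))
    random.Random(seed).shuffle(perm)  # shuffle in place with a private generator
    return perm + [pad] * (max_length - n)


out = randperm_padded(20, max_length=30, pad=-1)
print(out)
```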

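Step 6 Chinese documentation conventions

The low-level functions also need Chinese documentation, and writing it taught me MindSpore's documentation standard. In each pair below, the first line is the non-conforming text and the second is the corrected form.

  1. Parameter names in backticks must be surrounded by spaces
    创建现有张量的视图,具有指定的`shape`、`stead`和`subok`。
    创建现有张量的视图,具有指定的 `shape` 、 `stead` 和 `subok` 。
  2. Standard punctuation is required (each description ends with a full stop)
    - **shape** (tuple或ints) - 输出张量的形状
    - **shape** (tuple或ints) - 输出张量的形状。
  3. Parameters and exceptions need the list-and-bold markup
    buffer(object):公开缓冲区接口的Python对象。
    - **buffer** (object) - 公开缓冲区接口的Python对象。

    TypeError:如果`seed`和`seed2`都不是int
    - **TypeError** - 如果 `seed` 和 `seed2` 都不是int
  4. Always write "Tensor" instead of 张量
    创建现有张量的视图,具有指定的`shape`、`stead`和`subok`。
    创建现有Tensor的视图,具有指定的 `shape` 、 `stead` 和 `subok` 。
  5. Type descriptions must follow the standard form
    - **shape** (tuple或ints) - 输出张量的形状
    - **shape** (Union[tuple, ints]) - 输出Tensor的shape。

    - **size** (tuple:ints) - 输出张量的形状。
    - **size** (tuple[int]) - 输出Tensor的shape。
  6. Product names must use the standard spelling
    Tensor,数据类型在mindspore的数据类型中。
    Tensor,数据类型在MindSpore的数据类型中。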

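Step 7 Summary

(1) Modifying the functions of the Tensor class in mindspore/python/common/tensor.py is enough to obtain the corresponding functionality. Because these are Python functions, a change only requires re-importing the package, not recompiling. The functions could also live in a new tensor_func.py under the common directory, but for consistency and completeness they should be created in tensor.py.
(2) Matching ut and st test code is required. ut mainly checks that the code runs at all; st mainly checks that sample computations produce the expected results. In ut the result is compiled but not actually computed, whereas in st it is computed.
(3) MindSpore tests are built on the pytest framework. The pytest markers look as follows (they apply to st, not to ut):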

  • (1) Ascend910 single-card case:
    @pytest.mark.level0
    @pytest.mark.platform_arm_ascend_training
    @pytest.mark.platform_x86_ascend_training
    @pytest.mark.env_onecard
    def test_xxxx():
        xxx xxx
  • (2) GPU single-card case:
    @pytest.mark.level0
    @pytest.mark.platform_x86_gpu_training
    @pytest.mark.env_onecard
    def test_xxxx():
        xxx xxx
  • (3) CPU case:
    @pytest.mark.level0
    @pytest.mark.platform_x86_cpu
    @pytest.mark.platform_arm_cpu
    @pytest.mark.env_onecard
    def test_xxxxx():
        xxx xxx

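(4) Use the ops interfaces to develop the Tensor-creation interfaces.
① Taking poisson as an example, add the following statement to mindspore/python/mindspore/ops/functional.py to register the operator:

tensor_operator_registry.register('poisson', P.Poisson)

② Add the following function to the Tensor class in tensor.py: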

def poisson(self, shape, seed=0, seed2=0):
    r"""
    Returns a tensor in which each element is sampled from a Poisson distribution
    whose rate parameter is given by the corresponding element of `self`, i.e.
    :math:`\text{out}_i \sim \text{Poisson}(\text{self}_i)`.
    `self` is the μ parameter the distribution is constructed with. It defines the
    mean number of occurrences of the event, must be greater than 0, and must have
    dtype float32.

    Args:
        shape (tuple): The shape of the random tensor to be generated. Only constant values are allowed.
        seed (int, optional): The random seed (0 to 2**32). Default: 0.
        seed2 (int, optional): The second random seed (0 to 2**32). Default: 0.

    Returns:
        Tensor, whose shape is the broadcast of `shape` and the shape of `self`.

    Raises:
        TypeError: If neither `seed` nor `seed2` is an int.
        TypeError: If `shape` is not a tuple.
        TypeError: If `self` is not a Tensor of dtype float32.

    Supported Platforms:
        ``Ascend``

    Examples:
        >>> shape = (4, 1)
        >>> mean = Tensor(np.array([5.0, 10.0]), mstype.float32)
        >>> output = mean.poisson(shape, seed=5)
        >>> result = output.shape
        >>> print(result)
        (4, 2)
    """
    self._init_check()
    validator.check_non_negative_int(seed, 'seed')
    validator.check_non_negative_int(seed2, 'seed2')
    # Look up the registered operator, build it with its seeds, then apply it.
    return tensor_operator_registry.get('poisson')(seed, seed2)(shape, self)
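
The two steps above follow a name-based registry pattern: one module registers an operator class under a string key, and the Tensor method later looks it up, constructs it with its attributes, and calls it on the inputs. A plausible reason for the indirection is that tensor.py then does not need to import the ops package directly, though that is my assumption. The sketch below imitates the pattern with stand-in classes. `OperatorRegistry`, `FakePoisson`, and `DemoTensor` are all hypothetical and are not MindSpore code.

```python
class OperatorRegistry:
    """Maps a string name to an operator class (stand-in for tensor_operator_registry)."""

    def __init__(self):
        self._ops = {}

    def register(self, name, op_cls):
        self._ops[name] = op_cls

    def get(self, name):
        if name not in self._ops:
            raise ValueError(f"operator {name!r} is not registered")
        return self._ops[name]


class FakePoisson:
    """Stand-in operator: attributes are fixed at construction, inputs arrive at call time."""

    def __init__(self, seed=0, seed2=0):
        self.seed, self.seed2 = seed, seed2

    def __call__(self, shape, mean):
        return {"shape": shape, "mean": mean, "seeds": (self.seed, self.seed2)}


registry = OperatorRegistry()
registry.register("poisson", FakePoisson)  # done once, on the ops side


class DemoTensor:
    def __init__(self, data):
        self.data = data

    def poisson(self, shape, seed=0, seed2=0):
        # Same two-stage call as in the method above: build the op, then apply it.
        return registry.get("poisson")(seed, seed2)(shape, self.data)


result = DemoTensor([5.0, 10.0]).poisson((4, 1), seed=5)
print(result)
```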

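(5) When writing member functions of a class, pylint expects every method to take a self parameter. If a method has no use for self, either turn it into a static method decorated with @staticmethod or apply for a lint exemption. Methods decorated with @staticmethod must be placed before ordinary methods and after methods decorated with @property.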
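
A minimal sketch of that layout, using a hypothetical `DemoTensor` class rather than the real Tensor: the property comes first, the static factory second, and ordinary methods last.

```python
class DemoTensor:
    """Hypothetical class used only to show the method ordering."""

    def __init__(self, data):
        self._data = list(data)

    @property
    def size(self):
        return len(self._data)

    @staticmethod
    def as_tensor(data):
        """A factory that needs no instance, so it takes no self."""
        return DemoTensor(data)

    def tolist(self):
        return list(self._data)


t = DemoTensor.as_tensor([1, 2, 3])  # called on the class, no instance required
print(t.size, t.tolist())
```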

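(6) When an initialization function is designed, the relevant mindspore objects may not be fully initialized yet, so dtype can be obtained via from mindspore.common import dtype as mstype.

(7) On GPU, MindSpore supports StridedSlice on the mstype.int8 data type; on Ascend and CPU this is not yet supported.

(8) In st tests, set the execution mode to graph mode and set the device to the target device, one of ['CPU', 'GPU', 'Ascend', 'Davinci'].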

context.set_context(mode=context.GRAPH_MODE, device_target="Ascend")

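IV. Reflections and Outlook

I thank Wang Donghai and Shao Junsong for their patient, professional, and meticulous guidance; they reviewed the PR many times and gave thorough feedback. Without their help and support this project could not have been completed so well. I also thank the Institute of Software, Chinese Academy of Sciences, and the OSPP organizing committee for the chance to learn, build skills, and absorb the open-source spirit, and for introducing me to excellent mentors and experienced developers. Through OSPP, hundreds of developers worldwide learn from one another, share knowledge, and talk with project mentors about projects and open source, which keeps sparking new ideas.
I ran into many difficulties along the way, but I chose to face them rather than give up. The project involves a good deal of low-level code, whereas my previous experience with Python deep learning frameworks was mostly using libraries and functions to build, train, and test networks. Many problems therefore took a long time to work out: I was not fluent in how the functions and libraries call one another, and I knew little about UT, ST, pytest, pylint, or the Gitee code gate. Still, I was convinced that each difficulty overcome brings success closer.
During development, Wang set up a WeChat group with Shao and me. Each time I reached a milestone I reported progress in the group, Wang followed up promptly, and most problems were solved the same day. They told me that for a newcomer to open source, first absorbing the lessons that predecessors have accumulated and then reflecting on them along the way is wiser than simply working hard with one's head down. When an interesting problem came up, they encouraged me to put forward my own ideas and shared their own development experience, which deepened my understanding of the code gate, PRs, and UT and ST testing, and resolved the problems I hit with operator registration, member functions, decorators, and coupling between modules. That summer we had many lively discussions in the group, and my open-source skills grew through discussion and practice.
I gained a great deal from this project: I learned many techniques for working with and writing code for a deep learning framework, and I improved my communication skills and my ability to analyse and solve problems. More than that, I learned that success takes far more effort than one expects, along with patience, care, and a willingness to enjoy the challenge in a difficulty. As the old saying goes: "The road is long and hard, but keep walking and you will arrive; keep going without stopping and the future is promising."
Through the OSPP project I also learned that the idea of "open source" has its roots in the GNU/Linux community, with its emphasis on supporting and contributing to open online communities and free software. Developers provide the source code to users and allow others to modify and redistribute it, which keeps the program moving forward. The term carries three layers of meaning: open source code, openness, and free of charge. Wang told me that a competent open-source contributor first needs real technical strength: a solid foundation in programming languages, compilers, and operating systems. Beyond that, one needs a method for learning and communicating, an honest sense of the level one's own code reaches, a daily habit of reading and writing a certain amount, and plenty of practice on real examples to consolidate the basics. Once the technique comes naturally, one can join or start open-source projects one likes and take on harder challenges, remembering that a program is like a product: it has to be refined gradually to meet users' changing needs. For people who have just started writing open-source code, the most important thing is still mindset: sound logical thinking, a clear head, and a conscientious attitude. In short, what OSPP gave me was not only better coding ability but broader growth overall.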
