How to use softmax_with_cross_entropy in PaddlePaddle
This post is just a personal note.
Environment: paddle 1.8.5
For a long time I was not sure how to call this function; I only knew that the example programs used it a certain way, without understanding why.
Below I record how it is actually used.
The source code is as follows:
def softmax_with_cross_entropy(logits,
                               label,
                               soft_label=False,
                               ignore_index=kIgnoreIndex,
                               numeric_stable_mode=True,
                               return_softmax=False,
                               axis=-1):
    """
    This operator implements the cross entropy loss function with softmax. This function
    combines the calculation of the softmax operation and the cross entropy loss function
    to provide a more numerically stable gradient.

    Because this operator performs a softmax on logits internally, it expects
    unscaled logits. This operator should not be used with the output of the
    softmax operator, since that would produce incorrect results.

    When the attribute :attr:`soft_label` is set to :attr:`False`, this operator
    expects mutually exclusive hard labels: each sample in a batch is in exactly
    one class with a probability of 1.0, so each sample in the batch has a
    single label.

    The equation is as follows:

    1) Hard label (one-hot label, so every sample has exactly one class)

    .. math::

        loss_j = -\\text{logits}_{label_j} +
        \\log\\left(\\sum_{i=0}^{K}\\exp(\\text{logits}_i)\\right), j = 1,..., K

    2) Soft label (each sample can have a distribution over all classes)

    .. math::

        loss_j = -\\sum_{i=0}^{K}\\text{label}_i
        \\left(\\text{logits}_i - \\log\\left(\\sum_{i=0}^{K}
        \\exp(\\text{logits}_i)\\right)\\right), j = 1,...,K

    3) If :attr:`numeric_stable_mode` is :attr:`True`, softmax is calculated first by:

    .. math::

        max_j &= \\max_{i=0}^{K}{\\text{logits}_i}

        log\\_max\\_sum_j &= \\log\\sum_{i=0}^{K}\\exp(logits_i - max_j)

        softmax_j &= \\exp(logits_j - max_j - {log\\_max\\_sum}_j)

    and then the cross entropy loss is calculated from the softmax and the label.

    Args:
        logits (Variable): A multi-dimension ``Tensor`` with data type float32 or
            float64; the input tensor of unscaled log probabilities.
        label (Variable): The ground truth ``Tensor``. If :attr:`soft_label` is set
            to :attr:`True`, Label is a ``Tensor`` with the same shape and data type
            as :attr:`logits`. If :attr:`soft_label` is set to :attr:`False`, Label
            is an int64 ``Tensor`` of class indices with the same shape as
            :attr:`logits`, except that dimension :attr:`axis` has size 1.
        soft_label (bool, optional): A flag to indicate whether to interpret the given
            labels as soft labels. Default: False.
        ignore_index (int, optional): Specifies a target value that is ignored and
            does not contribute to the input gradient. Only valid if
            :attr:`soft_label` is set to :attr:`False`.
            Default: kIgnoreIndex (-100).
        numeric_stable_mode (bool, optional): A flag to indicate whether to use a more
            numerically stable algorithm. Only valid when :attr:`soft_label` is
            :attr:`False` and GPU is used. When :attr:`soft_label` is :attr:`True`
            or CPU is used, the algorithm is always numerically stable. Note that
            the speed may be slower when the stable algorithm is used.
            Default: True.
        return_softmax (bool, optional): A flag indicating whether to return the
            softmax along with the cross entropy loss. Default: False.
        axis (int, optional): The index of the dimension along which to perform the
            softmax calculation. It should be in range :math:`[-1, rank - 1]`, where
            :math:`rank` is the rank of the input :attr:`logits`. Default: -1.

    Returns:
        ``Variable`` or Tuple of two ``Variable`` : Returns the cross entropy loss
        if `return_softmax` is False, otherwise the tuple (loss, softmax), where
        softmax has the same shape as the input logits, and the cross entropy loss
        has the same shape as the input logits except that dimension :attr:`axis`
        has size 1.

    Examples:
        .. code-block:: python

            import paddle.fluid as fluid

            data = fluid.data(name='data', shape=[-1, 128], dtype='float32')
            label = fluid.data(name='label', shape=[-1, 1], dtype='int64')
            fc = fluid.layers.fc(input=data, size=100)
            out = fluid.layers.softmax_with_cross_entropy(
                logits=fc, label=label)
    """
    if in_dygraph_mode():
        softmax, loss = core.ops.softmax_with_cross_entropy(
            logits, label, 'soft_label', soft_label, 'ignore_index',
            ignore_index, 'numeric_stable_mode', numeric_stable_mode, 'axis',
            axis)
        if not return_softmax:
            return loss
        else:
            return loss, softmax

    attrs = {
        'soft_label': soft_label,
        'ignore_index': ignore_index,
        'numeric_stable_mode': numeric_stable_mode,
        'axis': axis
    }
    helper = LayerHelper('softmax_with_cross_entropy', **locals())
    softmax = helper.create_variable_for_type_inference(dtype=logits.dtype)
    loss = helper.create_variable_for_type_inference(dtype=logits.dtype)
    helper.append_op(
        type='softmax_with_cross_entropy',
        inputs={'Logits': logits,
                'Label': label},
        outputs={'Softmax': softmax,
                 'Loss': loss},
        attrs=attrs)

    if return_softmax:
        return loss, softmax

    return loss
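The numerically stable formula in step 3 of the docstring (subtract the row-wise max, then take a log-sum-exp) is easy to sketch in plain NumPy. This is a sketch of the math only, with my own helper name, not Paddle's actual kernel:

```python
import numpy as np

def stable_softmax_xent(logits, label_idx):
    """Numerically stable hard-label softmax cross entropy, following the
    docstring's formulas. Subtracting the row-wise max before exponentiating
    avoids overflow for large logits; the result is mathematically unchanged."""
    m = logits.max(axis=-1, keepdims=True)                                 # max_j
    log_sum = np.log(np.exp(logits - m).sum(axis=-1, keepdims=True))       # log_max_sum_j
    log_softmax = logits - m - log_sum                                     # log(softmax_j)
    # hard-label cross entropy: pick -log(softmax) at the label index
    return -np.take_along_axis(log_softmax, label_idx, axis=-1)

logits = np.array([[1.23, 2.33, 3.33, 2.11]], dtype=np.float32)
print(stable_softmax_xent(logits, np.array([[2]])))  # ≈ [[0.5797]]
```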
At first I assumed I could just feed in two arrays of the same shape and be done.
However, the following approach is wrong:
import numpy as np
import paddle.fluid as fluid

fluid.enable_dygraph()  # run the snippets below in dygraph (imperative) mode

logit_y = np.array([[1.23, 2.33, 3.33, 2.11], [1.23, 2.33, 3.33, 2.11], [1.23, 2.33, 3.33, 2.11], [1.23, 2.33, 3.33, 2.11]]).astype(np.float32)
output_y = np.array([[0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0]]).astype(np.float32)  # wrong: one-hot float labels
logit_y = fluid.dygraph.to_variable(logit_y)
output_y = fluid.dygraph.to_variable(output_y)
print(logit_y.shape, output_y.shape)
print(logit_y.numpy(), output_y.numpy())
loss = fluid.layers.softmax_with_cross_entropy(logit_y, output_y)
print(loss.numpy()[0])
Error message:
EnforceNotMet Traceback (most recent call last)
<ipython-input-46-4960b363e8fb> in <module>
7 print(logit_y.shape, output_y.shape)
8 print(logit_y.numpy()[0], output_y.numpy()[0])
----> 9 loss = fluid.layers.softmax_with_cross_entropy(logit_y, output_y)
10 print(loss.numpy()[0])
d:\programdata\anaconda3\envs\parl\lib\site-packages\paddle\fluid\layers\loss.py in softmax_with_cross_entropy(logits, label, soft_label, ignore_index, numeric_stable_mode, return_softmax, axis)
1254 logits, label, 'soft_label', soft_label, 'ignore_index',
1255 ignore_index, 'numeric_stable_mode', numeric_stable_mode, 'axis',
-> 1256 axis)
1257 if not return_softmax:
1258 return loss
EnforceNotMet:
--------------------------------------------
C++ Call Stacks (More useful to developers):
--------------------------------------------
Windows not support stack backtrace yet.
----------------------
Error Message Summary:
----------------------
InvalidArgumentError: If Attr(soft_label) == false, the axis dimension of Input(Label) should be 1.
[Hint: Expected labels_dims[axis] == 1UL, but received labels_dims[axis]:4 != 1UL:1.] at (D:\1.8.5\paddle\paddle\fluid\operators\softmax_with_cross_entropy_op.cc:174)
[operator < softmax_with_cross_entropy > error]
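Incidentally, the one-hot float labels in the failing snippet are exactly what the operator expects when soft_label=True, where the loss becomes -sum_i label_i * log(softmax(logits)_i) per the docstring's soft-label equation. A NumPy sketch of that formula (my own helper, not a Paddle API):

```python
import numpy as np

def soft_label_xent(logits, soft_labels):
    """Soft-label cross entropy, following the docstring's soft-label equation:
    loss = -sum_i label_i * log(softmax(logits)_i)."""
    m = logits.max(axis=-1, keepdims=True)
    log_softmax = logits - m - np.log(np.exp(logits - m).sum(axis=-1, keepdims=True))
    return -(soft_labels * log_softmax).sum(axis=-1, keepdims=True)

logits = np.array([[1.23, 2.33, 3.33, 2.11]], dtype=np.float32)
one_hot = np.array([[0, 0, 1, 0]], dtype=np.float32)
# With a one-hot "soft" label this reduces to the hard-label loss -log(softmax_2)
print(soft_label_xent(logits, one_hot))  # ≈ [[0.5797]]
```

So the failing call above should also be fixable by keeping the one-hot float labels and passing `soft_label=True`, i.e. `fluid.layers.softmax_with_cross_entropy(logit_y, output_y, soft_label=True)`; I have not rerun it here, but this matches the soft_label documentation above.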
In Paddle, the first argument of softmax_with_cross_entropy takes the unscaled per-class scores (logits), not probabilities, and the second argument takes the class index of each sample, with size 1 along the softmax axis.
The following code gives the correct result:
logit_y = np.array([[1.23, 2.33, 3.33, 2.11], [1.23, 2.33, 3.33, 2.11], [1.23, 2.33, 3.33, 2.11], [1.23, 2.33, 3.33, 2.11]]).astype(np.float32)
output_y = np.array([[0], [0], [0], [0]]).astype(np.int64)  # ok: class indices, shape [4, 1]
logit_y = fluid.dygraph.to_variable(logit_y)
output_y = fluid.dygraph.to_variable(output_y)
print(logit_y.shape, output_y.shape)
print(logit_y.numpy(), output_y.numpy())
loss = fluid.layers.softmax_with_cross_entropy(logit_y, output_y)
print(loss.numpy()[0])
# logit_y
# [[1.23 2.33 3.33 2.11]
# [1.23 2.33 3.33 2.11]
# [1.23 2.33 3.33 2.11]
# [1.23 2.33 3.33 2.11]]
# result
# [[2.6797354]
# [2.6797354]
# [2.6797354]
# [2.6797354]]
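The 2.6797354 above is easy to verify by hand: with hard label 0, the loss is simply -log(softmax(logits)[0]). A quick NumPy check:

```python
import numpy as np

# Verify the loss by hand: hard label 0 means loss = -log(softmax(logits)[0])
logits = np.array([1.23, 2.33, 3.33, 2.11], dtype=np.float64)
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(probs)              # ≈ [0.0686 0.2060 0.5600 0.1653]
print(-np.log(probs[0]))  # ≈ 2.6797354, matching the loss above
```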
Alternatively, you can feed in the softmax of the logits. This runs without error, but note that the docstring explicitly warns against it: the operator applies softmax internally, so the loss below is computed from softmax applied twice and differs from the value above.
logit_y = np.array([[1.23, 2.33, 3.33, 2.11], [1.23, 2.33, 3.33, 2.11], [1.23, 2.33, 3.33, 2.11], [1.23, 2.33, 3.33, 2.11]]).astype(np.float32)
output_y = np.array([[0], [0], [0], [0]]).astype(np.int64) # ok
logit_y = fluid.dygraph.to_variable(logit_y)
output_y = fluid.dygraph.to_variable(output_y)
logit_y = fluid.layers.softmax(logit_y)
print(logit_y.shape, output_y.shape)
print(logit_y.numpy(), output_y.numpy()[0])
loss = fluid.layers.softmax_with_cross_entropy(logit_y, output_y)
print(loss.numpy())
# logit_y
# [[0.0685813 0.2060296 0.5600465 0.16534261]
# [0.0685813 0.2060296 0.5600465 0.16534261]
# [0.0685813 0.2060296 0.5600465 0.16534261]
# [0.0685813 0.2060296 0.5600465 0.16534261]]
# result
# [[1.5858928]
# [1.5858928]
# [1.5858928]
# [1.5858928]]
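Why does the second variant print 1.5858928 instead of 2.6797354? Because the operator applies softmax internally, feeding it the output of fluid.layers.softmax computes softmax twice, which is exactly the misuse the docstring warns about. A NumPy check confirms this explains the number:

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over the last axis of a 1-D vector
    e = np.exp(x - x.max())
    return e / e.sum()

logits = np.array([1.23, 2.33, 3.33, 2.11], dtype=np.float64)
once = softmax(logits)    # what fluid.layers.softmax returned
twice = softmax(once)     # what the op then computes internally
print(-np.log(twice[0]))  # ≈ 1.5858928, matching the output above
```

So only the first variant, passing raw logits, is the correct usage.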