PyTorch Functions Explained: torch.softmax / torch.nn.functional.softmax

von Neumann

Last modified: 2023-09-25 19:50:48

First published: 2023-07-23 21:13:33

Article link: https://blog.csdn.net/hy592070616/article/details/131884154

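Softmax is a function commonly applied to neural-network activations in PyTorch. Along a specified dimension, it rescales the elements of each slice of a tensor into the range [0, 1] so that every slice sums to 1; in classification problems it is typically applied at the output layer. Both the function torch.nn.functional.softmax and the module torch.nn.Softmax provide this operation, and the functional form additionally accepts a dtype argument that can help avoid overflow. Softmax output should not be passed directly to NLLLoss; use log_softmax for that instead.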

Applies the Softmax function to all slices along dim, rescaling them so that the elements lie in the range $[0, 1]$ and sum to 1.
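To make the definition concrete, here is a minimal sketch that computes softmax by hand and compares it with the built-in. The helper name `manual_softmax` and the input values are illustrative assumptions, not part of PyTorch; subtracting the per-slice maximum is a common stabilization trick and is not claimed to mirror PyTorch's internal kernel.

```python
import torch


def manual_softmax(x: torch.Tensor, dim: int) -> torch.Tensor:
    # Hypothetical helper for illustration only.
    # Subtracting the per-slice maximum does not change the result
    # mathematically, but keeps exp() from overflowing on large inputs.
    shifted = x - x.max(dim=dim, keepdim=True).values
    exp = shifted.exp()
    return exp / exp.sum(dim=dim, keepdim=True)


x = torch.tensor([[1.0, 2.0, 3.0],
                  [1.0, 1.0, 1.0]])

manual = manual_softmax(x, dim=1)
builtin = torch.softmax(x, dim=1)

print(torch.allclose(manual, builtin))  # True
print(manual.sum(dim=1))                # each row sums to (approximately) 1
```

The second row of `x` has equal entries, so its softmax is uniform: every element is 1/3.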

Syntax

torch.softmax(input, dim, *, dtype=None) -> Tensor
torch.nn.functional.softmax(input, dim=None, _stacklevel=3, dtype=None) -> Tensor

Parameters

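input: [Tensor] The input tensor.
dim: [int] The dimension along which Softmax is computed, so that every slice along dim sums to 1.
dtype: [optional, torch.dtype] The desired data type of the returned tensor. If specified, the input tensor is cast to dtype before the operation is performed, which is useful for preventing data type overflows. Defaults to None.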
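The effect of the dtype parameter can be sketched as follows. The half-precision input and its values are assumptions chosen for illustration; the point is only that the cast happens before the computation and that the result carries the requested dtype.

```python
import torch

# Hypothetical half-precision input (for example, activations from a
# mixed-precision model).
x_half = torch.tensor([[10.0, 0.0, -10.0]], dtype=torch.float16)

# Ask for a float32 result; the input is cast before softmax is computed.
out = torch.softmax(x_half, dim=-1, dtype=torch.float32)

print(x_half.dtype)  # torch.float16 (the input itself is left unchanged)
print(out.dtype)     # torch.float32

# Equivalent to casting manually and then applying softmax.
reference = torch.softmax(x_half.to(torch.float32), dim=-1)
print(torch.allclose(out, reference))  # True
```

Passing dtype saves the explicit `.to(...)` call and keeps the intent (compute in higher precision) visible at the call site.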

Return value

A Tensor with the same shape as input, with values in the range [0, 1].

Example

>>> import torch
>>> x = torch.randn(4, 5)  # random input, so the values below differ between runs
>>> torch.nn.functional.softmax(x, dim=0)
tensor([[0.4259, 0.5448, 0.1935, 0.3904, 0.1963],
        [0.1370, 0.1053, 0.1966, 0.2625, 0.4343],
        [0.0540, 0.2823, 0.5101, 0.2082, 0.0905],
        [0.3832, 0.0676, 0.0998, 0.1390, 0.2789]])

>>> torch.nn.functional.softmax(x, dim=1)
tensor([[0.0728, 0.1720, 0.1233, 0.5541, 0.0779],
        [0.1139, 0.0667, 0.2677, 0.4361, 0.1156],
        [0.2576, 0.0811, 0.0671, 0.2975, 0.2968],
        [0.1874, 0.0358, 0.2240, 0.4470, 0.1057]])
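Because the input above is random, the printed numbers are not reproducible, but the normalization property is. The following sketch checks it directly; the seed is an arbitrary assumption used only to make the run repeatable.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # arbitrary seed, only for repeatability
x = torch.randn(4, 5)

col = F.softmax(x, dim=0)  # normalizes down each column
row = F.softmax(x, dim=1)  # normalizes across each row

# With dim=0, the 5 column sums are all 1; with dim=1, the 4 row sums are all 1.
print(col.sum(dim=0))
print(row.sum(dim=1))

# Summing along the *other* dimension generally does not give 1.
print(col.sum(dim=1))
```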

Implementation

def softmax(input: Tensor, dim: Optional[int] = None, _stacklevel: int = 3, dtype: Optional[DType] = None) -> Tensor:
    r"""Applies a softmax function.

    Softmax is defined as:

    :math:`\text{Softmax}(x_{i}) = \frac{\exp(x_i)}{\sum_j \exp(x_j)}`

    It is applied to all slices along dim, and will re-scale them so that the elements
    lie in the range `[0, 1]` and sum to 1.

    See :class:`~torch.nn.Softmax` for more details.

    Args:
        input (Tensor): input
        dim (int): A dimension along which softmax will be computed.
        dtype (:class:`torch.dtype`, optional): the desired data type of returned tensor.
          If specified, the input tensor is casted to :attr:`dtype` before the operation
          is performed. This is useful for preventing data type overflows. Default: None.

    .. note::
        This function doesn't work directly with NLLLoss,
        which expects the Log to be computed between the Softmax and itself.
        Use log_softmax instead (it's faster and has better numerical properties).

    """
    if has_torch_function_unary(input):
        return handle_torch_function(softmax, (input,), input, dim=dim, _stacklevel=_stacklevel, dtype=dtype)
    if dim is None:
        dim = _get_softmax_dim("softmax", input.dim(), _stacklevel)
    if dtype is None:
        ret = input.softmax(dim)
    else:
        ret = input.softmax(dim, dtype=dtype)
    return ret
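The note in the docstring about NLLLoss can be illustrated with a short sketch. The logits and targets here are made-up values for illustration. `F.nll_loss` expects log-probabilities, so feeding it raw softmax output gives a different (and meaningless) loss, while `log_softmax` followed by `nll_loss` matches `cross_entropy`.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)              # arbitrary seed, only for repeatability
logits = torch.randn(3, 4)        # hypothetical batch: 3 samples, 4 classes
target = torch.tensor([0, 2, 3])  # hypothetical class indices

# Recommended: log_softmax feeds log-probabilities to nll_loss.
loss_recommended = F.nll_loss(F.log_softmax(logits, dim=1), target)

# Same value in exact arithmetic, but log(softmax(x)) is slower and
# less numerically robust than log_softmax(x).
loss_manual_log = F.nll_loss(torch.log(F.softmax(logits, dim=1)), target)

# cross_entropy combines log_softmax and nll_loss in one call.
loss_ce = F.cross_entropy(logits, target)

# Incorrect: probabilities instead of log-probabilities. The result is the
# negated mean of the target-class probabilities, which is always negative.
loss_wrong = F.nll_loss(F.softmax(logits, dim=1), target)

print(loss_recommended, loss_manual_log, loss_ce)
print(loss_wrong)
```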