log_softmax: how its values relate to probabilities. Does a larger/smaller value mean a larger/smaller probability?


江南蜡笔小新


Category: pytorch. Tags: deep learning, machine learning, NLP, classification algorithms, softmax

Article link: https://blog.csdn.net/ftimes/article/details/119648876


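This article reads the log_softmax source code to see how its output values relate to probabilities. In PyTorch, log_softmax is commonly reached through two interfaces: the module torch.nn.LogSoftmax() and the function F.log_softmax() (where F is torch.nn.functional).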

TL;DR: jump straight to 3. Summary

1. F.log_softmax() source code

```python
def log_softmax(input, dim=None, _stacklevel=3, dtype=None):
    # type: (Tensor, Optional[int], int, Optional[int]) -> Tensor
    if dim is None:
        dim = _get_softmax_dim('log_softmax', input.dim(), _stacklevel)
    if dtype is None:
        ret = input.log_softmax(dim)
    else:
        ret = input.log_softmax(dim, dtype=dtype)
    return ret
```
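The docstring of this function reads:

Applies a softmax followed by a logarithm.

While mathematically equivalent to log(softmax(x)), doing these two operations separately is slower and numerically unstable. This function computes the output and gradient correctly.

See :class:`~torch.nn.LogSoftmax` for more details.

Args:

input (Tensor): input

dim (int): the dimension along which log_softmax will be computed.

dtype (:class:`torch.dtype`, optional): the desired data type of the returned tensor. This is useful for preventing data type overflows.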
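To make the "numerically unstable" remark concrete, here is a minimal pure-Python sketch. It is not PyTorch's actual kernel: the helper names `naive_log_softmax` and `stable_log_softmax` are made up for illustration, and the stable version uses the standard log-sum-exp shift (subtract the maximum before exponentiating), which is one common way to get a stable result.

```python
import math

def naive_log_softmax(xs):
    # Literal log(softmax(x)): exp() overflows once the inputs get large.
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [math.log(e / total) for e in exps]

def stable_log_softmax(xs):
    # log_softmax(x_i) = (x_i - m) - log(sum_j exp(x_j - m)), with m = max(x).
    # After the shift, every exponent is <= 0, so exp() cannot overflow.
    m = max(xs)
    log_sum = math.log(sum(math.exp(x - m) for x in xs))
    return [x - m - log_sum for x in xs]

logits = [1000.0, 1001.0, 1002.0]  # hypothetical large logits

try:
    naive_log_softmax(logits)
    naive_failed = False
except OverflowError:
    # math.exp(1000.0) is outside the range of a Python float.
    naive_failed = True

stable = stable_log_softmax(logits)
print("naive version overflowed:", naive_failed)
print("stable log-probabilities:", stable)
```

For small inputs the two helpers agree; they only differ once exp() would leave the floating-point range.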

2. torch.nn.LogSoftmax()

```python
class LogSoftmax(Module):
    __constants__ = ['dim']
    def __init__(self, dim=None):
        super(LogSoftmax, self).__init__()
        self.dim = dim

    def __setstate__(self, state):
        self.__dict__.update(state)
        if not hasattr(self, 'dim'):
            self.dim = None

    def forward(self, input):
        return F.log_softmax(input, self.dim, _stacklevel=5)
```
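Since forward() simply calls F.log_softmax, the module and the function should return the same values. A small sketch to check this, assuming a made-up random input of 2 samples and 5 classes:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(2, 5)  # hypothetical batch: 2 samples, 5 classes

module_out = nn.LogSoftmax(dim=1)(x)      # module interface
functional_out = F.log_softmax(x, dim=1)  # functional interface
method_out = x.log_softmax(dim=1)         # tensor method used inside F.log_softmax

print(torch.allclose(module_out, functional_out))
print(torch.allclose(functional_out, method_out))
```

The module form is convenient inside nn.Sequential; the functional form is convenient inside a custom forward().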

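See the official documentation:
[Image: excerpt from the official documentation] (If you repost this article, please keep the link: https://blog.csdn.net/ftimes/article/details/119648876)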

3. Summary

From the source code and the official documentation we can conclude that log_softmax is a more robust way of computing the log-probability log(softmax(x)).

The torch.nn.Softmax(dim=None) section of the official documentation also notes:
This module doesn’t work directly with NLLLoss, which expects the Log to be computed between the Softmax and itself. Use LogSoftmax instead (it’s faster and has better numerical properties).
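The pairing of LogSoftmax with NLLLoss can be sketched as follows. The logits and targets are made-up values; the check is that NLLLoss on log_softmax output matches CrossEntropyLoss on the raw logits, and that both equal the mean negative log-probability of the target class.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)             # hypothetical: 4 samples, 3 classes
targets = torch.tensor([0, 2, 1, 2])   # hypothetical class labels

log_probs = F.log_softmax(logits, dim=1)

# NLLLoss expects log-probabilities as input.
loss_nll = nn.NLLLoss()(log_probs, targets)

# CrossEntropyLoss takes raw logits and applies log_softmax internally.
loss_ce = nn.CrossEntropyLoss()(logits, targets)

# Manual version: mean of -log p(target) over the batch.
loss_manual = -log_probs[torch.arange(4), targets].mean()

print(loss_nll, loss_ce, loss_manual)
```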

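It is worth stressing that log in PyTorch denotes, by convention, the natural logarithm (base e), so applying torch.exp() to the output of log_softmax recovers the softmax() probabilities.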

current_log_prob[0,:]
Out[11]: 
tensor([ -8.7414,  -8.5590, -10.5708,  -3.1370,  -4.8619,  -4.5376,  -3.7627,
         -3.6359,  -5.8785,  -3.8922,  -4.7725,  -6.0678,  -3.1159,  -4.4447,
         -2.2655,  -1.6365,  -4.1371,  -5.2297,  -5.4012,  -6.4455,  -4.3431,
         -1.0044,  -3.7157,  -3.3281,  -5.4026,  -3.5647], device='cuda:0')

torch.exp(current_log_prob[0,:])
Out[13]: 
tensor([1.5982e-04, 1.9181e-04, 2.5654e-05, 4.3415e-02, 7.7354e-03, 1.0699e-02,
        2.3220e-02, 2.6360e-02, 2.7991e-03, 2.0400e-02, 8.4595e-03, 2.3162e-03,
        4.4339e-02, 1.1741e-02, 1.0378e-01, 1.9466e-01, 1.5969e-02, 5.3552e-03,
        4.5112e-03, 1.5876e-03, 1.2996e-02, 3.6627e-01, 2.4338e-02, 3.5860e-02,
        4.5050e-03, 2.8307e-02], device='cuda:0')
        
torch.exp(current_log_prob[0,:]).sum()
Out[14]: tensor(1.0000, device='cuda:0')

[Figure: graph of ln x for x between 0 and 1]

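Therefore: the larger the log_softmax value (its range is (-inf, 0], so "larger" means closer to 0), the closer the corresponding softmax value is to 1, i.e. the higher the probability in the usual sense. Because ln x is strictly increasing, the classes are ranked in the same order by log_softmax as by softmax.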

We can also verify this result in code:

current_log_prob[0,:].argmax()
Out[15]: tensor(21, device='cuda:0')

current_log_prob[0,:][21]
Out[16]: tensor(-1.0044, device='cuda:0')

torch.exp(current_log_prob[0,:]).argmax()
Out[17]: tensor(21, device='cuda:0')

torch.exp(current_log_prob[0,:])[21]
Out[18]: tensor(0.3663, device='cuda:0')

torch.exp(current_log_prob[0,:]).max()
Out[19]: tensor(0.3663, device='cuda:0')
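The transcript above depends on a current_log_prob tensor from the author's session. For a reproducible version, here is a self-contained sketch on CPU; the logits are random stand-ins (26 classes, matching the shape shown above), so the concrete numbers will differ from the transcript.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(1, 26)  # hypothetical logits: 1 sample, 26 classes

current_log_prob = F.log_softmax(logits, dim=1)
probs = torch.exp(current_log_prob[0, :])  # back to softmax probabilities

# exp(log_softmax) sums to 1 and matches softmax directly.
print(probs.sum())
print(torch.allclose(probs, F.softmax(logits, dim=1)[0]))

# Every log-probability is <= 0, and the most likely class is the same
# whether we look at log-probabilities or probabilities.
print(bool((current_log_prob <= 0).all()))
print(current_log_prob[0, :].argmax().item() == probs.argmax().item())
```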
