Usage of torch.nn.Softmax()

Lins H

Last modified 2022-09-25 09:22:04

First published 2022-09-23 10:45:50

Article link: https://blog.csdn.net/Hu_Linson/article/details/127005653

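Most explanations of softmax found online stop at three-dimensional tensors.
This post looks at a 4-D tensor with dimensions B, C, H, W and shows what Softmax normalizes over for each choice of dim.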
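Before going through each dimension, it may help to see what "normalizing along dim" means in code. The sketch below is only an illustration: manual_softmax is a hypothetical helper name, not part of PyTorch. It exponentiates the input and divides by the sum taken along the chosen dimension, then compares the result with torch.nn.Softmax for every dim of a (B, C, H, W) tensor.

```python
import torch


def manual_softmax(x: torch.Tensor, dim: int) -> torch.Tensor:
    # Hypothetical helper for illustration only.
    # Subtracting the max along `dim` keeps exp() numerically stable
    # and does not change the softmax result.
    shifted = x - x.max(dim=dim, keepdim=True).values
    exp = torch.exp(shifted)
    # Divide by the sum along `dim`, so values along `dim` add up to 1.
    return exp / exp.sum(dim=dim, keepdim=True)


x = torch.arange(12, dtype=torch.float32).reshape(1, 3, 2, 2)
for dim in range(x.dim()):
    reference = torch.nn.Softmax(dim=dim)(x)
    assert torch.allclose(manual_softmax(x, dim), reference)
    # Summing along the normalized dimension gives 1 at every position.
    sums = reference.sum(dim=dim)
    assert torch.allclose(sums, torch.ones_like(sums))
```

The output always has the same shape as the input; only the direction along which the values sum to 1 changes.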

Softmax(dim=0)
Normalizes along dimension 0: for each fixed (C, H, W) position, the values across the batch sum to 1.

import torch
x = torch.arange(12, dtype=torch.float32).reshape(1, 3, 2, 2)
print(x)
y = torch.nn.Softmax(dim=0)(x)
print(y)

tensor([[[[ 0.,  1.],
          [ 2.,  3.]],
         [[ 4.,  5.],
          [ 6.,  7.]],
         [[ 8.,  9.],
          [10., 11.]]]])
tensor([[[[1., 1.],
          [1., 1.]],
         [[1., 1.],
          [1., 1.]],
         [[1., 1.],
          [1., 1.]]]])
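In this example the batch size is 1, so each softmax is taken over a single value: e^x / e^x = 1. That is why every element of the output is 1.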
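The all-ones result does not depend on the particular values in x. As a quick check, under the assumption that any input with size 1 along the normalized dimension behaves the same way, the sketch below applies Softmax(dim=0) to random data with batch size 1.

```python
import torch

torch.manual_seed(0)  # fixed seed so the run is repeatable

# Random values, batch size 1: each softmax along dim 0 sees one element.
x = torch.randn(1, 3, 2, 2)
y = torch.nn.Softmax(dim=0)(x)

# exp(v) / exp(v) == 1 for every element, whatever v is.
assert torch.allclose(y, torch.ones_like(x))
print(y.shape)  # torch.Size([1, 3, 2, 2])
```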

import torch
x = torch.arange(12, dtype=torch.float32).reshape(2, 3, 1, 2)
print(x)
y = torch.nn.Softmax(dim=0)(x)
print(y)

tensor([[[[ 0.,  1.]],
         [[ 2.,  3.]],
         [[ 4.,  5.]]],
         
        [[[ 6.,  7.]],
         [[ 8.,  9.]],
         [[10., 11.]]]])
tensor([[[[0.0025, 0.0025]],
         [[0.0025, 0.0025]],
         [[0.0025, 0.0025]]],

        [[[0.9975, 0.9975]],
         [[0.9975, 0.9975]],
         [[0.9975, 0.9975]]]])
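With batch size 2, elements at the same position in the two batch entries are normalized so that they sum to 1. For example, 0 and 6 become 1/(1+e^6) ≈ 0.0025 and e^6/(1+e^6) ≈ 0.9975.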
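The two numbers in the output can be reproduced by hand. The sketch below uses only the standard library and also shows why every position gets the same pair of values: each pair across the batch, (0, 6), (1, 7), ..., (5, 11), differs by exactly 6, and softmax depends only on the differences between its inputs.

```python
import math


def softmax_pair(a: float, b: float) -> tuple:
    # Softmax over two values, written out explicitly.
    denom = math.exp(a) + math.exp(b)
    return math.exp(a) / denom, math.exp(b) / denom


p0, p6 = softmax_pair(0.0, 6.0)
print(round(p0, 4), round(p6, 4))  # 0.0025 0.9975

# Shifting both inputs by the same amount leaves the result unchanged,
# so the pair (5, 11) gives the same probabilities as (0, 6).
p5, p11 = softmax_pair(5.0, 11.0)
assert math.isclose(p0, p5) and math.isclose(p6, p11)
```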

Softmax(dim=1)
Normalizes along dimension 1 (the second dimension), i.e. across channels.

import torch
x = torch.arange(12, dtype=torch.float32).reshape(1, 3, 2, 2)
print(x)
y = torch.nn.Softmax(dim=1)(x)
print(y)

tensor([[[[ 0.,  1.],
          [ 2.,  3.]],
         [[ 4.,  5.],
          [ 6.,  7.]],
         [[ 8.,  9.],
          [10., 11.]]]])
tensor([[[[3.2932e-04, 3.2932e-04],
          [3.2932e-04, 3.2932e-04]],
         [[1.7980e-02, 1.7980e-02],
          [1.7980e-02, 1.7980e-02]],
         [[9.8169e-01, 9.8169e-01],
          [9.8169e-01, 9.8169e-01]]]])
There are 3 channels, and the values at the same spatial position are normalized across the channels. For example, for 0, 4, 8: 1/(1+e^4+e^8) = 3.2932e-04
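The three values per position can be checked with plain arithmetic. The sketch below uses only the standard library, computes the softmax of (0, 4, 8), and compares it with the printed tensor. The triple (3, 7, 11) from the bottom-right position gives the same result, because its values are spaced the same way.

```python
import math


def softmax_list(values):
    # Softmax over a short list of floats, written out explicitly.
    denom = sum(math.exp(v) for v in values)
    return [math.exp(v) / denom for v in values]


probs = softmax_list([0.0, 4.0, 8.0])
print(["%.4e" % p for p in probs])

# Compare against the values printed by torch in the post.
printed = [3.2932e-04, 1.7980e-02, 9.8169e-01]
assert all(abs(p - q) < 1e-5 for p, q in zip(probs, printed))

# Same spacing between inputs, same probabilities.
other = softmax_list([3.0, 7.0, 11.0])
assert all(math.isclose(p, q) for p, q in zip(probs, other))
```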

Softmax(dim = 2 / 3 / -1)

import torch
x = torch.arange(12, dtype=torch.float32).reshape(1, 3, 2, 2)
print(x)
y = torch.nn.Softmax(dim=2)(x)
print(y)

tensor([[[[ 0.,  1.],
          [ 2.,  3.]],
         [[ 4.,  5.],
          [ 6.,  7.]],
         [[ 8.,  9.],
          [10., 11.]]]])
tensor([[[[0.1192, 0.1192],
          [0.8808, 0.8808]],
         [[0.1192, 0.1192],
          [0.8808, 0.8808]],
         [[0.1192, 0.1192],
          [0.8808, 0.8808]]]])
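dim=2 corresponds to H: within each batch entry and each channel, values are normalized across rows (down each column), e.g. the pair 0, 2 or the pair 4, 6.
Likewise, dim=3 corresponds to W: within each batch entry and each channel, values are normalized across columns (along each row), e.g. the pair 0, 1 or the pair 4, 5.
dim=-1 refers to the last dimension, which for a 4-D tensor is dim=3.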
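The post prints the output only for dim=2. As a sketch of the remaining cases, the code below runs the same input through dim=3 and dim=-1, checks that the two agree, and checks that each row sums to 1.

```python
import torch

x = torch.arange(12, dtype=torch.float32).reshape(1, 3, 2, 2)

y_w = torch.nn.Softmax(dim=3)(x)      # normalize along W
y_last = torch.nn.Softmax(dim=-1)(x)  # -1 means the last dimension

# For a 4-D tensor, dim=-1 and dim=3 are the same dimension.
assert torch.equal(y_w, y_last)

# Each row (fixed B, C, H) sums to 1.
row_sums = y_w.sum(dim=3)
assert torch.allclose(row_sums, torch.ones_like(row_sums))

# Adjacent values differ by 1, so every row is roughly [0.2689, 0.7311],
# i.e. 1/(1+e) and e/(1+e).
print(y_w[0, 0])
```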