PyTorch's torch.nn.functional provides a softmax function. It takes a tensor and a dim argument, the dimension along which softmax is applied (if dim is omitted, recent PyTorch versions emit a deprecation warning and choose a dimension implicitly, so it is best to pass it explicitly). A simple example:
import torch
import torch.nn.functional as F
# Example1
t = torch.Tensor([1, 2, 3])
F.softmax(t, dim=0)
# Out: tensor([0.0900, 0.2447, 0.6652])
Internally, softmax is not computed from the textbook formula directly: it first subtracts an offset (shift), the maximum of the input, before exponentiating. This leaves the result unchanged but avoids numerical overflow. Returning to Example 1, the computation is equivalent to:
import math

def soft_fn(x):
    # Subtract the maximum (the "shift") before exponentiating
    max_x = max(x)
    soft = [math.exp(v - max_x) for v in x]
    soft_sum = sum(soft)
    return [s / soft_sum for s in soft]
soft_fn(t)
# Out: [0.09003057317038046, 0.24472847105479764, 0.6652409557748218]
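The shift matters because exp overflows for even moderately large inputs, while subtracting the maximum makes the largest exponent exactly 0. A minimal pure-Python sketch (naive_softmax and stable_softmax are illustrative names, not PyTorch APIs):

```python
import math

def naive_softmax(x):
    # Textbook definition: exponentiates the raw values, overflows easily
    exps = [math.exp(v) for v in x]
    s = sum(exps)
    return [e / s for e in exps]

def stable_softmax(x):
    # Shifted version: subtract max(x) so the largest exponent is exp(0) = 1
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    s = sum(exps)
    return [e / s for e in exps]

big = [1000.0, 1001.0, 1002.0]
stable_softmax(big)   # ≈ [0.0900, 0.2447, 0.6652], same ratios as [1, 2, 3]
# naive_softmax(big) raises OverflowError: math.exp(1000) exceeds the float range
```

Note that softmax is invariant under a constant shift of its input, which is why `[1000, 1001, 1002]` yields the same result as `[1, 2, 3]` in Example 1.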
That covers the one-dimensional case; now a more complex example:
a = torch.Tensor([[[1,6,2,10], [3,5,7,5], [2,6,3,4]], [[6,2,4,7], [6,3,2,6], [7,2,6,0]]])
a.shape
# Out: torch.Size([2, 3, 4])
# a:
# tensor([[[ 1., 6., 2., 10.],
# [ 3., 5., 7., 5.],
# [ 2., 6., 3., 4.]],
# [[ 6., 2., 4., 7.],
# [ 6., 3., 2., 6.],
# [ 7., 2., 6., 0.]]])
F.softmax(a, dim=2)
# tensor([[[1.2114e-04, 1.7978e-02, 3.2928e-04, 9.8157e-01],
# [1.4209e-02, 1.0499e-01, 7.7580e-01, 1.0499e-01],
# [1.5219e-02, 8.3095e-01, 4.1371e-02, 1.1246e-01]],
# [[2.5827e-01, 4.7304e-03, 3.4953e-02, 7.0205e-01],
# [4.8353e-01, 2.4074e-02, 8.8563e-03, 4.8353e-01],
# [7.2699e-01, 4.8984e-03, 2.6745e-01, 6.6293e-04]]])
F.softmax(a, dim=2)[0, 0]
# Out: tensor([1.2114e-04, 1.7978e-02, 3.2928e-04, 9.8157e-01])
soft_fn(a[0, 0])
# Out: [0.0001211355434547465, 0.01797810868372637, 0.0003292805465535484, 0.9815714752262653]
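To make the role of dim concrete, here is a pure-Python sketch (softmax_lastdim is an illustrative helper, not a PyTorch API) that applies the shifted softmax along the last axis of a nested list, matching F.softmax(a, dim=2):

```python
import math

def softmax1d(x):
    # Shifted softmax on a flat list
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    s = sum(exps)
    return [e / s for e in exps]

def softmax_lastdim(a):
    # Recurse until reaching 1-D rows, then apply the shifted softmax to each
    if isinstance(a[0], list):
        return [softmax_lastdim(row) for row in a]
    return softmax1d(a)

a = [[[1, 6, 2, 10], [3, 5, 7, 5], [2, 6, 3, 4]],
     [[6, 2, 4, 7], [6, 3, 2, 6], [7, 2, 6, 0]]]
out = softmax_lastdim(a)
# out[0][0] ≈ [1.21e-04, 1.80e-02, 3.29e-04, 9.82e-01],
# matching F.softmax(a, dim=2)[0, 0] above; every innermost row sums to 1
```

Softmax along dim=0 or dim=1 works the same way, except that the values being normalized are gathered across that axis instead of the innermost one.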