1. The softmax formula
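For a vector \(x\), softmax exponentiates each entry and normalizes by the sum of exponentials, producing a probability distribution:

\[
\mathrm{softmax}(x)_i = \frac{e^{x_i}}{\sum_j e^{x_j}}
\]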
2. Computing it with PyTorch
import torch
import torch.nn as nn
import numpy as np

n1 = np.array([[1, 2, 3], [1000, 0, -1000]], dtype=np.float32)
m = nn.Softmax(dim=1)               # softmax along dim 1, i.e. across each row
out = m(torch.from_numpy(n1))
print(out)
Running this prints:
tensor([[0.0900, 0.2447, 0.6652],
        [1.0000, 0.0000, 0.0000]])
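Note that the second row involves exp(1000), which is far beyond the range of any float, yet PyTorch still returns a finite, sensible answer. The next section shows why a naive implementation fails here, and how to reproduce PyTorch's behavior.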
3. Computing it in plain Python
A naive implementation:
import math
import numpy as np

n1 = np.array([[1, 2, 3], [1000, 0, -1000]], dtype=np.float32)

def softmax(inp):
    for b in inp:                           # one row at a time
        tmp = [math.exp(i) for i in b]      # exponentiate each element
        s = sum(tmp)
        print([i / s for i in tmp])         # normalize the row

softmax(n1)
Result:
[0.09003057317038046, 0.24472847105479767, 0.6652409557748219]
Traceback (most recent call last):
  File "softmax.py", line 20, in <module>
    softmax(n1)
  File "softmax.py", line 16, in softmax
    tmp = [math.exp(i) for i in b]
  File "softmax.py", line 16, in <listcomp>
    tmp = [math.exp(i) for i in b]
OverflowError: math range error
The output shows that the first row matches PyTorch's result, but computing the second row raises an overflow, because exp(1000) is too large: Python floats are IEEE-754 doubles with a maximum of about 1.8e308, so math.exp overflows for any argument above roughly 709.8.
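That threshold is easy to verify (a quick sketch; the exact boundary is log(sys.float_info.max)):

import math
import sys

print(math.log(sys.float_info.max))   # ≈ 709.78, the largest safe argument
print(math.exp(709))                  # ≈ 8.2e307, still fits in a double
try:
    math.exp(710)                     # just past the limit
except OverflowError as e:
    print(e)                          # "math range error"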
If we divide both the numerator and the denominator of the softmax formula by exp(max(xi)), the result is unchanged, and the overflow disappears.
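In symbols, with \(M = \max_k x_k\):

\[
\mathrm{softmax}(x)_i \;=\; \frac{e^{x_i}/e^{M}}{\sum_j e^{x_j}/e^{M}}
\;=\; \frac{e^{x_i - M}}{\sum_j e^{x_j - M}}
\]

The largest exponent is now \(x_i - M = 0\), so no term can overflow.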
The code needs only a small change:
import math
import numpy as np

n1 = np.array([[1, 2, 3], [1000, 0, -1000]], dtype=np.float32)

def softmax(inp):
    for b in inp:
        maxi = max(b)                           # row maximum
        tmp = [math.exp(i - maxi) for i in b]   # exponents are now <= 0, no overflow
        s = sum(tmp)
        print([i / s for i in tmp])

softmax(n1)
Now both rows come out correctly:
[0.09003057317038046, 0.24472847105479764, 0.6652409557748218]
[1.0, 0.0, 0.0]
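The row-by-row loop is fine for illustration, but the same max-subtraction trick vectorizes naturally in NumPy. A minimal sketch (softmax_np is my own name, not a library function):

import numpy as np

def softmax_np(x, axis=1):
    shifted = x - x.max(axis=axis, keepdims=True)   # largest entry per row becomes 0
    e = np.exp(shifted)                             # safe: all exponents are <= 0
    return e / e.sum(axis=axis, keepdims=True)      # normalize each row to sum to 1

n1 = np.array([[1, 2, 3], [1000, 0, -1000]], dtype=np.float32)
print(softmax_np(n1))   # matches the nn.Softmax output above

Note that exp(x - max) can still underflow to 0.0 for very negative entries, but that is harmless: those probabilities are genuinely negligible, which is why the second row comes out as [1.0, 0.0, 0.0]. This max-subtraction shift is also, in effect, what nn.Softmax does internally, which is why PyTorch handled exp(1000) without complaint.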