To get class probabilities out of the network, all you need is a Softmax as the final layer. Take LeNet-5 as an example:
import torch.nn as nn
from collections import OrderedDict


class LeNet5(nn.Module):
    """
    Input - 1x32x32
    C1 - 6@28x28 (5x5 kernel)
    ReLU
    S2 - 6@14x14 (2x2 kernel, stride 2) Subsampling
    C3 - 16@10x10 (5x5 kernel; the original paper used a sparse
         connection table, here it is a plain convolution)
    ReLU
    S4 - 16@5x5 (2x2 kernel, stride 2) Subsampling
    C5 - 120@1x1 (5x5 kernel)
    ReLU
    F6 - 84
    ReLU
    F7 - 10 (Output)
    """

    def __init__(self):
        super(LeNet5, self).__init__()
        self.convnet = nn.Sequential(OrderedDict([
            ('c1', nn.Conv2d(1, 6, kernel_size=(5, 5))),
            ('relu1', nn.ReLU()),
            ('s2', nn.MaxPool2d(kernel_size=(2, 2), stride=2)),
            ('c3', nn.Conv2d(6, 16, kernel_size=(5, 5))),
            ('relu3', nn.ReLU()),
            ('s4', nn.MaxPool2d(kernel_size=(2, 2), stride=2)),
            ('c5', nn.Conv2d(16, 120, kernel_size=(5, 5))),
            ('relu5', nn.ReLU())
        ]))
        self.fc = nn.Sequential(OrderedDict([
            ('f6', nn.Linear(120, 84)),
            ('relu6', nn.ReLU()),
            ('f7', nn.Linear(84, 10)),
            ('sig7', nn.Softmax(dim=-1))
        ]))

    def forward(self, img):
        output = self.convnet(img)
        # flatten the 120x1x1 feature map before the fully connected layers
        output = output.view(img.size(0), -1)
        output = self.fc(output)
        return output
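
As a quick sanity check (a minimal sketch; the batch size and random input are arbitrary, not from the original post), every row of the Softmax output should lie in [0, 1] and sum to 1:

import torch

model = LeNet5()
img = torch.randn(4, 1, 32, 32)  # dummy batch of 4 grayscale 32x32 images
probs = model(img)

print(probs.shape)        # torch.Size([4, 10])
print(probs.sum(dim=-1))  # each row sums to ~1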
In practice, though, most implementations use LogSoftmax instead, which is more numerically stable than Softmax:

('sig7', nn.LogSoftmax(dim=-1))
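
The stability issue is easy to reproduce. Below is a minimal sketch (the hand-picked logits are my own, not from the original post): with extreme logits, the small probabilities underflow to 0, so taking torch.log on top of Softmax produces -inf, while nn.LogSoftmax evaluates log(softmax(x)) via the log-sum-exp trick and stays finite.

import torch
import torch.nn as nn

logits = torch.tensor([[-1000.0, 0.0, 1000.0]])

# Naive log of softmax: the small probabilities underflow to 0,
# and log(0) = -inf, which poisons gradients.
naive = torch.log(nn.Softmax(dim=-1)(logits))
print(naive)   # tensor([[-inf, -inf, 0.]])

# LogSoftmax computes the same quantity stably: x - logsumexp(x)
stable = nn.LogSoftmax(dim=-1)(logits)
print(stable)  # tensor([[-2000., -1000., 0.]])

This is also why a LogSoftmax output is normally paired with nn.NLLLoss during training (LogSoftmax + NLLLoss together are equivalent to nn.CrossEntropyLoss on raw logits).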
But in that case, how do we get values back in [0, 1] that sum to 1?
See this thread: https://discuss.pytorch.org/t/cnn-results-negative-when-using-log-softmax-and-nll-loss/16839
Since you are using the logarithm on softmax, you will get numbers in [-inf, 0], since log(0)=-inf and log(1)=0. You could get the probabilities back by using torch.exp(output).
So we just apply torch.exp to undo the log. Here is what the recovered output looks like:
tensor([3.6305e-09, 4.1472e-10, 1.7295e-07, 2.2230e-06, 5.6530e-08, 1.1103e-08,
2.7662e-14, 9.9999e-01, 3.7742e-07, 5.3025e-06], grad_fn=<ExpBackward>)
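
A minimal sketch of the round trip (assuming the variant of LeNet5 above whose last layer is nn.LogSoftmax; the dummy input is arbitrary):

import torch

model = LeNet5()  # variant ending in nn.LogSoftmax(dim=-1)
log_probs = model(torch.randn(1, 1, 32, 32))  # values in [-inf, 0]

probs = torch.exp(log_probs)  # exp undoes the log
print(probs.sum(dim=-1))      # back in [0, 1], sums to ~1

Note that torch.exp is only needed when you want human-readable probabilities; for training, feed the LogSoftmax output straight into nn.NLLLoss.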