While running an object retrieval experiment, I found that applying a softmax to the last-layer features improved retrieval performance. Some code to verify what is going on:
import torch
import torch.nn.functional as F
import scipy.spatial

# Five 4-dimensional feature vectors.
A = torch.tensor([[13.0, 2.0, 5.0, 3.0],
                  [4.0, -1.0, 8.0, 6.0],
                  [2.0, 6.0, 7.0, 0.0],
                  [5.0, 8.0, 6.0, -2.0],
                  [1.0, 12.0, 5.0, 0.0]])

# Pairwise cosine distances on the raw features.
B = scipy.spatial.distance.cdist(A, A, 'cosine')
print('Distance before softmax:', B)

# Row-wise softmax, then the same distance computations.
C = F.softmax(A, dim=1)
D = scipy.spatial.distance.cdist(C, C, 'cosine')
print('Cosine distance after softmax:', D)
E = scipy.spatial.distance.cdist(C, C, 'euclidean')
print('Euclidean distance after softmax:', E)
-------------------------------
Output:
Distance before softmax:
[[0.00000000e+00 3.06022082e-01 4.62172897e-01 3.57446533e-01
6.69491939e-01]
[3.06022082e-01 1.11022302e-16 4.31618336e-01 6.09290968e-01
7.73100997e-01]
[4.62172897e-01 4.31618336e-01 0.00000000e+00 6.67239058e-02
1.13850424e-01]
[3.57446533e-01 6.09290968e-01 6.67239058e-02 0.00000000e+00
1.15389724e-01]
[6.69491939e-01 7.73100997e-01 1.13850424e-01 1.15389724e-01
2.22044605e-16]]
Cosine distance after softmax:
[[0. 0.98151435 0.99335588 0.95066118 0.99996629]
[0.98151435 0. 0.06987087 0.86626091 0.9989731 ]
[0.99335588 0.06987087 0. 0.53226189 0.6538935 ]
[0.95066118 0.86626091 0.53226189 0. 0.01011525]
[0.99996629 0.9989731 0.6538935 0.01011525 0. ]]
Euclidean distance after softmax:
[[0. 1.31608983 1.26055004 1.28134947 1.41324882]
[1.31608983 0. 0.32360164 1.13687655 1.32723565]
[1.26055004 0.32360164 0. 0.84204801 1.03077042]
[1.28134947 1.13687655 0.84204801 0. 0.19676098]
[1.41324882 1.32723565 1.03077042 0.19676098 0. ]]
Softmax is strictly monotonic, so it does not change the rank order of elements within each row of the matrix. Its effect on the distances, however, is substantial: the nearest-neighbour ordering induced by the distance matrix changes between the raw features and the softmaxed ones.
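Both halves of this claim can be checked directly on the matrix from the experiment above; a minimal sketch:

```python
import torch
import torch.nn.functional as F
import scipy.spatial

# Same feature matrix as in the experiment above.
A = torch.tensor([[13.0, 2.0, 5.0, 3.0],
                  [4.0, -1.0, 8.0, 6.0],
                  [2.0, 6.0, 7.0, 0.0],
                  [5.0, 8.0, 6.0, -2.0],
                  [1.0, 12.0, 5.0, 0.0]])
C = F.softmax(A, dim=1)

# Softmax is strictly monotonic, so element ranks within each row survive.
assert torch.equal(A.argsort(dim=1), C.argsort(dim=1))

# The neighbour ranking by cosine distance does change, however:
# for the query A[0], neighbours 1 and 3 swap places after softmax.
before = scipy.spatial.distance.cdist(A, A, 'cosine')[0].argsort()
after = scipy.spatial.distance.cdist(C, C, 'cosine')[0].argsort()
print(before)  # [0 1 3 2 4]
print(after)   # [0 3 1 2 4]
```

The two argsort results match the distance tables printed above: before softmax, feature 1 is the nearest neighbour of feature 0; after softmax it is feature 3.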
A possible explanation for the performance gain: softmax pulls each feature closer to its corresponding class center while pushing it away from the centers of unrelated classes.
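One way to make this intuition concrete: softmax sharpens each row toward a one-hot vector. Treating the one-hot basis vectors as hypothetical class centers (an assumption for illustration only, not something measured in the experiment), the cosine distance from a feature to its "own" center shrinks after softmax while the distances to all other centers grow:

```python
import torch
import torch.nn.functional as F

# One feature vector whose largest logit is in dimension 0.
a = torch.tensor([[13.0, 2.0, 5.0, 3.0]])
centers = torch.eye(4)  # hypothetical class centers: the one-hot vectors

def cos_dist(x, y):
    # Cosine distance = 1 - cosine similarity, computed row-wise.
    return 1 - F.cosine_similarity(x, y, dim=1)

c = F.softmax(a, dim=1)  # nearly one-hot: mass concentrates on dim 0
d_before = cos_dist(a.expand(4, -1), centers)
d_after = cos_dist(c.expand(4, -1), centers)
print(d_before)  # distance to center 0 is smallest, but nonzero
print(d_after)   # near 0 for center 0, near 1 for all other centers

assert d_after[0] < d_before[0]            # pulled toward its own center
assert (d_after[1:] > d_before[1:]).all()  # pushed away from the rest
```

Under this reading, softmax acts as a contrast enhancer: it exaggerates whichever class dimension already dominates, which would explain why the retrieval ranking tightens around same-class neighbours.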