The softmax function

Latest recommended article published on 2024-05-18 13:01:50

白鸥何处去


Tags: deep learning, python

Article link: https://blog.csdn.net/xiaocainiao521521/article/details/119532260


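When I pulled the softmax function out to use it on its own, I ran into a surprising number of bugs, so I tested several implementations I found online and compared them.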

One-dimensional case

For a one-dimensional vector this is simple, and the following code can be used directly:

import numpy as np

def softmax(x):
    # Subtract the maximum before exponentiating to avoid overflow
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum(axis=0)
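Subtracting np.max(x) is what keeps this version numerically stable. The sketch below compares it with a version that skips the subtraction. The names `naive_softmax` and `stable_softmax` are hypothetical and introduced only for this illustration, and the input values are chosen only to force an overflow in float64.

```python
import numpy as np

def naive_softmax(x):
    # No max subtraction: np.exp overflows to inf for large inputs.
    e_x = np.exp(x)
    return e_x / e_x.sum()

def stable_softmax(x):
    # Shifting by the maximum does not change the result mathematically,
    # but it keeps every exponent <= 0, so np.exp cannot overflow.
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum()

x = np.array([1000.0, 1001.0, 1002.0])

# Silence the overflow / invalid-value warnings raised by the naive version.
with np.errstate(over="ignore", invalid="ignore"):
    naive = naive_softmax(x)   # inf / inf gives nan

stable = stable_softmax(x)

print("naive: ", naive)
print("stable:", stable)
```

Because softmax is invariant to adding a constant to every element, the stable result for this input equals the softmax of [0, 1, 2].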

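Two-dimensional case

For two-dimensional arrays, softmax can be written in the following three ways (the first two versions are taken from https://www.cnblogs.com/programmerwang/p/14772552.html):

First version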

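def softmax2_1(x):
    print("Input:\n", x)
    print("ndim of x:", x.ndim)
    if x.ndim == 2:
        # Transpose (n, m) to (m, n) so that the axis=0 reductions below,
        # which have shape (n,), broadcast against it
        x = x.T
        print("x after transposing:\n", x)
        x = x - np.max(x, axis=0)  # overflow guard
        print("After the overflow guard:\n", x)
        y = np.exp(x) / np.sum(np.exp(x), axis=0)
        print("Final result:\n", y.T)
        return y.T

    x = x - np.max(x)  # overflow guard
    print("1-D case after the overflow guard:\n", x)
    y = np.exp(x) / np.sum(np.exp(x))
    print("Final result:", y)
    return y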

x1 = np.array([[1.,3.3,2.5],[2.1, 3.2 , 5.3]])

x2 = np.array([1.0, 2.2, 3.3])

x3 = np.array([[1.0, 2.2, 3.3]])

x4 = np.array([[1.0], [2.2], [3.3]])

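An input like x4 would not occur in practice, because a single score per row is not a multi-class classification task; it is included here only to test the code.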
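To see why a single-column input is not a meaningful classification test, note that each row then holds only one score, so its softmax is always 1. The sketch below reuses the transposition approach of the first version without the debug prints. The name `softmax_rows` is a hypothetical helper for this illustration, not code from the referenced post.

```python
import numpy as np

def softmax_rows(x):
    # Row-wise softmax for a 2-D array, using the same transposition trick
    # as softmax2_1: after x.T, each original row is a column.
    x = x.T
    x = x - np.max(x, axis=0)                  # overflow guard
    y = np.exp(x) / np.sum(np.exp(x), axis=0)
    return y.T

x4 = np.array([[1.0], [2.2], [3.3]])           # shape (3, 1): one score per row
y4 = softmax_rows(x4)

print(y4.shape)
print(y4)   # each row is a one-element distribution
```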

Second version

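def softmax2_2(x):
    print("Input:\n", x)
    print("ndim of x:", x.ndim)
    temp = np.max(x, axis=1)
    #print(temp.shape)
    # Reshape the row maxima from (n,) to (n, 1) so they broadcast across columns
    temp = temp.reshape(temp.size, 1)

    x = x - temp
    print("x after the overflow guard:\n", x)
    temp2 = np.sum(np.exp(x), axis=1)
    print(temp2)
    # Reshape the row sums to (n, 1) for the same reason
    temp2 = temp2.reshape(temp2.size, 1)
    y = np.exp(x) / temp2
    print("Final result:\n", y)
    return y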

x1 = np.array([[1.,3.3,2.5],[2.1, 3.2 , 5.3]])

x2 = np.array([1.0, 2.2, 3.3])

# Runtime error with x2: numpy.AxisError: axis 1 is out of bounds for array of dimension 1

x3 = np.array([[1.0, 2.2, 3.3]])

x4 = np.array([[1.0], [2.2], [3.3]])
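The AxisError for x2 comes from hard-coding axis=1, which a 1-D array does not have. One possible way to accept both 1-D and 2-D inputs is to reduce along axis=-1 with keepdims=True, assuming softmax should always run along the last axis. This is a sketch; `softmax_last_axis` is a hypothetical name and not code from the referenced post.

```python
import numpy as np

def softmax_last_axis(x):
    x = np.asarray(x, dtype=float)
    # keepdims=True keeps the reduced axis with length 1, so the result
    # broadcasts back against x for both 1-D and 2-D inputs.
    shifted = x - np.max(x, axis=-1, keepdims=True)
    e_x = np.exp(shifted)
    return e_x / np.sum(e_x, axis=-1, keepdims=True)

x1 = np.array([[1., 3.3, 2.5], [2.1, 3.2, 5.3]])
x2 = np.array([1.0, 2.2, 3.3])

y1 = softmax_last_axis(x1)
y2 = softmax_last_axis(x2)

print(y1.shape, y1.sum(axis=1))   # each row sums to 1
print(y2.shape, y2.sum())         # the whole vector sums to 1
```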

Third version

The third version is incorrect code:

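def softmax2_3(x):
    print("Input:\n", x)
    print("ndim of x:", x.ndim)
    row_max = np.max(x, axis=1)
    print("Maximum of each row:", row_max, "shape:", row_max.shape)
    # Bug (kept on purpose): row_max has shape (n,), which does not
    # broadcast against x of shape (n, m)
    y = x - row_max
    return np.exp(y) / np.sum(np.exp(y), axis=1)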

x1 = np.array([[1.,3.3,2.5],[2.1, 3.2 , 5.3]])

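Runtime error: ValueError: operands could not be broadcast together with shapes (2,3) (2,)

In general, the input x is a matrix with n rows and m columns, and taking the maximum of each row (axis=1) yields an array of shape (n,). NumPy cannot broadcast (n, m) against (n,) unless n equals m, and in that case the subtraction runs but aligns the maxima with columns instead of rows, so the result is still wrong.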
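The shapes involved can be checked directly. The sketch below also shows the square-matrix case, where no error is raised but the shift is applied along the wrong axis. The array `xs` and the names `bad` and `good` are made up for this illustration.

```python
import numpy as np

x1 = np.array([[1., 3.3, 2.5], [2.1, 3.2, 5.3]])

row_max = np.max(x1, axis=1)
print(row_max.shape)                  # (2,)

# Reshaping to a column, as the second version does, makes it broadcastable.
row_max_col = row_max.reshape(-1, 1)
print(row_max_col.shape)              # (2, 1)
print((x1 - row_max_col).shape)       # (2, 3)

# Square input: the (n,) vector broadcasts without error, but column j
# is shifted by the maximum of row j instead of each row by its own maximum.
xs = np.array([[1., 2.], [3., 5.]])
bad = xs - np.max(xs, axis=1)
good = xs - np.max(xs, axis=1).reshape(-1, 1)

print(bad)
print(good)   # the largest entry of every row is now 0
```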

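Note: in NumPy, when a one-dimensional vector is added to or subtracted from a two-dimensional matrix, the length of the vector must equal the number of columns of the matrix (or be 1). The vector is then added to or subtracted from every row of the matrix. For example: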

>>> x = np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12]])
>>> print(x)  #(4,3)
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
>>> y = np.array([1,2,3])
>>> print(x-y) 
[[0 0 0]
 [3 3 3]
 [6 6 6]
 [9 9 9]]
>>> z = np.array([1,2,3,4])
>>> print(x-z)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: operands could not be broadcast together with shapes (4,3) (4,)
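If the intent is to subtract one value per row, the length-4 vector can be turned into a column of shape (4, 1), which does broadcast against (4, 3). A minimal sketch using the same x and z as the session above; `z_col` and `result` are names introduced here:

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])   # shape (4, 3)
z = np.array([1, 2, 3, 4])                                      # shape (4,)

# Add a trailing axis so z[i] is subtracted from every element of row i.
z_col = z[:, np.newaxis]                                        # shape (4, 1)
result = x - z_col

print(z_col.shape)
print(result)
```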

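References:

有关softmax函数代码实现的思考 (Thoughts on implementing the softmax function in code), CuriosityWang, 博客园 (cnblogs.com)

softmax函数的实现-解析为什么使用矩阵转置的方式 (Implementing the softmax function: why the matrix-transpose approach is used), w199611027017, CSDN blog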
