【CV Fundamentals】Softmax and CrossEntropy

Softmax

The Softmax function takes an N-dimensional vector (or an M x N array, where M is the number of samples and N is the number of categories) as input and maps each element to a real number in (0, 1):
$$p_{i}=\frac{e^{a_{i}}}{\sum_{k=1}^{N} e^{a_{k}}}$$
To keep the computation numerically stable and avoid NaN results, the input vector is usually shifted by its maximum value before exponentiation. The numerically stable Softmax formula is:
$$p_{i}=\frac{e^{a_{i}-\max(a)}}{\sum_{k=1}^{N} e^{a_{k}-\max(a)}}$$
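The shift changes nothing mathematically (the common factor $e^{-\max(a)}$ cancels), but it keeps every exponent at or below zero. The following minimal sketch, with made-up values, shows the naive form overflowing while the shifted form behaves:

import numpy as np

a = np.array([1000.0, 1000.5, 1001.0])

# Naive softmax: e^1000 overflows to inf (NumPy emits a RuntimeWarning),
# and inf / inf evaluates to nan.
naive = np.exp(a) / np.sum(np.exp(a))
print(naive)    # [nan nan nan]

# Stable softmax: subtracting max(a) keeps every exponent <= 0.
shifted = a - np.max(a)
stable = np.exp(shifted) / np.sum(np.exp(shifted))
print(stable)   # [0.1863... 0.3072... 0.5064...]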
The derivative of the Softmax output $p_{i}$ with respect to the input $a_{j}$ is:
$$\frac{\partial p_{i}}{\partial a_{j}}=\begin{cases}p_{i}\left(1-p_{j}\right) & \text{if } i=j \\ -p_{j} \cdot p_{i} & \text{if } i \neq j\end{cases}$$
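This case split is just the Jacobian $J_{ij}=p_{i}(\delta_{ij}-p_{j})$. As a quick sanity check, the sketch below (softmax_vec and the sample vector are names and values chosen here for illustration) compares the analytic Jacobian with a central finite-difference estimate:

import numpy as np

def softmax_vec(a):
    """Numerically stable softmax of a 1-D vector."""
    e = np.exp(a - np.max(a))
    return e / e.sum()

a = np.array([0.1, 1.5, -0.3, 2.2])
p = softmax_vec(a)

# Analytic Jacobian: J[i, j] = p_i * (delta_ij - p_j)
jac = np.diag(p) - np.outer(p, p)

# Central finite-difference estimate of dp_i / da_j
eps = 1e-6
jac_num = np.zeros_like(jac)
for j in range(len(a)):
    a_plus, a_minus = a.copy(), a.copy()
    a_plus[j] += eps
    a_minus[j] -= eps
    jac_num[:, j] = (softmax_vec(a_plus) - softmax_vec(a_minus)) / (2 * eps)

print(np.allclose(jac, jac_num, atol=1e-8))  # True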

CrossEntropy

CrossEntropy (cross entropy) is usually paired with Softmax as the classification loss, i.e. the cross-entropy loss function. It measures how close the probability distribution predicted by the model is to the true distribution of the sample. It is defined as follows, where $y_{i}$ denotes the one-hot encoded label:
$$L=H(y, p)=-\sum_{i} y_{i} \log \left(p_{i}\right)$$
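Because $y$ is one-hot, only the term of the true class survives, so the loss reduces to $-\log$ of the probability the model assigns to that class. A tiny sketch with made-up numbers:

import numpy as np

p = np.array([0.1, 0.7, 0.2])   # softmax output for one sample
y = np.array([0, 1, 0])         # one-hot label: the true class is index 1
loss = -np.sum(y * np.log(p))
print(loss)                     # 0.3566... == -log(0.7)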
Composing the cross-entropy loss with the Softmax derivative above via the chain rule gives a remarkably simple gradient with respect to the Softmax input $a_{i}$:
$$\frac{\partial L}{\partial a_{i}}=p_{i}-y_{i}$$
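For completeness, here is a short derivation of this result; it only uses the two Softmax derivative cases above and the fact that a one-hot label satisfies $\sum_{j} y_{j}=1$:

$$\frac{\partial L}{\partial a_{i}}=-\sum_{j} \frac{y_{j}}{p_{j}} \frac{\partial p_{j}}{\partial a_{i}}=-\frac{y_{i}}{p_{i}} p_{i}\left(1-p_{i}\right)+\sum_{j \neq i} \frac{y_{j}}{p_{j}} p_{j} p_{i}=-y_{i}+p_{i} \sum_{j} y_{j}=p_{i}-y_{i}$$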


A Python implementation is given below:

# -*- coding: UTF-8 -*-
import numpy as np 

def softmax(X):
    """Compute the softmax of output from classification layer.
    Parameters
    ----------
    X: list.
        A array of M x N. M is the number of samples and N is the number of categories.
    Returns
    -------
    rst: list.
        The result of softmax.
    """
    # Subtract the row-wise max before exponentiating for numerical stability.
    exps = np.exp(X - np.max(X, axis=1).reshape(-1, 1))
    # Normalize each row so that the probabilities sum to 1.
    rst = exps / np.sum(exps, axis=1).reshape(-1, 1)
    return rst

def cross_entropy(X,y):
    """Compute the cross entropy loss and grad of output from softmax.
    Parameters
    ----------
    X: list.
        A array of M x N. M is the number of samples and N is the number of categories.
    y: list.
        A array of M x 1. The value is the label of GT.
    Returns
    -------
    loss: float.
        The loss of prediction and label.
    grad: list.
        The gradient produced by each element of the predicted value.
    """
    m = len(y)
    p = softmax(X)
    # Pick out the predicted probability of the ground-truth class of each sample.
    log_likelihood = -np.log(p[range(m), y])
    loss = np.sum(log_likelihood) / m

    # Gradient w.r.t. the logits: (p - y) averaged over samples, with y one-hot.
    grad = p.copy()  # copy so the probabilities are not modified in place
    grad[range(m), y] -= 1
    grad = grad / m
    return loss, grad

def main():
    X = [[0.1, 1.5, -0.3, 2.2, 0.7],
         [1.0, -2.3, 5.2, -0.1, 2.9],
         [-3.5, -1.1, 3.7, 0.2, 2.6]]
    y = [3,4,2]
    rst = softmax(X)
    print('softmax rst:\n',rst)
    print('softmax check:\n',rst.sum(axis=1).reshape(-1,1))
    loss, grad = cross_entropy(X,y)
    print('loss:',loss)
    print('grad:',grad)

if __name__ == "__main__":
    main()
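As a quick sanity check, the analytic gradient returned by cross_entropy can be compared against a central finite-difference estimate of the loss. The sketch below assumes the softmax and cross_entropy functions above are in scope; grad_check and eps are names introduced here for illustration:

import numpy as np

def grad_check(X, y, eps=1e-6):
    """Return the largest gap between the analytic and numerical gradients."""
    X = np.array(X, dtype=float)
    _, grad = cross_entropy(X, y)

    # Perturb one logit at a time and estimate the gradient numerically.
    grad_num = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            X_plus, X_minus = X.copy(), X.copy()
            X_plus[i, j] += eps
            X_minus[i, j] -= eps
            loss_plus, _ = cross_entropy(X_plus, y)
            loss_minus, _ = cross_entropy(X_minus, y)
            grad_num[i, j] = (loss_plus - loss_minus) / (2 * eps)

    return np.max(np.abs(grad - grad_num))

X = [[0.1, 1.5, -0.3, 2.2, 0.7],
     [1.0, -2.3, 5.2, -0.1, 2.9],
     [-3.5, -1.1, 3.7, 0.2, 2.6]]
y = [3, 4, 2]
print(grad_check(X, y))  # roughly 1e-10, i.e. the analytic gradient matches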

References

Softmax和交叉熵的深度解析和Python实现 (In-depth analysis of Softmax and cross entropy with a Python implementation)

