Loss roundup: BCE loss (binary_cross_entropy)

氵文大师

Published 2023-01-05 11:01:38

Article link: https://blog.csdn.net/HaoZiHuang/article/details/128557918

My torch version: 1.8.1+cu111
My paddle version: 2.4.1

The `forward` method of BCELoss simply calls nn.functional.binary_cross_entropy, so it is enough to look at that function.

# Paddle's BCELoss source code
class BCELoss(Layer):

    def __init__(self, weight=None, reduction='mean', name=None):
        if reduction not in ['sum', 'mean', 'none']:
            raise ValueError(
                "The value of 'reduction' in bce_loss should be 'sum', 'mean' or 'none', but "
                "received %s, which is not allowed." % reduction
            )

        super(BCELoss, self).__init__()
        self.weight = weight
        self.reduction = reduction
        self.name = name

    def forward(self, input, label):  # delegates to binary_cross_entropy
        out = paddle.nn.functional.binary_cross_entropy(
            input, label, self.weight, self.reduction, self.name
        )
        return out

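1. Loss formula

Suppose the input $\mathbf{x} = [x_1, x_2, ..., x_n]$ has $n$ components. Then $p(x_i)$ is the true probability that $\mathbf{x}$ belongs to class $i$, and $q(x_i)$ is the predicted probability that $\mathbf{x}$ belongs to class $i$. The cross-entropy between the two distributions is:

$H(p, q) = - \sum_{i=1}^{n} p(x_i) \log(q(x_i))$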

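Binary classification has only two classes, so the second class has true probability $1 - p(x_i)$ and predicted probability $1 - q(x_i)$, and the sum reduces to:
$\begin{aligned} H(p, q) &= - \sum_{i=1}^{n} p(x_i) \log(q(x_i)) \\ &= -p(x_i) \cdot \log(q(x_i)) - (1-p(x_i)) \cdot \log(1-q(x_i)) \end{aligned}$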

Writing $p(x_i)$ as the label value $y_{\_}$ and $q(x_i)$ as the predicted value $\hat{y}$ gives:
$\begin{aligned} H(p, q) &= - \sum_{i=1}^{n} p(x_i) \log(q(x_i)) \\ &= -p(x_i) \cdot \log(q(x_i)) - (1-p(x_i)) \cdot \log(1-q(x_i)) \\ &= -y_{\_} \cdot \log(\hat{y}) - (1 - y_{\_}) \cdot \log(1-\hat{y}) \end{aligned}$
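
As a quick numerical check of the derivation above, the sketch below evaluates the general cross-entropy sum over the two-class distributions $[y_{\_}, 1-y_{\_}]$ and $[\hat{y}, 1-\hat{y}]$ and compares it with the closed-form binary expression. The function names `cross_entropy` and `binary_cross_entropy_scalar` are illustrative only; they are not part of torch or paddle.

```python
import math


def cross_entropy(p, q):
    """General cross-entropy H(p, q) = -sum_i p_i * log(q_i)."""
    # Terms with p_i == 0 contribute nothing, so skip them to avoid log(0).
    return -sum(p_i * math.log(q_i) for p_i, q_i in zip(p, q) if p_i > 0)


def binary_cross_entropy_scalar(y_hat, y):
    """Two-class special case: -y*log(y_hat) - (1-y)*log(1-y_hat)."""
    return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))


# Illustrative values: label 1, predicted probability 0.8.
y, y_hat = 1.0, 0.8
general = cross_entropy([y, 1 - y], [y_hat, 1 - y_hat])
binary = binary_cross_entropy_scalar(y_hat, y)
print(general, binary)  # the two values should agree
```

For a label of 1 only the first term survives, so both expressions reduce to $-\log(\hat{y})$.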

$y_{\_}$ is an $m \times n$ matrix of $0/1$ values, and $\hat{y}$ is an $m \times n$ matrix with values between $0$ and $1$.

This matches the formula given in the documentation:

$\text{Out} = -1 \cdot (\text{label} \cdot \log(\text{input}) + (1 - \text{label}) \cdot \log(1 - \text{input}))$

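The `weight` argument multiplies the element-wise loss. It can be read as a weight for each element in a batch or, when it has one entry per column as in the experiment in the next section, as a weight for each class.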
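
To make the role of `weight` and `reduction` concrete, here is a minimal pure-Python sketch on nested lists. It assumes one weight per column (class) and that `'mean'` averages over all elements; the helper names `bce_elementwise` and `reduce_loss` are hypothetical and only mirror the behaviour described above, not the actual library implementations.

```python
import math


def bce_elementwise(pred, label, weight=None):
    """Element-wise BCE for a batch given as nested lists (rows = samples)."""
    out = []
    for pred_row, label_row in zip(pred, label):
        row = []
        for j, (q, y) in enumerate(zip(pred_row, label_row)):
            loss = -(y * math.log(q) + (1 - y) * math.log(1 - q))
            if weight is not None:
                loss *= weight[j]  # assumption: one weight per column (class)
            row.append(loss)
        out.append(row)
    return out


def reduce_loss(loss, reduction="mean"):
    """Apply the 'none' / 'sum' / 'mean' reductions to a nested-list loss."""
    if reduction not in ("sum", "mean", "none"):
        raise ValueError(f"unsupported reduction: {reduction}")
    if reduction == "none":
        return loss
    flat = [v for row in loss for v in row]
    total = sum(flat)
    return total if reduction == "sum" else total / len(flat)


# Illustrative data: 2 samples, 2 classes.
pred = [[0.9, 0.2], [0.4, 0.7]]
label = [[1.0, 0.0], [0.0, 1.0]]
weight = [2.0, 3.0]

unweighted = bce_elementwise(pred, label)
weighted = bce_elementwise(pred, label, weight)
print(reduce_loss(weighted, "none"))
print(reduce_loss(weighted, "sum"), reduce_loss(weighted, "mean"))
```

Each column of `weighted` is the matching column of `unweighted` scaled by its class weight, which is what broadcasting a length-$n$ weight against an $m \times n$ loss does.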

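2. Experiment code

The `binary_cross_entropy` functions in torch and paddle are aligned with each other. Below is the experiment code, together with a manual computation that does not call the API.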

# -*- coding: utf-8 -*-
"""
Created on Wed Jan  4 22:36:50 2023

@author: Ryan
"""

import numpy as np
import torch
import paddle


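# ----------- numpy inputs -----------
np.random.seed(1107)

# Assume batch size = 4 and 5 classes, each an independent 0/1 label.
# Despite the name, np_logit holds probabilities in [0, 1), not raw logits.
np_logit = np.random.rand(4, 5).astype("float32")
np_target = np.random.randint(2, size=(4, 5)).astype("float32")


# One weight per class (shape (5,)), broadcast across the batch
np_weight = np.random.randint(2, 4, size=(5,)).astype("float32")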


# ----------- torch -----------
t_logit = torch.tensor(np_logit)
t_target = torch.tensor(np_target)
t_weight = torch.tensor(np_weight)
t_out = torch.nn.functional.binary_cross_entropy(t_logit, t_target,
                                                 weight=t_weight,
                                                 reduction='none')

# Manual computation
t_out_hand = t_target * torch.log(t_logit) + (1-t_target) * torch.log(1-t_logit)
t_out_hand *= -1
t_out_hand = t_out_hand * t_weight


# ----------- paddle -----------
p_logit = paddle.to_tensor(np_logit)
p_target = paddle.to_tensor(np_target)
p_weight = paddle.to_tensor(np_weight)
p_out = paddle.nn.functional.binary_cross_entropy(p_logit, p_target,
                                                  weight=p_weight,
                                                  reduction='none')

# Manual computation
p_out_hand = p_target * paddle.log(p_logit) + (1-p_target) * paddle.log(1-p_logit)
p_out_hand *= -1
p_out_hand = p_out_hand * p_weight
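
The script computes the four results but does not compare them. The NumPy-only sketch below shows one way to do such a check: it rebuilds inputs with the same seed and shapes, evaluates the vectorized manual formula, and compares it with an element-by-element loop using `np.allclose`. The clipping step and the variable names are assumptions added for illustration; they are not part of the original experiment.

```python
import numpy as np

# Same seed and shapes as the experiment script.
rng = np.random.RandomState(1107)
pred = rng.rand(4, 5).astype("float32")
target = rng.randint(2, size=(4, 5)).astype("float32")
weight = rng.randint(2, 4, size=(5,)).astype("float32")

# Assumption: keep predictions strictly inside (0, 1) so log() stays finite.
pred = np.clip(pred, 1e-6, 1 - 1e-6)

# Vectorized form: weight of shape (5,) broadcasts over the batch dimension.
vectorized = -(target * np.log(pred) + (1 - target) * np.log(1 - pred)) * weight

# Element-by-element loop as an independent reference.
looped = np.empty_like(pred)
for i in range(pred.shape[0]):
    for j in range(pred.shape[1]):
        y, q = target[i, j], pred[i, j]
        looped[i, j] = -(y * np.log(q) + (1 - y) * np.log(1 - q)) * weight[j]

print(np.allclose(vectorized, looped))
```

The same `np.allclose` call can be applied to the script's own outputs, for example to `t_out.numpy()` and `p_out.numpy()`, or to each library result and its manual counterpart.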