TensorFlow Loss Functions and Custom Loss Functions (Part 2)

Article link: https://blog.csdn.net/limiyudianzi/article/details/80694614

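I am introducing TensorFlow's loss functions in three articles. This one covers the loss functions beyond the built-in ones, and it mainly follows the implementations in TensorLayer.
(1) The four built-in TensorFlow loss functions
(2) Other loss functions
(3) Custom loss functions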

TensorLayer wraps a lot of ready-made code and, as an open-source project, also publishes many code snippets. Let's see which loss functions exist besides the four built into TensorFlow.

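This article has three goals. The first is to show that many more loss functions exist. The second is to show that several platforms already provide excellent implementations of them, which you can borrow and learn from directly. The last is to help readers who need a custom loss function learn from "other people's code" how a loss function is defined, so they have a foundation for writing their own.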

Mean squared error loss

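Mean squared error (MSE) loss, also called the L2 (2-norm) loss, is one of the most basic loss functions in machine learning. When learning backpropagation, most people use MSE as the final loss and derive the gradients with the chain rule. Here is the TensorLayer implementation: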

def mean_squared_error(output, target, is_mean=False, name="mean_squared_error"):
    """Return the TensorFlow expression of mean-square-error (L2) of two batch of data.

    Parameters
    ----------
    output : Tensor
        2D, 3D or 4D tensor i.e. [batch_size, n_feature], [batch_size, height, width] or [batch_size, height, width, channel].
    target : Tensor
        The target distribution, format the same with `output`.
    is_mean : boolean
        Whether compute the mean or sum for each example.
            - If True, use ``tf.reduce_mean`` to compute the loss between one target and predict data.
            - If False, use ``tf.reduce_sum`` (default).

    References
    ------------
    - `Wiki Mean Squared Error <https://en.wikipedia.org/wiki/Mean_squared_error>`__

    """
    with tf.name_scope(name):
        if output.get_shape().ndims == 2:  # [batch_size, n_feature]
            if is_mean:
                mse = tf.reduce_mean(tf.reduce_mean(tf.squared_difference(output, target), 1))
            else:
                mse = tf.reduce_mean(tf.reduce_sum(tf.squared_difference(output, target), 1))
        elif output.get_shape().ndims == 3:  # [batch_size, w, h]
            if is_mean:
                mse = tf.reduce_mean(tf.reduce_mean(tf.squared_difference(output, target), [1, 2]))
            else:
                mse = tf.reduce_mean(tf.reduce_sum(tf.squared_difference(output, target), [1, 2]))
        elif output.get_shape().ndims == 4:  # [batch_size, w, h, c]
            if is_mean:
                mse = tf.reduce_mean(tf.reduce_mean(tf.squared_difference(output, target), [1, 2, 3]))
            else:
                mse = tf.reduce_mean(tf.reduce_sum(tf.squared_difference(output, target), [1, 2, 3]))
        else:
            raise Exception("Unknow dimension")
        return mse

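Here we can see how important good code comments are: they help readers understand the program and get started quickly. The author uses conditional branches to handle input tensors of different ranks, so the function is quite complete. The core of the computation is tf.squared_difference() (available as tf.math.squared_difference in TensorFlow 2.x). As a learning takeaway, if we later meet higher-dimensional data, such as 3D medical volumes (a 5D tensor, e.g. [batch_size, depth, height, width, channel]), we can compute the MSE loss with: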

 tf.reduce_mean(tf.reduce_sum(tf.squared_difference(output, target), [1, 2, 3, 4]))

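In other words, we only need to extend the list of axes passed to tf.reduce_sum(), so that the output of tf.squared_difference() is summed over every non-batch dimension before the mean is taken over the batch.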
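The per-rank branches can also be collapsed into one rank-agnostic rule: reduce over every axis except the batch axis. The following is a minimal NumPy sketch of that arithmetic, not TensorLayer's code. The function name mean_squared_error_nd is hypothetical, and np.sum / np.mean are used here as stand-ins for tf.reduce_sum / tf.reduce_mean.

```python
import numpy as np


def mean_squared_error_nd(output, target, is_mean=False):
    """Hypothetical rank-agnostic MSE: reduce over all non-batch axes."""
    output = np.asarray(output, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    if output.shape != target.shape:
        raise ValueError("output and target must have the same shape")
    if output.ndim < 2:
        raise ValueError("expected at least [batch_size, n_feature]")

    # Axes 1..ndim-1 are the per-example axes; axis 0 is the batch.
    axes = tuple(range(1, output.ndim))
    squared = (output - target) ** 2

    # Per-example reduction: mean or sum, mirroring the is_mean flag above.
    if is_mean:
        per_example = squared.mean(axis=axes)
    else:
        per_example = squared.sum(axis=axes)

    # Final reduction: average over the batch.
    return float(per_example.mean())


# 5D example: [batch_size, depth, height, width, channel]
output = np.ones((2, 3, 4, 5, 1))
target = np.zeros((2, 3, 4, 5, 1))

# Each example holds 3 * 4 * 5 * 1 = 60 squared errors of 1.0.
loss_sum = mean_squared_error_nd(output, target)
loss_mean = mean_squared_error_nd(output, target, is_mean=True)
```

With this rule the 2D, 3D, 4D and 5D cases all go through the same code path, at the cost of no longer rejecting unexpected ranks the way the original branches do.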

Dice coefficient loss

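The Dice coefficient is one of the common metrics for evaluating segmentation quality, and it can also serve as a loss function that measures the gap between the segmentation result and the label. Again, here is the TensorLayer implementation: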

def dice_coe(output, target, loss_type='jaccard', axis=(1, 2, 3), smooth=1e-5):
    """Soft dice (Sørensen or Jaccard) coefficient for comparing the similarity
    of two batch of data, usually be used for binary image segmentation
    i.e. labels are binary. The coefficient between 0 to 1, 1 means totally match.

    Parameters
    -----------
    output : Tensor
        A distribution with shape: [batch_size, ....], (any dimensions).
    target : Tensor
        The target distribution, format the same with `output`.
    loss_type : str
        ``jaccard`` or ``sorensen``, default is ``jaccard``.
    axis : tuple of int
        All dimensions are reduced, default ``[1,2,3]``.
    smooth : float
        This small value will be added to the numerator and denominator.
            - If both output and target are empty, it makes sure dice is 1.
            - If either output or target are empty (all pixels are background), dice = ``smooth/(small_value + smooth)``, then if smooth is very small, dice close to 0 (even the image values lower than the threshold), so in this case, higher smooth can have a higher dice.

    Examples
    ---------
    >>> outputs = tl.act.pixel_wise_softmax(network.outputs)
    >>> dice_loss = 1 - tl.cost.dice_coe(outputs, y_)

    References
    -----------
    - `Wiki-Dice <https://en.wikipedia.org/wiki/Sørensen–Dice_coefficient>`__

    """
    inse = tf.reduce_sum(output * target, axis=axis)
    if loss_type == 'jaccard':
        l = tf.reduce_sum(output * output, axis=axis)
        r = tf.reduce_sum(target * target, axis=axis)
    elif loss_type == 'sorensen':
        l = tf.reduce_sum(output, axis=axis)
        r = tf.reduce_sum(target, axis=axis)
    else:
        raise Exception("Unknow loss_type")
    dice = (2. * inse + smooth) / (l + r + smooth)
    dice = tf.reduce_mean(dice)
    return dice

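This loss is mostly used for segmentation tasks. In the jaccard form, the author computes the loss using only TensorFlow's built-in operations. Written out from the code above, with o the output, t the target, and \epsilon the smooth term, the Dice value for one example is:

$$\mathrm{dice} = \frac{2\sum_i o_i t_i + \epsilon}{\sum_i o_i^2 + \sum_i t_i^2 + \epsilon}$$

The sums run over the dimensions listed in axis, and the result is then averaged over the batch.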
Therefore, when we propose a new loss, in many cases we only need to call TensorFlow's built-in functions from Python to implement it, without modifying any low-level code.
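To make that concrete, here is a minimal NumPy sketch that mirrors the dice_coe arithmetic with a handful of array operations and then derives a loss from it. It is an illustration under assumptions, not TensorLayer's code: the names dice_coe_np and dice_loss_np are hypothetical, and np.sum / np.mean stand in for tf.reduce_sum / tf.reduce_mean. It also checks the behaviour of smooth described in the docstring.

```python
import numpy as np


def dice_coe_np(output, target, loss_type="jaccard", axis=(1, 2, 3), smooth=1e-5):
    """Hypothetical NumPy mirror of the soft Dice coefficient shown above."""
    output = np.asarray(output, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)

    # Soft intersection between prediction and label.
    inse = np.sum(output * target, axis=axis)
    if loss_type == "jaccard":
        l = np.sum(output * output, axis=axis)
        r = np.sum(target * target, axis=axis)
    elif loss_type == "sorensen":
        l = np.sum(output, axis=axis)
        r = np.sum(target, axis=axis)
    else:
        raise ValueError("Unknown loss_type")

    dice = (2.0 * inse + smooth) / (l + r + smooth)
    return float(np.mean(dice))  # average over the batch


def dice_loss_np(output, target, **kwargs):
    """A custom loss composed from the coefficient: 1 - dice."""
    return 1.0 - dice_coe_np(output, target, **kwargs)


# One 4x4 single-channel image with 4 foreground pixels.
mask = np.zeros((1, 4, 4, 1))
mask[0, :2, :2, 0] = 1.0
empty = np.zeros((1, 4, 4, 1))

perfect = dice_coe_np(mask, mask)        # identical masks
both_empty = dice_coe_np(empty, empty)   # smooth / smooth
one_empty = dice_coe_np(empty, mask)     # smooth / (4 + smooth), close to 0
loss_for_perfect = dice_loss_np(mask, mask)
```

The loss itself is a single line on top of the coefficient, which is the point made above: a new loss is usually a short composition of existing tensor operations.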