SSD Algorithm (TensorFlow Version) Explained, Part 2

Author: 太阳花的小绿豆

Article link: https://blog.csdn.net/qq_37541097/article/details/80933788

Computing the Loss Function

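The SSD loss function has two terms: (1) the class-prediction (confidence) loss and (2) the location-offset (localization) loss:

$$L(x, c, l, g) = \frac{1}{N}\left(L_{conf}(x, c) + \alpha L_{loc}(x, l, g)\right)$$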

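In the loss, N is the number of matched default boxes, i.e. the positive samples (the hard negatives are then chosen relative to this count); if N = 0 the loss is set to 0. L_loc, the location-offset loss, is a Smooth L1 loss between the offsets predicted by the network and the offsets from each default box to its matched ground-truth (GT) box. L_conf, the class-prediction loss, is a multi-class softmax loss. The weight α is set to 1. Smooth L1 loss is defined as:

$$\mathrm{smooth}_{L1}(x) = \begin{cases} 0.5x^2 & \text{if } |x| < 1 \\ |x| - 0.5 & \text{otherwise} \end{cases}$$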
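
To make the piecewise definition concrete, here is a minimal pure-Python sketch of Smooth L1. The function names `smooth_l1` and `smooth_l1_branch_free` are hypothetical and chosen only for illustration. The second function is an algebraically equivalent form without an `if`, which is convenient when the same expression has to be applied to whole tensors.

```python
def smooth_l1(x):
    """Smooth L1: quadratic for |x| < 1, linear elsewhere."""
    ax = abs(x)
    if ax < 1.0:
        return 0.5 * x * x
    return ax - 0.5


def smooth_l1_branch_free(x):
    """Equivalent form with no branch (illustrative, not taken from the article).

    For |x| < 1:  0.5 * ((|x| - 1) * |x| + |x|) = 0.5 * x^2
    For |x| >= 1: 0.5 * ((|x| - 1) * 1   + |x|) = |x| - 0.5
    """
    ax = abs(x)
    return 0.5 * ((ax - 1.0) * min(ax, 1.0) + ax)


for x in (-2.0, -0.5, 0.0, 0.5, 1.0, 2.0):
    print(x, smooth_l1(x), smooth_l1_branch_free(x))
```

Near zero the loss behaves like a squared error, so small offsets get small gradients, while large errors grow only linearly, which keeps outliers from dominating.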

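The L_loc loss is defined as:

$$L_{loc}(x, l, g) = \sum_{i \in Pos}^{N} \sum_{m \in \{cx, cy, w, h\}} x_{ij}^{k}\, \mathrm{smooth}_{L1}\left(l_i^m - \hat{g}_j^m\right)$$

$$\hat{g}_j^{cx} = \frac{g_j^{cx} - d_i^{cx}}{d_i^{w}}, \quad \hat{g}_j^{cy} = \frac{g_j^{cy} - d_i^{cy}}{d_i^{h}}, \quad \hat{g}_j^{w} = \log\frac{g_j^{w}}{d_i^{w}}, \quad \hat{g}_j^{h} = \log\frac{g_j^{h}}{d_i^{h}}$$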

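From the definition, the L_loc loss has four parts: the offset loss for the center coordinate cx, the offset loss for the center coordinate cy, the scaling loss for the width w, and the scaling loss for the height h. In the formula, l is the predicted offset, and ĝ is the offset of the matched GT box g relative to the default box d, which serves as the regression target. The indicator x_ij^k is 1 when default box i is matched to GT box j of class k and 0 otherwise.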
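
The four offset terms can be sketched in plain Python as follows. This is a minimal illustration of the formula from the SSD paper, assuming boxes are given as (cx, cy, w, h) tuples; the names `encode_box`, `decode_box`, and `box_loc_loss` are hypothetical, and real implementations may apply extra scaling factors to the offsets that are not shown here.

```python
import math


def smooth_l1(x):
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def encode_box(gt, default):
    """Offsets of a GT box relative to a default box, both as (cx, cy, w, h)."""
    g_cx, g_cy, g_w, g_h = gt
    d_cx, d_cy, d_w, d_h = default
    return (
        (g_cx - d_cx) / d_w,   # center x shift, in units of default width
        (g_cy - d_cy) / d_h,   # center y shift, in units of default height
        math.log(g_w / d_w),   # log width ratio
        math.log(g_h / d_h),   # log height ratio
    )


def decode_box(offsets, default):
    """Inverse of encode_box: recover (cx, cy, w, h) from offsets."""
    o_cx, o_cy, o_w, o_h = offsets
    d_cx, d_cy, d_w, d_h = default
    return (
        o_cx * d_w + d_cx,
        o_cy * d_h + d_cy,
        math.exp(o_w) * d_w,
        math.exp(o_h) * d_h,
    )


def box_loc_loss(predicted_offsets, target_offsets):
    """Sum of Smooth L1 over the four coordinates for one positive box."""
    return sum(smooth_l1(p - t) for p, t in zip(predicted_offsets, target_offsets))


default_box = (0.5, 0.5, 0.2, 0.4)
gt_box = (0.6, 0.5, 0.4, 0.4)
target = encode_box(gt_box, default_box)
print(target)
print(box_loc_loss((0.0, 0.0, 0.0, 0.0), target))
```

Because the targets are expressed relative to the default box size, the same network output means the same relative correction regardless of how large the default box is.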

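The L_conf multi-class softmax loss is defined as:

$$L_{conf}(x, c) = -\sum_{i \in Pos}^{N} x_{ij}^{p} \log\left(\hat{c}_i^{p}\right) - \sum_{i \in Neg} \log\left(\hat{c}_i^{0}\right), \quad \hat{c}_i^{p} = \frac{\exp(c_i^{p})}{\sum_{p} \exp(c_i^{p})}$$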

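From the definition, the L_conf loss has two parts: the loss over positive samples (Pos), which should predict their matched class, and the loss over negative samples (Neg), which should predict background (class 0).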
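
A small pure-Python sketch of the two parts, assuming class 0 is background and that each box comes with a list of raw class scores (logits). The names `softmax` and `conf_loss` are hypothetical helpers written for this illustration only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one box's class scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def conf_loss(logits_per_box, labels, pos_idx, neg_idx):
    """Positive part uses each box's matched class; negative part uses class 0."""
    loss = 0.0
    for i in pos_idx:
        loss -= math.log(softmax(logits_per_box[i])[labels[i]])
    for i in neg_idx:
        loss -= math.log(softmax(logits_per_box[i])[0])
    return loss


logits_per_box = [
    [0.1, 2.0, 0.3],   # box 0: positive, matched to class 1
    [3.0, 0.2, 0.1],   # box 1: negative, confidently background
    [0.2, 0.1, 2.5],   # box 2: negative, but looks like class 2 (a hard negative)
]
labels = [1, 0, 0]
print(conf_loss(logits_per_box, labels, pos_idx=[0], neg_idx=[1, 2]))
```

Box 2 contributes far more to the negative part than box 1, which is exactly the kind of sample hard negative mining tries to keep.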

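Next, let's look at the ssd_losses function in ssd_vgg_300.py. The part to watch is how negative samples are selected (the "Hard negative mining" section of the paper). Hard negative mining mainly aims to reduce false positives, i.e. background being detected as an object.

To quote an answer found via Baidu: in object detection the ground truth is labeled in advance, and the algorithm generates a set of proposals, some of which overlap the labeled ground truth and some of which do not. Proposals whose overlap (IoU) exceeds a threshold (typically 0.5) are treated as positive samples and the rest as negative samples, and both are fed to the network for training. The problem is that positives are usually far fewer than negatives, so the resulting classifier is limited and produces many false positives. The highest-scoring false positives are the hard negatives. Once they have been mined, they are fed back into training, which strengthens the classifier's ability to reject false positives.

In SSD specifically, the negative default boxes are sorted by confidence loss and only the top ones are kept, so that the ratio of negatives to positives is at most 3:1. This is what the negative_ratio=3. argument of ssd_losses corresponds to.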
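
The selection step can be sketched in a few lines of plain Python. This is a simplified illustration of the idea, not the code from ssd_vgg_300.py: the function name `select_hard_negatives` and its arguments are hypothetical, and "hardness" is measured by how low the predicted background probability is (a low background probability on a negative box means a high confidence loss).

```python
def select_hard_negatives(background_probs, is_positive, negative_ratio=3):
    """Return indices of the hardest negatives, at most negative_ratio per positive."""
    n_pos = sum(1 for p in is_positive if p)
    candidates = [i for i, p in enumerate(is_positive) if not p]
    # Hardest first: the lower the background probability, the harder the negative.
    candidates.sort(key=lambda i: background_probs[i])
    n_neg = min(int(negative_ratio * n_pos), len(candidates))
    return sorted(candidates[:n_neg])


is_positive = [True, False, False, False, False, False]
background_probs = [0.1, 0.9, 0.2, 0.95, 0.4, 0.6]
print(select_hard_negatives(background_probs, is_positive))  # -> [2, 4, 5]
```

With one positive and a ratio of 3, the three negatives with the lowest background probability (indices 2, 4, and 5) are kept, and the two easy negatives are ignored.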

def ssd_losses(logits, localisations,  # logits: predicted class scores; localisations: predicted box offsets
               gclasses, glocalisations, gscores,  # gclasses: target classes; glocalisations: target offsets; gscores: IoU with the matched GT box
               match_threshold=0.5,
               negative_ratio=3.,
               alpha=1.,
               label_smoothing=0.,
               device='/cpu:0',
               scope=None):
    with tf.name_scope(scope, 'ssd_losses'):
        lshape = tfe.get_shape(logits[0], 5)
        num_classes = lshape[-1]
        batch_size = lshape[0]

        # Flatten out all vectors!
        flogits = []
        fgclasses = []
        fgscores = []
        flocalisations = []
        fglocalisations = []
        for i in range(len(logits)):
            flogits.append(tf.reshape(logits[i], [-1, num_classes]))
            fgclasses.append(tf.reshape(gclasses[i], [-1]))
            fgscores.append(tf.reshape(gscores[i], [-1]))
            flocalisations.append(tf.reshape(localisations[i], [-1, 4]))
            fglocalisations.append(tf.reshape(glocalisations[i], [-1, 4]))
        # And concat the crap!
        logits = tf.concat(flogits, axis=0)
        gclasses = tf.concat(fgclasses, axis=0)
        gscores = tf.concat(fgscores, axis=0)
        localisations = tf.concat(flocalisations, axis=0)
        glocalisations = tf.concat(fglocalisations, axis=0)
        dtype = logits.dtype

        # Compute positive matching mask and the number of positives.
        pmask = gscores > match_threshold   # True where IoU exceeds match_threshold (0.5 by default)
        fpmask = tf.cast(pmask, dtype)
        n_positives = tf.reduce_sum(fpmask)  # number of positive samples

        # Hard negative mining...
        no_classes = tf.cast(pmask, tf.int32)
        predictions = slim.softmax(logits)
        nmask = tf.logical_and(tf.logical_not(pmask),  # candidate negatives: not positive and gscores > -0.5
                               gscores > -0.5)
        fnmask = tf.cast(nmask, dtype)  # cast the mask to float
        nvalues = tf.where(nmask,      # background probability where nmask is True, 1.0 elsewhere
                           predictions[:, 0],   # class 0 is background
                           1. - fnmask)
        nvalues_flat = tf.reshape(nvalues, [-1])
        # Number of negative entries to select.
        max_neg_entries = tf.cast(tf.reduce_sum(fnmask), tf.int32)  # number of candidate negatives
        n_neg = tf.cast(negative_ratio * n_positives, tf.int32) + batch_size
        n_neg = tf.minimum(n_neg, max_neg_entries)  # number of negatives to keep

        val, idxes = tf.nn.top_k(-nvalues_flat, k=n_neg)  # the n_neg smallest background probabilities and their indices
        max_hard_pred = -val[-1]  # background-probability threshold for hard negatives
        # Final negative mask.
        nmask = tf.logical_and(nmask, nvalues < max_hard_pred)  # candidate negatives whose background probability is below max_hard_pred
        fnmask = tf.cast(nmask, dtype)

        # Add cross-entropy loss.
        with tf.name_scope('cross_entropy_pos'):
            loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits,
                                                                  labels=gclasses)
            loss = tf.div(tf.reduce_sum(loss * fpmask), batch_size, name='value')  # fpmask: 1 for positives, 0 otherwise
            tf.losses.add_loss(loss)

        with tf.name_scope('cross_entropy_neg'):
            loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits,
                                                                  labels=no_classes)
            loss = tf.div(tf.reduce_sum(loss * fnmask), batch_size, name='value')  # fnmask: 1 for selected hard negatives, 0 otherwise
            tf.losses.add_loss(loss)

        # Add localization loss: smooth L1, L2, ...
        with tf.name_scope('localization'):
            # Weights Tensor: positive mask + random negative.
            weights = tf.expand_dims(alpha * fpmask, axis=-1)
            loss = custom_layers.abs_smooth(localisations - glocalisations)
            loss = tf.div(tf.reduce_sum(loss * weights), batch_size, name='value')
            tf.losses.add_loss(loss)
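
To trace the mask and threshold steps of ssd_losses by hand, the following pure-Python sketch mirrors them with ordinary lists instead of tensors. It reuses the variable names from the function above, but `hard_negative_mask` itself is a hypothetical helper, sorting stands in for tf.nn.top_k, and the early return for n_neg == 0 is an addition made so the sketch runs on any input.

```python
def hard_negative_mask(gscores, bg_probs, match_threshold=0.5,
                       negative_ratio=3.0, batch_size=1):
    """List-based mirror of the positive/negative masking in ssd_losses."""
    pmask = [s > match_threshold for s in gscores]
    n_positives = sum(pmask)

    # Candidate negatives: not positive and gscores > -0.5.
    nmask = [(not p) and s > -0.5 for p, s in zip(pmask, gscores)]
    # Background probability for candidates, 1.0 everywhere else.
    nvalues = [prob if n else 1.0 for n, prob in zip(nmask, bg_probs)]

    max_neg_entries = sum(nmask)
    n_neg = min(int(negative_ratio * n_positives) + batch_size, max_neg_entries)
    if n_neg == 0:  # guard added for this sketch only
        return pmask, [False] * len(gscores)

    # top_k on the negated values picks the n_neg smallest background probabilities.
    smallest = sorted(nvalues)[:n_neg]
    max_hard_pred = smallest[-1]

    # Strict comparison, as in the function above.
    final_nmask = [n and v < max_hard_pred for n, v in zip(nmask, nvalues)]
    return pmask, final_nmask


gscores = [0.8, 0.1, 0.0, 0.3, 0.2, 0.05, 0.0]
bg_probs = [0.1, 0.9, 0.2, 0.95, 0.4, 0.6, 0.7]
pmask, nmask = hard_negative_mask(gscores, bg_probs, negative_ratio=1.0)
print(pmask)
print(nmask)
```

In this example n_neg is 2 (one positive times a ratio of 1, plus batch_size), the two smallest background probabilities are 0.2 and 0.4, and the threshold becomes 0.4. Because the comparison is strict, only the box with probability 0.2 ends up in the final mask, so with distinct values the mask selects one box fewer than n_neg.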