BASNet: Boundary-Aware Salient Object Detection (Study Notes)


Network Architecture

[Figure: overall BASNet architecture, with the encoder-decoder side outputs sup1~sup8 and the refinement module]

Contributions

BASNet consists of a deeply supervised encoder-decoder and a residual refinement module.

  • A deeply supervised encoder-decoder (sup1~sup8, see the figure above)
  • An additional residual refinement module (RRM)
  • A hybrid loss that mixes three losses: BCE, SSIM (structural similarity) and IoU
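
The refinement module learns a residual on top of the coarse prediction, so the refined map is the coarse map plus a learned correction. A minimal sketch of that idea, where `residual_fn` is only a stand-in for the refinement sub-network (not the real module):

```python
import numpy as np

def refine(coarse, residual_fn):
    """RRM idea: refined saliency map = coarse map + learned residual.
    residual_fn is a placeholder for the refinement sub-network."""
    return np.clip(coarse + residual_fn(coarse), 0.0, 1.0)
```

The clip keeps the output a valid probability map; in the real network this role is played by a sigmoid.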

  • A novel boundary-aware salient object detection network: BASNet, which consists of a deeply supervised encoder-decoder and a residual refinement module,
  • A novel hybrid loss that fuses BCE, SSIM and IoU to supervise the training process of accurate salient object prediction on three levels: pixel-level, patch-level and map-level,
  • A thorough evaluation of the proposed method that includes comparison with 15 state-of-the-art methods on six widely used public datasets. Our method achieves state-of-the-art results in terms of both regional and boundary evaluation measures.

Loss Function

The loss consists of three terms, i.e., a hybrid loss.
The first is the standard binary cross-entropy (BCE) loss, which provides pixel-level supervision.
BCE loss is the most widely used loss in binary classification and segmentation.
$$\ell_{bce} = -\sum_{(r,c)}\Big[G(r,c)\log\big(S(r,c)\big) + \big(1-G(r,c)\big)\log\big(1-S(r,c)\big)\Big]$$

where G(r,c) ∈ {0,1} is the ground-truth label of pixel (r,c) and S(r,c) is the predicted saliency probability.
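
As a concrete reference, a minimal NumPy version of pixel-level BCE over a saliency map (averaged over pixels here, whereas the paper sums over pixels; clipping is added to avoid log(0)):

```python
import numpy as np

def bce_loss(S, G, eps=1e-7):
    """Pixel-level BCE: S is the predicted saliency map in (0,1),
    G is the binary ground truth; both are HxW arrays."""
    S = np.clip(S, eps, 1 - eps)  # numerical safety for log
    return -np.mean(G * np.log(S) + (1 - G) * np.log(1 - S))
```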
The second is the SSIM (structural similarity) loss, which is computed over local patches and then averaged over the map.
The structural similarity loss in the paper is expressed as:
$$\ell_{ssim} = 1 - \frac{(2\mu_x\mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}$$

where x and y are two corresponding patches cropped from the prediction S and the ground truth G, μ and σ denote their means, standard deviations and covariance, and C1, C2 are small constants that avoid division by zero.

SSIM loss is a patch-level measure, which considers a local neighborhood of each pixel. It assigns higher weights to the boundary, i.e., the loss is higher around the boundary, even when the predicted probabilities on the boundary and the rest of the foreground are the same. In the beginning of training, the loss along the boundary is the largest (see second row of Fig. 5). It helps the optimization to focus on the boundary. As the training progresses, the SSIM loss of the foreground reduces and the background loss becomes the dominant term. However, the background loss does not contribute to the training until the prediction of background pixels becomes very close to the ground truth, where the loss drops rapidly from one to zero. This is helpful since the prediction typically goes close to zero only late in the training process, where BCE loss becomes flat. The SSIM loss ensures that there is still enough gradient to drive the learning process. The background prediction looks cleaner since the probability is pushed to zero.

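A single-patch sketch of the SSIM loss. The paper applies it over local windows of the map; the constants C1 and C2 below follow the common SSIM defaults, which is an assumption on my part:

```python
import numpy as np

def ssim_loss(x, y, C1=0.01**2, C2=0.03**2):
    """Patch-level SSIM loss for two corresponding patches x, y
    (predicted and ground-truth crops). Returns 1 - SSIM(x, y)."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    ssim = ((2 * mx * my + C1) * (2 * cov + C2)) / \
           ((mx**2 + my**2 + C1) * (vx + vy + C2))
    return 1 - ssim
```

For identical patches SSIM is 1, so the loss is 0; the loss grows as the local structure of prediction and ground truth diverge.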

The third is the IoU loss.

The IoU loss used in the paper is:
$$\ell_{iou} = 1 - \frac{\sum_{(r,c)} S(r,c)\,G(r,c)}{\sum_{(r,c)} \big[S(r,c) + G(r,c) - S(r,c)\,G(r,c)\big]}$$
The values in S are saliency probabilities, while G contains only 0/1 labels.
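
A minimal sketch of this map-level IoU loss; the full hybrid loss of the paper sums BCE, SSIM and IoU, applied at each of the eight side outputs (sup1~sup8):

```python
import numpy as np

def iou_loss(S, G):
    """Map-level IoU loss: S holds saliency probabilities,
    G holds binary ground-truth labels (same HxW shape)."""
    inter = (S * G).sum()
    union = (S + G - S * G).sum()  # soft union of the two maps
    return 1 - inter / union
```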

Implementation Details

We train our network using the DUTS-TR dataset, which has 10553 images. Before training, the dataset is augmented by horizontal flipping to 21106 images. During training, each image is first resized to 256×256 and randomly cropped to 224×224. Part of the encoder parameters are initialized from the ResNet-34 model [16]. Other convolutional layers are initialized by Xavier [10].
We utilize the Adam optimizer [26] to train our network and its hyperparameters are set to the default values, where the initial learning rate lr=1e-3, betas=(0.9, 0.999), eps=1e-8, weight_decay=0. We train the network until the loss converges, without using a validation set. The training loss converges after 400k iterations with a batch size of 8 and the whole training process takes about 125 hours. During testing, the input image is resized to 256×256 and fed into the network to obtain its saliency map.
Then, the saliency map (256×256) is resized back to the original size of the input image. Both resizing processes use bilinear interpolation.
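
The reported setup can be condensed into a configuration sketch; the dict and its field names are my own summary, not an official config format:

```python
# Training setup as reported in the paper (summary sketch).
train_config = {
    "train_set": "DUTS-TR",       # 10553 images, 21106 after horizontal flipping
    "resize": (256, 256),         # resize first ...
    "crop": (224, 224),           # ... then random crop
    "encoder_init": "ResNet-34",  # remaining conv layers use Xavier init
    "optimizer": "Adam",
    "lr": 1e-3,
    "betas": (0.9, 0.999),
    "eps": 1e-8,
    "weight_decay": 0,
    "batch_size": 8,
    "iterations": 400_000,        # ~125 hours of training in total
}
```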

relaxF here is the metric the paper uses to evaluate boundary quality. The predicted map is first binarized with a threshold of 0.5; the binary map is then dilated, and XOR-ing the dilated map with the original binary map yields the boundary map. The ground-truth boundary map is obtained in the same way, and the F-measure is then computed on these boundaries. Note that the precision and recall used in this F-measure, p=relaxPrecision and r=relaxRecall, are defined in a relaxed way: relaxPrecision is the fraction of predicted boundary pixels that lie within ρ pixels of a ground-truth boundary pixel, and relaxRecall is the fraction of ground-truth boundary pixels that lie within ρ pixels of a predicted boundary pixel. The experiments set ρ=3.
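
A small NumPy sketch of this relaxed boundary F-measure: boundaries via XOR with a one-pixel (cross-shaped) dilation, and the ρ-neighborhood test done with Chebyshev distance, which is an assumption of this sketch rather than the paper's exact choice:

```python
import numpy as np

def boundary(mask):
    """Boundary map: XOR of a binary mask with its one-pixel
    cross-shaped dilation, as described above."""
    d = mask.copy()
    d[1:, :] |= mask[:-1, :]
    d[:-1, :] |= mask[1:, :]
    d[:, 1:] |= mask[:, :-1]
    d[:, :-1] |= mask[:, 1:]
    return d ^ mask

def near_fraction(a, b, rho):
    """Fraction of True pixels in boundary map a lying within rho
    pixels (Chebyshev distance, an assumption) of a True pixel in b."""
    ya, xa = np.nonzero(a)
    yb, xb = np.nonzero(b)
    if len(ya) == 0 or len(yb) == 0:
        return 0.0
    dist = np.maximum(np.abs(ya[:, None] - yb[None, :]),
                      np.abs(xa[:, None] - xb[None, :]))
    return float((dist.min(axis=1) <= rho).mean())

def relax_f_measure(pred, gt, rho=3, thresh=0.5):
    """F-measure over boundaries with relaxed precision and recall."""
    pb = boundary(pred >= thresh)
    gb = boundary(gt >= 0.5)
    p = near_fraction(pb, gb, rho)  # relaxPrecision
    r = near_fraction(gb, pb, rho)  # relaxRecall
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0
```

On real benchmarks this pairwise-distance version is slow; implementations typically precompute a distance transform instead, but the definition is the same.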
[Table] Comparison with 15 other methods on six datasets in terms of max F-measure maxFβ (higher is better), relaxed boundary F-measure relaxFbβ (higher is better) and MAE (lower is better). Red, green and blue indicate the best, second-best and third-best performance. "+" means the result is obtained with CRF post-processing. "DT", "MK" and "MB" denote the training datasets DUTS-TR, MSRA10K and MSRA-B, respectively. "M30K", used by C2S, is an extended dataset of MSRA10K.
