Paper: Learning to Segment Object Candidates


deardao


Categories: deep learning, machine learning, detection, papers, artificial intelligence. Tags: deep learning papers, detecting box

Article link: https://blog.csdn.net/liangdaojun/article/details/73330523


9:40-11:35

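This is a 2015 paper and one of the earlier works on pixel-level image segmentation. But isn't it overkill to use the segmented pixels only as region proposals? Training needs pixel-level annotations for supervision, and those are much harder to produce than ground-truth boxes. On the other hand, if the output is only used for box proposals, the segmentation does not have to be very precise, because the detection network will regress the proposal boxes at the end anyway.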

The paper gives a network architecture diagram (image not available here).

Network architecture:

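As in most detection work, the architecture is based on VGG. The input is 3x224x224 (in the paper's notation). The last pooling layer of VGG-A is removed, so the output is 512x14x14, which feeds two branches:
1. Segmentation branch: conv layers reduce the features to 512x1x1, which is then fully connected to a 56x56 output. Values above a threshold of 0.x (chosen per dataset) are set to 1, and the result is bilinearly upsampled to the original image size to produce the mask.

This is not like a deconv layer. It acts as h×w pixel classifiers, i.e. a set of per-pixel classification problems, each deciding whether a pixel belongs to the object. The paper notes that one can "use either locally or fully connected pixel classifiers": the local one only sees local information and is essentially a 1x1 conv layer, while the fully connected one has a large number of redundant parameters. The paper also sets "the output of the classification layer to be h'×w' with h' < h and w' < w and upsample the output to h × w to match the input dimensions", which is the bilinear upsampling mentioned above.

2. Scoring branch: a 2x2 max pooling for downsampling, followed by two fc (fully connected) layers, and finally a single real-valued output ("output is a single 'objectness' score").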
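To make the shapes above concrete, here is a small NumPy sketch. It only traces the tensor sizes of the two branches and shows one way to bilinearly upsample a 56x56 mask to 224x224 and threshold it. The function names, the pixel-center alignment convention, and the threshold value 0.5 are my own assumptions for illustration; the source only gives the threshold as "0.x", and this is not the authors' code.

```python
import numpy as np

def trace_shapes(input_hw=224, channels=512, num_pools=4, mask_hw=56):
    """Return the tensor shapes described in the notes (bookkeeping only)."""
    stride = 2 ** num_pools            # 4 poolings remain after dropping the last one
    feat = input_hw // stride          # 224 // 16 = 14
    return {
        "input": (3, input_hw, input_hw),
        "trunk": (channels, feat, feat),
        "segm_low_res": (mask_hw, mask_hw),
        "segm_upsampled": (input_hw, input_hw),
        "score_pooled": (channels, feat // 2, feat // 2),  # after 2x2 max pool
        "score": (1,),
    }

def bilinear_upsample(mask, out_h, out_w):
    """Bilinear resize of a 2-D array (pixel-center alignment, an assumption)."""
    in_h, in_w = mask.shape
    ys = np.clip((np.arange(out_h) + 0.5) * in_h / out_h - 0.5, 0, in_h - 1)
    xs = np.clip((np.arange(out_w) + 0.5) * in_w / out_w - 0.5, 0, in_w - 1)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None]            # vertical interpolation weights
    wx = (xs - x0)[None, :]            # horizontal interpolation weights
    top = mask[y0][:, x0] * (1 - wx) + mask[y0][:, x1] * wx
    bot = mask[y1][:, x0] * (1 - wx) + mask[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

def to_binary_mask(prob_56, out_hw=224, threshold=0.5):
    """Upsample a low-res probability map and threshold it (0.5 is a placeholder)."""
    return bilinear_upsample(prob_56, out_hw, out_hw) > threshold

shapes = trace_shapes()
demo_mask = to_binary_mask(np.random.default_rng(0).random((56, 56)))
```

Here the map is upsampled first and thresholded afterwards; doing it in the other order would give a blockier mask, and the notes do not say which order the authors use.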


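In the paper's loss function, lambda = 1/32, and the paper does not discuss this choice any further, even though the overall pixel loss is sensitive to this parameter. When the score label is a negative example, the segmentation loss is 0.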
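The loss described above can be sketched as follows. This is my reading of the paper's formula: a per-pixel logistic loss averaged over the mask and gated by (1 + y) / 2, plus lambda times a logistic loss on the score. The ±1 label encoding and the function name `deepmask_loss` are assumptions for illustration, not the authors' code.

```python
import numpy as np

def deepmask_loss(segm_logits, mask_target, score_logit, label, lam=1.0 / 32):
    """Joint loss for one training sample.

    segm_logits : (h, w) raw outputs of the segmentation branch
    mask_target : (h, w) ground-truth mask with values in {-1, +1} (assumed encoding)
    score_logit : raw output of the scoring branch
    label       : +1 for a positive sample, -1 for a negative one
    """
    # log(1 + exp(-m * f)) per pixel, computed stably and averaged over the mask
    segm_term = np.mean(np.logaddexp(0.0, -mask_target * segm_logits))
    # log(1 + exp(-y * f_score)) for the objectness score
    score_term = np.logaddexp(0.0, -label * score_logit)
    # (1 + label) / 2 is 1 for positives and 0 for negatives,
    # so negative samples contribute no segmentation loss
    return (1 + label) / 2 * segm_term + lam * score_term

rng = np.random.default_rng(0)
logits = rng.normal(size=(56, 56))
target = np.where(rng.random((56, 56)) > 0.5, 1.0, -1.0)
loss_pos = deepmask_loss(logits, target, score_logit=0.7, label=+1)
loss_neg = deepmask_loss(logits, target, score_logit=0.7, label=-1)
```

With lambda = 1/32 the score term is weighted far below the segmentation term for positive samples, which is presumably why the total is sensitive to this value.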


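"During full image inference, we apply the model densely at multiple locations and scales." In other words, at inference time the model is run once per location (224/16 = 14), producing masks and boxes at multiple scales and translation shifts. The drawback is that this is not single shot, so it is time-consuming.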
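To get a feel for why this is time-consuming, the sketch below simply enumerates the 224x224 windows a naive dense scan would visit. The stride of 16 matches the 224/16 figure above; the scale set and the function name `dense_windows` are hypothetical, and a real implementation would more likely run the network convolutionally over the whole image than crop each window.

```python
def dense_windows(img_h, img_w, window=224, stride=16, scales=(0.5, 1.0, 2.0)):
    """Yield (scale, top, left) for every window position in each rescaled image."""
    for s in scales:
        h, w = int(round(img_h * s)), int(round(img_w * s))
        if h < window or w < window:
            continue  # the rescaled image is smaller than one window
        for top in range(0, h - window + 1, stride):
            for left in range(0, w - window + 1, stride):
                yield s, top, left

# Count the model evaluations needed for a 448x448 image at each scale.
counts = {}
for scale, _, _ in dense_windows(448, 448):
    counts[scale] = counts.get(scale, 0) + 1
total_windows = sum(counts.values())
```

Even for this small image, a naive scan means a couple of thousand forward passes, which is the cost the notes are pointing at.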
