[Few-Shot Segmentation Paper Reading Notes] Part-aware Prototype Network for Few-shot Semantic Segmentation, ECCV 2020

RaymondLove~


Article link: https://blog.csdn.net/Emma_Love/article/details/112490533


Abstract

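Problem:

Existing few-shot segmentation methods have the following shortcomings:

They handle only a restricted setting, one-way few-shot segmentation, and are hard to extend to the multi-way case.
A single prototype has limited representational power and cannot cover all regions of an object.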

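Goal:

To address these problems, the paper introduces a semi-supervised framework, casting the task as semi-supervised few-shot semantic segmentation, and enriches the prototype representation of each semantic class in two ways:

Decompose the holistic class prototype representation into a set of part-aware prototypes that capture diverse and fine-grained object features, so that object regions are covered and represented better.
Use a large number of unlabeled images to supplement the support set, extracting prototypes from both unlabeled and labeled images to enrich their representational power.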

Method:

To this end, the paper proposes the Part-aware Prototype Network (PPNet), which consists of three parts:

An embedding network: extracts feature maps for the support set (unlabeled + labeled images) and the query set.
A prototype generation network: generates a set of discriminative part-aware prototypes for each class.
A part-aware mask generation network: produces the semantic mask prediction for a query image.

Results:

PPNet outperforms existing methods on two datasets, PASCAL-5i and COCO-20i.
The algorithm applies to both one-way and multi-way few-shot segmentation.

Code: https://github.com/Xiangyi1996/PPNet-PyTorch

Comments:

Strengths:

This is the first paper to leverage unlabeled data in the few-shot segmentation task.
It proposes a flexible, prototype-based few-shot semantic segmentation algorithm that achieves better results in both the one-way and multi-way settings.
It proposes a part-aware prototype representation for each semantic class, which extracts finer-grained features for segmentation.
To capture intra-class variation, it uses unlabeled data in a semi-supervised manner and computes the prototypes with a GNN.

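Weaknesses:

Although the authors claim that their algorithm outperforms earlier ones, its results on one-way one-shot segmentation are actually not good, and the authors give no explanation for this.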

1. Problem Setting

The paper adopts a meta-learning strategy. Let $M$ denote the meta-learner and let $\mathcal{T}=\{T\}$ be a family of few-shot segmentation tasks, where each task $T$ is sampled from an underlying task distribution $P_T$.

Each task $T$ (also called an episode) has a dataset made up of a support set and a query set.

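Support set: in this paper it has two parts, labeled and unlabeled data, written $S=\left \{ S^l, S^u \right \}$.
For a C-way K-shot problem, each task involves C classes with K labeled samples per class.
- Labeled set:
- Unlabeled set:
Query set: denoted $Q$
Note:
- Images in Q also come from the class set $C_T$.
- Ground-truth labels for Q are available during training but not at test time.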
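
The episode structure described above can be made concrete with a small sampler. This is only a sketch under stated assumptions: `sample_episode`, its arguments, and the dictionary keys are hypothetical names chosen for illustration, images are stood in for by string identifiers, and the notes do not say how many unlabeled or query images an episode uses.

```python
import random

def sample_episode(images_by_class, unlabeled_pool, n_way, k_shot,
                   n_unlabeled, n_query, seed=None):
    """Draw one C-way K-shot episode: labeled support, unlabeled support, query."""
    rng = random.Random(seed)
    # Pick the C classes that define this task (the class set C_T).
    classes = rng.sample(sorted(images_by_class), n_way)
    labeled_support, query = {}, []
    for c in classes:
        # K labeled support images plus n_query held-out query images of class c.
        picked = rng.sample(images_by_class[c], k_shot + n_query)
        labeled_support[c] = picked[:k_shot]
        query.extend((img, c) for img in picked[k_shot:])
    # Unlabeled support images come without any mask annotation.
    unlabeled_support = rng.sample(unlabeled_pool, n_unlabeled)
    return {"classes": classes, "S_l": labeled_support,
            "S_u": unlabeled_support, "Q": query}

# Toy data: three classes with ten image ids each, plus an unlabeled pool.
data = {c: [f"{c}_{i}" for i in range(10)] for c in ["cat", "dog", "bus"]}
pool = [f"u{i}" for i in range(20)]
episode = sample_episode(data, pool, n_way=2, k_shot=1,
                         n_unlabeled=6, n_query=1, seed=0)
```

Sampling support and query images in one draw per class keeps the two sets disjoint within an episode.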

Training and test sets:

Note: $C^{tr}$ and $C^{te}$ are disjoint!

2. The proposed methods

Main Idea: capture the intra-class variation and fine-grained features of semantic classes with a set of part-aware prototypes for each class, and additionally use unlabeled data to enrich their representations.

Model components: three networks plus a semantic branch:

Embedding network: extracts feature maps for the support and query images.
Prototype generation network: extracts a set of part-aware prototypes from the labeled and unlabeled support images.
- Modules: part generation module + part refinement module
Part-aware mask generation network: generates the final semantic prediction for the query images.
Semantic branch: generates mask predictions over the global semantic class space $C^{tr}$.

2.1 Embedding network, denoted $f_{em}$

Purpose: compute feature maps.

Architecture: following prior work [36, 35], a ResNet [12] with dilated convolutions, which enlarge the receptive field and preserve spatial details.

Computation:

2.2 Prototypes Generation network

Purpose: generate a set of part-aware prototypes for each class.

Input: the labeled support features and $F^u$, where $k$ denotes the current class

Output: a set of part-aware prototypes $P_k$

Modules:

Part generation module: generates a set of initial part-aware prototypes from the labeled data and augments them with the global context of the semantic class.

Part refinement with unlabeled data: brings in the unlabeled support images to enrich the prototypes so that they better capture the intra-class variations of each semantic class.
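
To illustrate the difference between a single holistic prototype and a set of part prototypes, here is a minimal sketch. It rests on assumptions: these notes do not state how the initial parts are obtained, so the sketch uses plain k-means over the foreground feature vectors; `holistic_prototype` and `part_prototypes` are hypothetical names; and the global-context augmentation and the refinement with unlabeled data are left out entirely.

```python
def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def holistic_prototype(features, mask):
    """Single class prototype: average of all foreground feature vectors."""
    return mean([f for f, m in zip(features, mask) if m == 1])

def part_prototypes(features, mask, n_parts, n_iters=10):
    """Assumed part generation: k-means centers of the foreground features."""
    fg = [f for f, m in zip(features, mask) if m == 1]
    # Deterministic initialisation: the first n_parts foreground vectors.
    centers = [list(f) for f in fg[:n_parts]]
    for _ in range(n_iters):
        groups = [[] for _ in centers]
        for f in fg:
            nearest = min(range(len(centers)),
                          key=lambda i: sq_dist(f, centers[i]))
            groups[nearest].append(f)
        # Keep the old center when a cluster ends up empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers

# Toy feature map flattened to 5 positions; the last position is background.
features = [[0, 0], [0, 2], [10, 10], [10, 12], [5, 5]]
mask = [1, 1, 1, 1, 0]
single = holistic_prototype(features, mask)
parts = part_prototypes(features, mask, n_parts=2)
```

On this toy input the holistic prototype lands between the two foreground groups, while the two part prototypes sit on the groups themselves, which is the intuition behind covering object regions with several prototypes.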

2.3 Part-aware mask generation network

2.4 Model Training with Semantic Regularization

3. Experiment

Datasets: PASCAL-5i [3] + COCO-20i [33, 20]

3.1 Experimental Configuration

Network:

ResNet [12] pretrained on ILSVRC [25] as the feature extractor
- Input images are resized to [417, 417]
- Random horizontal flipping is used for data augmentation
- Part-aware prototype network: $N_p=5$, $N_r=100$, $\sigma=0$, $\lambda_p=0.8$, $\lambda_r=0.2$

Training Setting:

SGD, initial learning rate = 5e-4, weight decay = 1e-4, momentum = 0.9, maximum number of iterations = 24K

The learning rate is decayed by a factor of 10 at 10K and at 20K iterations.

Weight of $L_{sem}$ = 0.5
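
The step schedule listed above can be written as a small function. This is a sketch, assuming the decay takes effect from the milestone iteration itself (the notes do not say whether the boundary is inclusive); `learning_rate` is a hypothetical helper, not code from the PPNet repository.

```python
def learning_rate(iteration, base_lr=5e-4, milestones=(10_000, 20_000), gamma=0.1):
    """Step schedule: multiply the rate by gamma for every milestone already reached."""
    passed = sum(1 for m in milestones if iteration >= m)
    return base_lr * gamma ** passed

# Rates at the start, after the first decay, and near the end of the 24K iterations.
schedule = [learning_rate(it) for it in (0, 10_000, 23_999)]
```

Under this assumption the rate is 5e-4 up to 10K iterations, about 5e-5 until 20K, and about 5e-6 for the remaining iterations.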

Baseline & Evaluation Metrics:

PANet [33] as the baseline method

Compared methods: [3, 37, 22, 33, 35], [27, 20, 36], [29]

Evaluation metrics: mean-IoU (the focus of this paper), binary-IoU
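
The two metrics differ in what counts as a class. The sketch below assumes the definitions commonly used in the few-shot segmentation literature: mean-IoU averages the IoU of each foreground class, while binary-IoU merges all object classes into one foreground and averages the foreground and background IoU. The function names are hypothetical, label 0 is assumed to be background, and the scores are computed on a single flattened prediction rather than accumulated over many episodes as a full evaluation would do.

```python
def iou(pred, target, label):
    """IoU of one label over two flat label sequences; None if the label never occurs."""
    inter = sum(1 for p, t in zip(pred, target) if p == label and t == label)
    union = sum(1 for p, t in zip(pred, target) if p == label or t == label)
    return inter / union if union else None

def mean_iou(pred, target, fg_classes):
    """Average the per-class IoU over the foreground classes that occur."""
    scores = [s for s in (iou(pred, target, c) for c in fg_classes) if s is not None]
    return sum(scores) / len(scores)

def binary_iou(pred, target):
    """Merge all foreground classes into label 1, then average foreground and background IoU."""
    p = [int(x != 0) for x in pred]
    t = [int(x != 0) for x in target]
    scores = [s for s in (iou(p, t, 1), iou(p, t, 0)) if s is not None]
    return sum(scores) / len(scores)

# Toy 2-way example: 0 is background, 1 and 2 are the two episode classes.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
m_iou = mean_iou(pred, target, fg_classes=[1, 2])
b_iou = binary_iou(pred, target)
```

Because binary-IoU ignores which object class a pixel belongs to, it cannot penalise confusion between classes, which is one reason a multi-way comparison would favour mean-IoU.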

3.2 Experiments on PASCAL-5i [11] + [3, 37]

There are 20 classes in total, split into 4 folds of 5 classes each.
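
As a sketch of how such a split keeps training and test classes disjoint, the helper below assumes the conventional layout in which fold i holds five consecutive class indices (1-based). `pascal5i_split` is a hypothetical name, and the notes themselves do not list which classes fall into which fold.

```python
def pascal5i_split(fold, n_classes=20, n_folds=4):
    """Return (train_classes, test_classes) for one fold, using 1-based class ids."""
    per_fold = n_classes // n_folds
    test = list(range(fold * per_fold + 1, (fold + 1) * per_fold + 1))
    train = [c for c in range(1, n_classes + 1) if c not in test]
    return train, test

# Fold 1 holds out classes 6 to 10 and trains on the remaining 15.
train_classes, test_classes = pascal5i_split(1)
```

Evaluating on each of the four folds in turn gives the per-fold scores from which a mean can be reported.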

Quantitative results: Table 1 and Table 2

Table 1: 1-way 1-shot & 1-way 5-shot
Table 2: multi-way setting (2-way 1-shot and 2-way 5-shot)

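Qualitative results: Fig. 3 (1-way 1-shot setting)

3.3 Experiments on COCO-20i [33, 20]

The classes are split into 4 folds of 20 classes each.

The paper uses two splits:

Split-A, following [33]; this is the one the paper focuses on
Split-B, following [20]

The model is trained on three folds and evaluated on the remaining fold, in a cross-validation fashion.

Quantitative results: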

3.4 Ablation Study

The ablation uses split-A of COCO-20i in the 1-way 1-shot setting.

Part-aware prototypes (PAP): shows that global semantics are important for part-level representation.
Semantic branch (SEM): can improve the convergence and the final performance significantly.
Unlabeled data (UD): refining the prototypes with unlabeled data through the GNN is useful.
Hyper-parameters: $N_p=5$, $N_u=6$, $\beta=0.5$

4. Introduction & Related Work

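Deep-learning-based semantic segmentation usually relies on large amounts of annotated data, which is time-consuming and labor-intensive to obtain. Common remedies: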

Weak supervision [15], 2017, ICLR: Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. In: International Conference on Learning Representations (ICLR) (2017)

Few-shot semantic segmentation has recently attracted wide attention:

Matching-based methods:

Related methods:
- [21], 2018: Rakelly, K., Shelhamer, E., Darrell, T., Efros, A.A., Levine, S.: Few-shot segmentation propagation with guided networks. arXiv preprint (2018)
- [37], 2018, SG-One: Zhang, X., Wei, Y., Yang, Y., Huang, T.: SG-One: Similarity guidance network for one-shot semantic segmentation. arXiv preprint (2018)
- [36], 2019, CVPR: Zhang, C., Lin, G., Liu, F., Yao, R., Shen, C.: CANet: Class-agnostic segmentation networks with iterative refinement and attentive few-shot learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
- [35], 2019, ICCV: Zhang, C., Lin, G., Liu, F., Guo, J., Wu, Q., Yao, R.: Pyramid graph networks with connection attentions for region-based one-shot semantic segmentation. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2019)
- [20], 2019, ICCV: Nguyen, K., Todorovic, S.: Feature weighting and boosting for few-shot segmentation. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2019)
- [3], 2017, BMVC: Shaban, A., Bansal, S., Liu, Z., Essa, I., Boots, B.: One-shot learning for semantic segmentation. In: British Machine Vision Conference (BMVC) (2017)
- [22], 2018: Rakelly, K., Shelhamer, E., Darrell, T., Efros, A., Levine, S.: Conditional networks for few-shot semantic segmentation (2018)
Drawback of the above methods: they only focus on one-way few-shot segmentation and are computationally expensive to generalize to the multi-way setting.

Prototype-based methods: conduct pixel-wise matching on query images with holistic prototypes of semantic classes.

Related methods:
- [7], 2018, BMVC: Dong, N., Xing, E.: Few-shot semantic segmentation with prototype learning. In: British Machine Vision Conference (BMVC) (2018)
- [27], 2019: Siam, M., Oreshkin, B.: Adaptive masked weight imprinting for few-shot segmentation. arXiv preprint (2019)
- [33], 2019: Wang, K., Liew, J.H., Zou, Y., Zhou, D., Feng, J.: PANet: Few-shot image semantic segmentation with prototype alignment. arXiv preprint (2019)
Drawback of the above methods: they use only a single holistic representation, which struggles to cope with the diverse appearance of objects with different parts, poses, and subcategories.

Optimization-based methods:
- Related methods:
  - [29], 2019: Tian, P., Wu, Z., Qi, L., Wang, L., Shi, Y., Gao, Y.: Differentiable meta-learning model for few-shot semantic segmentation. arXiv preprint (2019)

The first two categories (matching-based and prototype-based methods) share a common drawback: they extract information only from a small support set, which limits their ability to capture rich and fine-grained feature variations.

Other related work:

Few-Shot Classification
- Metric learning based methods: [34], 2018, AAAI; [28], 2017, NIPS; [32], 2016, NIPS
- Optimization learning based methods: [23], 2016; [8], 2017
- Graph-neural network based methods: [9], 2017; [8], 2017
- Methods that introduce semi-supervised learning:
  - [24], 2018
  - [10], 2019, NIPS
  - [1], 2019
Graph Neural Networks
- [10], [26],
- [14]: semi-supervised + GNN
- [31]: GNN + attention
- [9]: GNN + few-shot image classification

Conclusion

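The paper proposes PPNet, the first model to introduce a semi-supervised framework into few-shot segmentation. It uses unlabeled data to capture the intra-class variation of the prototypes and, combined with a GNN, produces multiple part-aware prototypes for each class, achieving better results.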
