Title
Labeled-to-unlabeled distribution alignment for partially-supervised multi-organ medical image segmentation
01
Introduction
Multi-organ medical image segmentation (Mo-MedISeg) is a fundamental yet challenging task in medical image analysis. It assigns a semantic organ label, such as "liver", "kidney", "spleen", or "pancreas", to every image pixel, with any undefined organ region treated as the "background" class (Cerrolaza et al., 2019). With the development of deep image processing techniques such as convolutional neural networks (CNNs) (He et al., 2016) and Vision Transformers (Dosovitskiy et al., 2020; Zhang et al., 2020b, 2024), Mo-MedISeg has attracted considerable research attention and is widely used in practical applications, including diagnostic interventions (Kumar et al., 2019) and treatment planning (Chu et al., 2013), across imaging modalities such as computed tomography (CT) (Gibson et al., 2018) and X-ray images (Gómez et al., 2020). However, training a fully-supervised deep semantic segmentation model is difficult, because it typically requires a large number of pixel-level annotated samples (Wei et al., 2016; Zhang et al., 2020a). This challenge is even more severe for Mo-MedISeg, since acquiring accurate and dense multi-organ annotations is tedious, time-consuming, and dependent on scarce expert knowledge. Consequently, most public benchmark datasets, such as LiTS (Bilic et al., 2023) and KiTS (Heller et al., 2019), provide annotations for only a single organ, with all other task-irrelevant organs labeled as background. Compared with fully-annotated multi-organ datasets, partially-labeled multi-organ datasets are far easier to obtain.
To alleviate the burden of collecting complete annotations, partially-supervised learning (PSL) has been used to learn a Mo-MedISeg model from multiple partially-labeled datasets (Zhou et al., 2019; Zhang et al., 2021a; Liu and Zheng, 2022). In this setting, each dataset provides labels for a single organ class, and together the datasets cover all foreground classes of interest. This strategy removes the need for a densely-annotated dataset and makes it possible to combine datasets from different institutions that annotate different organ types, which is especially useful when different hospitals focus on different organs. A major challenge of PSL for Mo-MedISeg is how to exploit the limited labeled pixels and the abundant unlabeled pixels without complicating the multi-organ segmentation model. Each partially-labeled dataset annotates a specific organ with a binary map indicating whether a pixel belongs to the organ of interest; no labels are provided for the other foreground organs or the background class, so every training image contains both labeled and unlabeled pixels. Previous methods typically learn only from the labeled pixels (Dmitriev and Kaufman, 2019; Zhou et al., 2019; Zhang et al., 2021a).
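To make this setting concrete, the following PyTorch-style sketch (a simplification written for this summary, not code from any of the cited works) computes a cross-entropy loss only over pixels that are annotated in the current partially-labeled dataset; the tensor layout and the `labeled_mask` argument are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def partial_ce_loss(logits, target, labeled_mask):
    """Cross-entropy restricted to pixels that carry a label.

    logits:       (B, C, H, W) raw network outputs over C organ classes.
    target:       (B, H, W)    integer class indices; values on unlabeled
                               pixels are arbitrary and will be ignored.
    labeled_mask: (B, H, W)    1 where the pixel is annotated in the current
                               partially-labeled dataset, 0 otherwise.
    """
    labeled_mask = labeled_mask.float()
    per_pixel = F.cross_entropy(logits, target, reduction="none")  # (B, H, W)
    masked = per_pixel * labeled_mask
    # Average only over labeled pixels so unlabeled pixels do not dilute the loss.
    return masked.sum() / labeled_mask.sum().clamp(min=1.0)
```

Such a loss ignores the unlabeled foreground organs and the background entirely, which is exactly the limitation that motivates learning from unlabeled pixels as well.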
Abstract
Partially-supervised multi-organ medical image segmentation aims to develop a unified semantic segmentation model by utilizing multiple partially-labeled datasets, with each dataset providing labels for a single class of organs. However, the limited availability of labeled foreground organs and the absence of supervision to distinguish unlabeled foreground organs from the background pose a significant challenge, which leads to a distribution mismatch between labeled and unlabeled pixels. Although existing pseudo-labeling methods can be employed to learn from both labeled and unlabeled pixels, they are prone to performance degradation in this task, as they rely on the assumption that labeled and unlabeled pixels have the same distribution. In this paper, to address the problem of distribution mismatch, we propose a labeled-to-unlabeled distribution alignment (LTUDA) framework that aligns feature distributions and enhances discriminative capability. Specifically, we introduce a cross-set data augmentation strategy, which performs region-level mixing between labeled and unlabeled organs to reduce distribution discrepancy and enrich the training set. Besides, we propose a prototype-based distribution alignment method that implicitly reduces intra-class variation and increases the separation between the unlabeled foreground and background. This can be achieved by encouraging consistency between the outputs of two prototype classifiers and a linear classifier. Extensive experimental results on the AbdomenCT-1K dataset and a union of four benchmark datasets (including LiTS, MSD-Spleen, KiTS, and NIH82) demonstrate that our method outperforms the state-of-the-art partially-supervised methods by a considerable margin, and even surpasses fully-supervised methods.
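The cross-set data augmentation described in the abstract performs region-level mixing between labeled and unlabeled organs. A minimal CutMix-style sketch of that idea is given below; the rectangular region sampling, the `ratio` parameter, and the function name are illustrative assumptions rather than the exact augmentation used by LTUDA.

```python
import torch

def cross_set_mix(img_labeled, img_unlabeled, mask_labeled, mask_unlabeled, ratio=0.5):
    """Paste a rectangular region from an unlabeled image into a labeled image,
    mixing their supervision maps accordingly.

    img_*:  (C, H, W) image tensors.
    mask_*: (H, W)    label maps; for the unlabeled image this may be a
                      pseudo-label or an ignore value.
    """
    _, H, W = img_labeled.shape
    h, w = int(H * ratio), int(W * ratio)
    top = torch.randint(0, H - h + 1, (1,)).item()
    left = torch.randint(0, W - w + 1, (1,)).item()

    mixed_img = img_labeled.clone()
    mixed_mask = mask_labeled.clone()
    mixed_img[:, top:top + h, left:left + w] = img_unlabeled[:, top:top + h, left:left + w]
    mixed_mask[top:top + h, left:left + w] = mask_unlabeled[top:top + h, left:left + w]
    return mixed_img, mixed_mask
```

Mixing regions across the two sets lets labeled and unlabeled pixels appear within the same training samples, which is one way to narrow the labeled-to-unlabeled distribution gap the abstract refers to before the prototype-based alignment is applied.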