VISION-MAE: A Foundation Model for Medical Image Segmentation and Classification
Abstract
Artificial intelligence (AI) has the potential to revolutionize diagnosis and segmentation in medical imaging. However, its development and clinical implementation face multiple challenges, including limited data availability, lack of generalizability, and the need to incorporate multimodal data effectively. A foundation model, a large-scale pre-trained AI model, offers a versatile base that can be adapted to a variety of specific tasks and contexts. Here, we present VISION-MAE, a novel foundation model designed specifically for medical imaging. VISION-MAE is trained on a dataset of 2.5 million unlabeled images spanning multiple modalities (CT, MR, PET, X-ray, and ultrasound) using self-supervised learning, and is then adapted to classification and segmentation tasks using explicit labels. VISION-MAE has high label efficiency, outperforming several benchmark models in both in-domain and out-of-domain applications and achieving high performance even with reduced availability of labeled data. This model represents a significant advancement in medical imaging AI, offering a generalizable and robust solution that improves segmentation and classification while reducing the data annotation workload.
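As a rough illustration of the masked-autoencoder (MAE) pretraining strategy the model's name implies, the sketch below masks a large fraction of image patches, encodes only the visible ones, and reconstructs the pixels of the masked ones. This is a minimal toy version under stated assumptions: the TinyMAE module, its dimensions, the single-channel input, and the 75% mask ratio are illustrative choices, not the authors' actual architecture or training code.

```python
# Minimal sketch of MAE-style self-supervised pretraining (illustrative only;
# not the VISION-MAE implementation). Assumes single-channel images, e.g.
# unlabeled CT/MR slices.
import torch
import torch.nn as nn

class TinyMAE(nn.Module):
    """Toy masked autoencoder: mask patches, encode the visible ones,
    reconstruct pixel values of the masked ones."""
    def __init__(self, img_size=224, patch=16, dim=256, mask_ratio=0.75):
        super().__init__()
        self.patch = patch
        self.mask_ratio = mask_ratio
        self.n_patches = (img_size // patch) ** 2
        self.embed = nn.Linear(patch * patch, dim)        # grayscale patches -> tokens
        self.pos = nn.Parameter(torch.zeros(1, self.n_patches, dim))
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True), num_layers=4)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.decoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True), num_layers=2)
        self.head = nn.Linear(dim, patch * patch)         # reconstruct raw pixels

    def forward(self, x):
        B = x.size(0)
        # (B, 1, H, W) -> (B, N, patch*patch) non-overlapping patches
        patches = x.unfold(2, self.patch, self.patch).unfold(3, self.patch, self.patch)
        patches = patches.reshape(B, -1, self.patch * self.patch)
        tokens = self.embed(patches) + self.pos
        # Random masking: keep only a (1 - mask_ratio) subset of patches.
        n_keep = int(self.n_patches * (1 - self.mask_ratio))
        perm = torch.rand(B, self.n_patches, device=x.device).argsort(dim=1)
        keep = perm[:, :n_keep]
        idx = keep.unsqueeze(-1).expand(-1, -1, tokens.size(-1))
        visible = torch.gather(tokens, 1, idx)
        latent = self.encoder(visible)                    # encode visible patches only
        # Scatter encoded tokens back; fill masked slots with the mask token.
        full = self.mask_token.expand(B, self.n_patches, -1).clone()
        full.scatter_(1, idx, latent)
        recon = self.head(self.decoder(full + self.pos))
        # Reconstruction loss is computed on masked patches only, as in MAE.
        mask = torch.ones(B, self.n_patches, device=x.device)
        mask.scatter_(1, keep, 0.0)
        loss = ((recon - patches) ** 2).mean(-1)
        return (loss * mask).sum() / mask.sum()

model = TinyMAE()
imgs = torch.randn(2, 1, 224, 224)   # stand-in for unlabeled medical images
loss = model(imgs)
loss.backward()
```

After pretraining in this fashion, the encoder weights would be retained and a task-specific head (classifier or segmentation decoder) trained on the labeled data, which is what makes the approach label-efficient: the expensive representation learning happens without annotations.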
Introduction
Recent advances in the field of artificial intelligence (AI) have led to the development of foundation models: machine learning models trained on large-scale, diverse datasets that can be adapted to a wide range of downstream tasks. Whereas traditional deep learning models are trained for specific applications (such as classification of interstitial lung disease or COVID-19) and perform poorly when repurposed for other tasks, foundation models offer more general and adaptable capabilities.