Reproducing the Paper's Code: the "Real" AutoSAM (Tal Shaharbany version)

Paper title: AutoSAM: Adapting SAM to Medical Images by Overloading the Prompt Encoder
Authors: Tal Shaharbany (Tel Aviv University)*, Aviad Dahan (Tel Aviv University), Raja Giryes (Tel Aviv University), Lior Wolf (Tel Aviv University, Israel)
Abstract: The recently introduced Segment Anything Model (SAM) combines a clever architecture and large quantities of training data to obtain remarkable image segmentation capabilities. However, it fails to reproduce such results for Out-Of-Distribution (OOD) domains such as medical images. Moreover, while SAM is conditioned on either a mask or a set of points, it may be desirable to have a fully automatic solution. In this work, we replace SAM’s conditioning with an encoder that operates on the same input image. By adding this encoder and without further fine-tuning SAM, we obtain state-of-the-art results on multiple medical images and video benchmarks. This new encoder is trained via gradients provided by a frozen SAM. For inspecting the knowledge within it, and providing a lightweight segmentation solution, we also learn to decode it into a mask by a shallow deconvolution network. Our code is publicly available at
https://github.com/talshaharabany/AutoSAM
Code: https://github.com/talshaharabany/AutoSAM
Video: https://bmvc2022.mpi-inf.mpg.de/BMVC2023/0530_video.mp4
Poster: https://bmvc2022.mpi-inf.mpg.de/BMVC2023/0530_poster.pdf
Conference PDF: https://papers.bmvc2023.org/0530.pdf
arXiv PDF: http://arxiv.org/abs/2306.06370v1
Conference link: http://bmvc2022.mpi-inf.mpg.de/BMVC/ (British Machine Vision Conference)
Conference GitHub entry: https://britishmachinevisionassociation.github.io/
Conference description: The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
Citation:
@inproceedings{Shaharbany_2023_BMVC,
  author    = {Tal Shaharbany and Aviad Dahan and Raja Giryes and Lior Wolf},
  title     = {AutoSAM: Adapting SAM to Medical Images by Overloading the Prompt Encoder},
  booktitle = {34th British Machine Vision Conference 2023, {BMVC} 2023, Aberdeen, UK, November 20-24, 2023},
  publisher = {BMVA},
  year      = {2023},
  url       = {https://papers.bmvc2023.org/0530.pdf}
}

This is the SAM fine-tuning approach I am most optimistic about, so let's try to reproduce it:

1. Download the SAM checkpoints.

Three files need to be downloaded from Google Drive (which may require a proxy to access): SAM_base, SAM_large, SAM_huge.

In fact, these weight files are exactly the same as the ones linked on the official SAM GitHub page: the file sizes are identical, and the hashes I computed also match, so you do not have to download them from Google Drive.
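For reference, here is a minimal sketch of such a check using Python's hashlib (the file name below is the official SAM vit_h checkpoint name; adjust the path to wherever your downloads are):

import hashlib

def sha256_of(path, chunk_size=1 << 20):
    # Stream the file in 1 MB chunks so the large checkpoints do not fill memory
    h = hashlib.sha256()
    with open(path, 'rb') as f:
        while True:
            block = f.read(chunk_size)
            if not block:
                break
            h.update(block)
    return h.hexdigest()

# Compare the Google Drive copy against the file from the official SAM release
print(sha256_of('sam_vit_h_4b8939.pth'))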

2. Get the code.

git clone https://github.com/talshaharabany/AutoSAM.git

cd AutoSAM

3. Create a conda environment

conda create --name autosam python=3.10

pip install -r requirements.txt

In my experience, requirements.txt contains quite a few packages that cannot be installed with pip, and they are hard to pick out individually. Just install whichever package the code actually needs as you go, instead of running pip install -r requirements.txt.

4. Start training

python train.py

First install, from https://download.pytorch.org/whl/torch_stable.html, torch==1.11.0 and torchvision==0.12.0 built for Python 3.10 with cu102; then install tqdm, opencv-python (the latest version 4.9.0.80 pinned in requirements.txt works, needs no compilation and installs quickly) and pandas.
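Put together, the installs look roughly like this (a sketch; it assumes the cp310/cu102 wheels of these versions are still served from that index):

pip install torch==1.11.0+cu102 torchvision==0.12.0+cu102 -f https://download.pytorch.org/whl/torch_stable.html

pip install tqdm opencv-python pandas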

According to the code, the downloaded checkpoint files need to be linked to a ./cp folder:

ln -s ../downloads-tal-AutoSAM cp

According to the code, the GlaS dataset needs to be linked to a ./Warwick folder.
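A possible command, where the source path is just a placeholder for wherever you unpacked GlaS:

ln -s /path/to/GlaS Warwick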

According to the code, vit_h is run by default. If vit_h fails with CUDA out of memory, change the sam_checkpoint and model_type entries in sam_args from vit_h to vit_b; that should be enough to get it running.
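As a rough sketch of what that change looks like (the actual contents of sam_args and the exact checkpoint file name are defined in the repo's code and may differ; only the two keys mentioned above are shown here):

sam_args = {
    'sam_checkpoint': 'cp/sam_vit_b.pth',  # originally the vit_h checkpoint under ./cp
    'model_type': 'vit_b',                 # originally 'vit_h'
    # ... keep the remaining sam_args entries unchanged
}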

Only single-GPU training is supported here; multi-GPU training is not supported for now.

The training outputs go into a sub-folder under ./results, containing: (1) the best weights, net_best.pth; (2) a best.csv table recording the best results (according to the code, "best results" means the mean IoU, i.e. mIoU, over test-set inference for that epoch); (3) a vis folder (empty: the code only creates it and saves nothing into it, so you can visualize the test images yourself later, e.g. with the sketch below).
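If you want to fill that vis folder yourself, here is a minimal sketch (the function and variable names are illustrative, not taken from the repo) that writes each predicted mask plus an overlay on the input image:

import os
import cv2
import numpy as np

def save_visualization(vis_dir, name, image, pred_mask):
    # image: HxWx3 uint8 BGR array; pred_mask: HxW float array in [0, 1]
    mask = (pred_mask > 0.5).astype(np.uint8) * 255
    overlay = image.copy()
    overlay[mask > 0] = (0, 0, 255)  # paint predicted foreground in red (BGR)
    blended = cv2.addWeighted(image, 0.6, overlay, 0.4, 0)
    cv2.imwrite(os.path.join(vis_dir, name + '_mask.png'), mask)
    cv2.imwrite(os.path.join(vis_dir, name + '_overlay.png'), blended)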

The naming of the sub-folders under results is honestly a bit odd. My personal suggestion is to change the mkdir in the open_folder function at line 51 to: import time; os.mkdir(path + '/' + str(time.strftime('%Y-%m-%d-%H'+'h'+'%M'+'m'+'%S'+'s'))), so the folder gets a name like 2024-02-29-13h53m58s. (Nice idea, but the code stopped working after I made the change, so keep the original for now.)
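My guess is that the naive change breaks things because open_folder also has to return the name of the folder it creates, which the rest of train.py then uses to build output paths. Under that assumption, a timestamped variant might look like the sketch below (not the repo's actual function):

import os
import time

def open_folder(path='results'):
    # Create results/<timestamp>, e.g. results/2024-02-29-13h53m58s,
    # and return the folder name so the caller can write outputs into it
    os.makedirs(path, exist_ok=True)
    folder = time.strftime('%Y-%m-%d-%Hh%Mm%Ss')
    os.mkdir(os.path.join(path, folder))
    return folder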

Also, before training, I recommend recording the training Loss and the inference Dice/mIoU in a csv file, or directly in a TensorBoard summary so you can plot curves; the resulting figures look very nice.
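For example, with TensorBoard's SummaryWriter (requires the tensorboard package; the tag names and the exact place in train.py where these calls go are up to you, and loss/dice/miou below are placeholders for whatever the loop actually computes):

from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter('results/tb')  # log directory; any path works
# Inside the existing epoch loop of train.py, after loss / Dice / mIoU are computed:
epoch, loss, dice, miou = 0, 0.5, 0.9, 0.85  # placeholders for the loop's real values
writer.add_scalar('train/loss', loss, epoch)
writer.add_scalar('test/Dice', dice, epoch)
writer.add_scalar('test/mIoU', miou, epoch)
writer.close()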

Although the default number of epochs is 5000, with vit_b on a single GPU the mIoU already reached 0.8379 after 11 epochs (about 38 minutes), and reached 0.8658 after 29 epochs (about 1 hour 43 minutes).

To be continued.


 
