YOLACT++: The Most Popular Open-Source Real-Time Instance Segmentation Library on GitHub Right Now

仪器之家

Article link: https://blog.csdn.net/hahabeibei123456789/article/details/103709301

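YOLACT is a real-time instance segmentation paper accepted at ICCV 2019.

The authors recently extended the algorithm into YOLACT++ ("Better Real-time Instance Segmentation"). Its ResNet-50 model runs at 33.5 fps on a Titan Xp and reaches 34.1 mAP on the COCO test-dev set, and the code is open source.

The authors are from the University of California, Davis.

They propose a simple, fully convolutional model for real-time (> 30 fps) instance segmentation. Evaluated on a single Titan Xp, it achieves very competitive results on MS COCO while being significantly faster than any previous state-of-the-art approach.

Moreover, this result is obtained after training on only one GPU.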

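The authors accomplish this by breaking instance segmentation into two parallel subtasks:

(1) generating a set of prototype masks;

(2) predicting a set of mask coefficients for each instance.

The instance masks are then produced by linearly combining the prototypes with the mask coefficients.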
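The linear combination can be illustrated with a minimal Python sketch. It assumes the prototypes are small nested lists and that a sigmoid is applied to the combined result, as in the paper's formulation; the function name, shapes, and values are hypothetical and are not taken from the YOLACT code base, where this step is a single matrix multiplication on the GPU.

```python
import math


def assemble_masks(prototypes, coefficients):
    """Linearly combine k prototype masks into one mask per instance.

    prototypes:   k prototypes, each an h x w nested list of floats.
    coefficients: n instances, each a list of k floats.
    Returns n masks, each an h x w nested list of values in (0, 1).
    """
    height = len(prototypes[0])
    width = len(prototypes[0][0])
    masks = []
    for coeffs in coefficients:
        mask = []
        for y in range(height):
            row = []
            for x in range(width):
                # Weighted sum of all prototypes at this pixel.
                combined = sum(c * p[y][x] for c, p in zip(coeffs, prototypes))
                # Sigmoid squashes the sum into a per-pixel mask probability
                # (assumption: taken from the paper, not stated in this post).
                row.append(1.0 / (1.0 + math.exp(-combined)))
            mask.append(row)
        masks.append(mask)
    return masks


# Two toy 2x2 prototypes: one fires on the left column, one on the right.
left = [[1.0, 0.0], [1.0, 0.0]]
right = [[0.0, 1.0], [0.0, 1.0]]

# Instance 0 adds "left" and subtracts "right"; instance 1 does the opposite.
masks = assemble_masks([left, right], [[4.0, -4.0], [-4.0, 4.0]])
print(masks[0])
```

Each instance reuses the same prototypes and differs only in its coefficients, which is why the two subtasks can run in parallel.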

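Because this process does not depend on repooling, the approach produces very high-quality masks and exhibits temporal stability for free. The authors also analyze the emergent behavior of the prototypes and show that, although they are fully convolutional, they learn to localize instances on their own in a translation-variant manner.

The authors also propose Fast NMS, a drop-in replacement for standard NMS that is 12 ms faster and incurs only a marginal performance penalty. Finally, by incorporating deformable convolutions into the backbone network, optimizing the prediction head with better anchor scales and aspect ratios, and adding a novel fast mask re-scoring branch, the YOLACT++ model achieves 34.1 mAP on MS COCO at 33.5 fps, which remains close to the state of the art while still running in real time.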
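This post only names Fast NMS, so the following is a minimal sketch of the idea as described in the paper, not the authors' implementation. Boxes are sorted by score, and a box is dropped if its IoU with any higher-scoring box exceeds a threshold, even when that higher-scoring box was itself dropped. That relaxation removes the sequential dependency of standard NMS. The function names, the box format `(x1, y1, x2, y2)`, and the 0.5 threshold are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fast_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for rank, idx in enumerate(order):
        # Largest overlap with ANY higher-scoring box, suppressed or not.
        # Standard NMS would only compare against boxes that survived.
        max_iou = max(
            (iou(boxes[idx], boxes[order[j]]) for j in range(rank)),
            default=0.0,
        )
        if max_iou <= iou_threshold:
            keep.append(idx)
    return keep


boxes = [
    (0, 0, 10, 10),    # A: highest score
    (1, 1, 11, 11),    # B: overlaps A heavily
    (2, 2, 12, 12),    # C: overlaps B heavily, A only moderately
    (50, 50, 60, 60),  # D: isolated
]
scores = [0.9, 0.8, 0.7, 0.6]
print(fast_nms(boxes, scores))
```

In this toy input, standard NMS would keep C, because the only box that overlaps it above the threshold (B) has already been removed. Fast NMS drops C as well, which illustrates why it can suppress slightly more boxes in exchange for being computable in one parallel pass.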

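The figure below compares the speed and accuracy of YOLACT/YOLACT++ with other instance segmentation algorithms:

As the comparison shows, the YOLACT family has a large speed advantage, and YOLACT++ improves accuracy over YOLACT.

The following video is the demo that the authors presented at ICCV 2019:

These results are not post-processed; they were produced in real time on a GPU.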

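YOLACT network architecture:

Figure 2: YOLACT architecture. Blue/yellow indicates low/high values in the prototypes, gray nodes indicate functions that are not trained, and k = 4 in this example. We base this architecture on RetinaNet [25], using ResNet-101 + FPN.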

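YOLACT evaluation results on the COCO test-dev set. The base model achieves 29.8 mAP at 33.0 fps. All images use a confidence threshold of 0.3.

A more detailed comparison with other algorithms on the COCO dataset:

Table 1: MS COCO results. We compare against state-of-the-art methods in mask mAP and speed on COCO test-dev, and include several ablations of our base model with different backbone networks and image sizes. We denote the backbone architecture as network-depth-features, where R and D refer to ResNet and DarkNet, respectively. Our base model, YOLACT-550 with ResNet-101, is 3.9x faster than the previous fastest approach with competitive mask mAP. Our YOLACT++-550 model with ResNet-50 runs at the same speed while improving on the base model by 4.3 mAP. Compared to Mask R-CNN, YOLACT++-R-50 is 3.9x faster and only 1.6 mAP behind.

YOLACT/YOLACT++ achieves the fastest speed while still obtaining good segmentation accuracy.

The authors have released several models:

Let's look at how to use them:

Quantitative results on COCO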

# Quantitatively evaluate a trained model on the entire validation set. Make sure you have COCO downloaded as above.

# This should get 29.92 validation mask mAP last time I checked.

python eval.py --trained_model=weights/yolact_base_54_800000.pth

# Output a COCOEval json to submit to the website or to use the run_coco_eval.py script.

# This command will create './results/bbox_detections.json' and './results/mask_detections.json' for detection and instance segmentation respectively.

python eval.py --trained_model=weights/yolact_base_54_800000.pth --output_coco_json

# You can run COCOEval on the files created in the previous command. The performance should match my implementation in eval.py.

python run_coco_eval.py

# To output a coco json file for test-dev, make sure you have test-dev downloaded from above and go

python eval.py --trained_model=weights/yolact_base_54_800000.pth --output_coco_json --dataset=coco2017_testdev_dataset

Qualitative results on COCO

# Display qualitative results on COCO. From here on I'll use a confidence threshold of 0.15.

python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.15 --top_k=15 --display

Benchmarking on COCO

# Run just the raw model on the first 1k images of the validation set

python eval.py --trained_model=weights/yolact_base_54_800000.pth --benchmark --max_images=1000

Images

# Display qualitative results on the specified image.

python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.15 --top_k=15 --image=my_image.png

# Process an image and save it to another file.

python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.15 --top_k=15 --image=input_image.png:output_image.png

# Process a whole folder of images.

python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.15 --top_k=15 --images=path/to/input/folder:path/to/output/folder
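When these commands need to be issued from a script, it can help to build the argument list in one place. The sketch below is a hypothetical helper (not part of the YOLACT repository) that only assembles the flags shown in the commands above, including the `input:output` form of `--image`.

```python
import shlex


def build_eval_command(weights, image_in, image_out=None,
                       score_threshold=0.15, top_k=15):
    """Assemble an eval.py command line for a single image.

    Hypothetical helper: it only uses flags that appear in the commands
    above. If image_out is given, the "input:output" form is used.
    """
    image_arg = image_in if image_out is None else f"{image_in}:{image_out}"
    return [
        "python", "eval.py",
        f"--trained_model={weights}",
        f"--score_threshold={score_threshold}",
        f"--top_k={top_k}",
        f"--image={image_arg}",
    ]


cmd = build_eval_command(
    "weights/yolact_base_54_800000.pth",
    "input_image.png",
    "output_image.png",
)
# Print the command as a shell-safe string for inspection.
print(shlex.join(cmd))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` from the repository root, assuming the weights file is present there.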

Video

# Display a video in real-time. "--video_multiframe" will process that many frames at once for improved performance.

# If you want, use "--display_fps" to draw the FPS directly on the frame.

python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.15 --top_k=15 --video_multiframe=4 --video=my_video.mp4

# Display a webcam feed in real-time. If you have multiple webcams pass the index of the webcam you want instead of 0.

python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.15 --top_k=15 --video_multiframe=4 --video=0

# Process a video and save it to another file. This uses the same pipeline as the ones above now, so it's fast!

python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.15 --top_k=15 --video_multiframe=4 --video=input_video.mp4:output_video.mp4

Training

# Trains using the base config with a batch size of 8 (the default).

python train.py --config=yolact_base_config

# Trains yolact_base_config with a batch_size of 5. For the 550px models, 1 batch takes up around 1.5 gigs of VRAM, so specify accordingly.

python train.py --config=yolact_base_config --batch_size=5

# Resume training yolact_base with a specific weight file and start from the iteration specified in the weight file's name.

python train.py --config=yolact_base_config --resume=weights/yolact_base_10_32100.pth --start_iter=-1

# Use the help option to see a description of all available command line arguments

python train.py --help
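The resume command above relies on the iteration being encoded in the weight file's name. Judging from names such as `yolact_base_10_32100.pth`, the pattern appears to be `<config>_<epoch>_<iteration>.pth`. That pattern is an assumption inferred from the file names in this post, and the helper below is a hypothetical sketch of how such a name could be split, not the parsing code used by `train.py`.

```python
from pathlib import Path


def parse_checkpoint_name(path):
    """Split a checkpoint file name into (config, epoch, iteration).

    Assumes the pattern <config>_<epoch>_<iteration>.pth, inferred from
    the file names used in this post.
    """
    stem = Path(path).stem  # e.g. "yolact_base_10_32100"
    # Split from the right, because the config name itself contains "_".
    config, epoch, iteration = stem.rsplit("_", 2)
    return config, int(epoch), int(iteration)


print(parse_checkpoint_name("weights/yolact_base_10_32100.pth"))
```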

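Here are some example segmentation results:

Conclusion
For the paper or the source code download link, follow the "图像算法" WeChat official account and reply "YOLACT".
We presented the first competitive single-stage real-time instance segmentation method. The key idea is to predict mask prototypes and per-instance mask coefficients in parallel, and to combine them linearly to form the final instance masks. Extensive experiments on MS COCO and Pascal VOC demonstrate the effectiveness of our approach and the contribution of each component. We also analyze the emergent behavior of the prototypes to explain how YOLACT, even as an FCN, introduces translation variance for instance segmentation. Finally, with improvements to the backbone network, a better anchor design, and a fast mask re-scoring network, our YOLACT++ shows a significant boost over the original framework while still running in real time.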
