[YOLOv8] - Annotating a dataset with LabelStudio (installation and usage tips)

老狼IT工作室

Last modified 2023-12-16 12:26:07


Category: YOLO. Tags: YOLO, LabelStudio, data annotation

First published 2023-12-11 16:47:38

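Copyright notice: this is an original article by the author, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.

Article link: https://blog.csdn.net/u011775793/article/details/134921983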


What is LabelStudio?

GitHub - HumanSignal/label-studio: Label Studio is a multi-type data labeling and annotation tool with standardized output format

label-studio · PyPI

Label Studio is an open source data labeling tool. It lets you label data types like audio, text, images, videos, and time series with a simple and straightforward UI and export to various model formats. It can be used to prepare raw data or improve existing training data to get more accurate ML models.

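The project is developing quickly and already supports the latest Python version, 3.12, which is promising.

Python downloads | Python中文网 (official site)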

Label Studio Documentation — Data Labeling

Label Studio is an open source data labeling tool that supports multiple projects, users, and data types in one platform. It allows you to do the following:

Perform different types of labeling with many data formats.
Integrate Label Studio with machine learning models to supply predictions for labels (pre-labels), or perform continuous active learning. See Set up machine learning with your labeling process.

Label Studio is also available as an Enterprise cloud service with enhanced security (SSO, RBAC, SOC2), team management features, data discovery, analytics and reporting, and support SLAs. A free trial is available to get started quickly and explore the enterprise cloud product.

Feature comparison between the Community and Enterprise editions:

Label Studio Documentation — Label Studio Community and Enterprise Features

How to install LabelStudio Community Edition on Windows

label-studio · PyPI

GitHub - HumanSignal/label-studio: Label Studio is a multi-type data labeling and annotation tool with standardized output format

Create a Python virtual environment

Install LabelStudio

Start LabelStudio
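The three steps above can be sketched as a small Python helper. It only builds and prints the commands rather than running them. The directory name `labelstudio-env` is a placeholder I made up; `pip install label-studio` and `label-studio start` are the package and command names given on the PyPI page, but treat the exact layout as an assumption about your machine.

```python
import sys
from pathlib import Path


def build_setup_commands(env_dir: str = "labelstudio-env") -> list[list[str]]:
    """Return the commands for: create venv, install Label Studio, start it.

    env_dir is a hypothetical directory name used for illustration.
    """
    env = Path(env_dir)
    # venv puts executables in "Scripts" on Windows and "bin" elsewhere.
    scripts = env / ("Scripts" if sys.platform == "win32" else "bin")
    return [
        # 1. Create the virtual environment with the current interpreter.
        [sys.executable, "-m", "venv", str(env)],
        # 2. Install Label Studio into that environment.
        [str(scripts / "python"), "-m", "pip", "install", "label-studio"],
        # 3. Start the server (it listens on port 8080 by default).
        [str(scripts / "label-studio"), "start"],
    ]


if __name__ == "__main__":
    for command in build_setup_commands():
        print(" ".join(command))
```

To actually run the steps, each command list could be passed to `subprocess.run(command, check=True)`; the last one blocks for as long as the server is running.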

Open http://localhost:8080/ in a browser.

If you do not have an account yet, switch to "SIGN UP" and create one.

Managing annotations with LabelStudio

Click "Create Project" to create a project

Import the image dataset
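Besides uploading files through the UI, images that are already reachable by URL can be imported as a JSON task list. The sketch below builds such a list. The URLs are placeholders, and the `image` key is an assumption: it has to match the `$image` variable used in the project's labeling configuration.

```python
import json


def build_image_tasks(image_urls: list[str]) -> list[dict]:
    """Wrap each image URL in a Label Studio task of the form {"data": {...}}."""
    # The "image" key is assumed to match value="$image" in the labeling config.
    return [{"data": {"image": url}} for url in image_urls]


if __name__ == "__main__":
    # Placeholder URLs for illustration only.
    urls = [
        "https://example.com/images/0001.jpg",
        "https://example.com/images/0002.jpg",
    ]
    print(json.dumps(build_image_tasks(urls), indent=2))
```

The printed JSON can be saved to a file and selected in the project's import dialog.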

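Annotate the imported images

Configure the labels

I am doing object detection here, so I chose "Computer Vision -> Object Detection with Bounding Boxes".

Delete the template's labels, then define your own label names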
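Behind the label editor, LabelStudio keeps the setup as an XML labeling configuration that can also be edited directly in the "Code" view. As a rough sketch, the bounding-box configuration with custom labels could be generated like this. The label names `cat` and `dog` are placeholders, and the `name`/`toName` values (`image`, `label`) follow what the template uses as far as I can tell, so check them against your own project.

```python
import xml.etree.ElementTree as ET


def build_bbox_config(labels: list[str]) -> str:
    """Build a bounding-box labeling config with one <Label> per class name."""
    view = ET.Element("View")
    # The image to annotate; "$image" refers to the "image" field of each task.
    ET.SubElement(view, "Image", {"name": "image", "value": "$image"})
    # The rectangle tool, attached to the image above via toName.
    rect = ET.SubElement(
        view, "RectangleLabels", {"name": "label", "toName": "image"}
    )
    for label in labels:
        ET.SubElement(rect, "Label", {"value": label})
    ET.indent(view)  # pretty-print; requires Python 3.9+
    return ET.tostring(view, encoding="unicode")


if __name__ == "__main__":
    # Placeholder class names for illustration.
    print(build_bbox_config(["cat", "dog"]))
```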

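Return to the project page

Select the images you want to annotate from the list

Select a label name below the image, then draw the box on the image. When you are done, click "Submit". You can also click "Skip" to skip the current image.

After labeling, return to the project page to check each image's annotation status, including who annotated it and when the annotation was completed.

You can also select an image that has already been annotated and modify its annotation.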

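Export the annotation data

You can select the annotated images and export their annotations.

I am going to train with YOLOv8, so I chose YOLO and clicked "Export". The exported archive contains:

images: the annotated images

labels: the annotation data for each image

classes.txt: the label class names

notes.json: the mapping between class ids and class names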
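To make the exported layout more concrete, here is a minimal sketch for reading it. It assumes the usual YOLO conventions: `classes.txt` holds one class name per line, and each line of a file under `labels` is `class_id x_center y_center width height` with coordinates normalized to 0..1. The `build_data_yaml` helper is my own illustration of a dataset file for YOLOv8 training; pointing both `train` and `val` at `images` is a placeholder, since a real run needs a proper split.

```python
from pathlib import Path


def load_classes(export_dir: str) -> list[str]:
    """Read class names from classes.txt, one name per line."""
    text = (Path(export_dir) / "classes.txt").read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def parse_label_line(line: str, img_w: int, img_h: int):
    """Convert one YOLO label line to (class_id, (x1, y1, x2, y2)) in pixels."""
    class_id, xc, yc, w, h = line.split()
    # Scale the normalized center/size values to pixel units.
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    return int(class_id), (xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2)


def build_data_yaml(export_dir: str, class_names: list[str]) -> str:
    """Build the text of a dataset YAML (hypothetical train/val layout)."""
    lines = [
        f"path: {Path(export_dir).as_posix()}",
        "train: images",  # placeholder: no train/val split is made here
        "val: images",
        "names:",
    ]
    lines += [f"  {i}: {name}" for i, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(parse_label_line("0 0.5 0.5 0.2 0.4", img_w=100, img_h=200))
    print(build_data_yaml("export", ["cat", "dog"]))
```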

Auto-labeling with a pretrained model

Open the project settings

https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8n.pt

GitHub - ultralytics/ultralytics: NEW - YOLOv8 🚀 in PyTorch > ONNX > OpenVINO > CoreML > TFLite

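yolov8n.pt can be downloaded from the browser, but LabelStudio reports that it cannot be found. Judging from the logs, a local ML Backend has to be set up first: the URL field expects the address of that local backend, not the URL of the model file.

The documentation on setting up a local ML backend is here:

Label Studio Documentation — Integrate Label Studio into your machine learning pipeline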
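As a rough idea of what such a backend has to hand back, the sketch below converts detections in pixel coordinates into the prediction structure LabelStudio uses for rectangle labels, where positions and sizes are percentages of the image size. The function name and the input tuple layout are my own; `from_name="label"` and `to_name="image"` are assumptions that must match the project's labeling configuration, and `yolov8n` is just a version tag. Running the model and serving this over HTTP are deliberately left out.

```python
def to_ls_prediction(
    detections,
    img_w: int,
    img_h: int,
    from_name: str = "label",   # assumed name of the RectangleLabels tag
    to_name: str = "image",     # assumed name of the Image tag
    model_version: str = "yolov8n",
) -> dict:
    """Convert (label, score, (x1, y1, x2, y2)) pixel boxes to one prediction."""
    results = []
    for label, score, (x1, y1, x2, y2) in detections:
        results.append({
            "from_name": from_name,
            "to_name": to_name,
            "type": "rectanglelabels",
            "original_width": img_w,
            "original_height": img_h,
            "score": score,
            "value": {
                # Top-left corner and size, as percentages of the image.
                "x": 100 * x1 / img_w,
                "y": 100 * y1 / img_h,
                "width": 100 * (x2 - x1) / img_w,
                "height": 100 * (y2 - y1) / img_h,
                "rotation": 0,
                "rectanglelabels": [label],
            },
        })
    # Overall score: mean of the region scores (0.0 when nothing was detected).
    overall = sum(r["score"] for r in results) / len(results) if results else 0.0
    return {"model_version": model_version, "score": overall, "result": results}


if __name__ == "__main__":
    # Made-up detection for illustration.
    print(to_ls_prediction([("cat", 0.9, (10, 20, 60, 120))], 200, 400))
```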

Setting up a local ML Backend

I do not need this yet. I will spend time on it later if the need comes up.

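Impressions of LabelStudio

The good

1. LabelStudio uses a browser/server (B/S) architecture and requires user accounts, which makes it easy for several people to annotate the same dataset at the same time and to track who annotated which image;

2. Datasets are managed as projects;

3. It can export to quite a few dataset formats, such as YOLO, COCO, and Pascal VOC XML;

4. With an ML Backend set up, image annotation and model training can be tied together closely.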

The bad

The annotation experience is not as good as with local tools such as AnyLabeling or LabelMe, and installation is somewhat more involved.

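Summary

For personal use, or for a small team of a few people, I would skip it and use AnyLabeling instead. For an enterprise R&D department that needs to do a large amount of annotation, where team collaboration and annotation quality tracking matter, LabelStudio Enterprise is worth considering.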

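Using LabelStudio from the command line

Since I am not planning to use LabelStudio for annotation management for now, I am setting the details of command-line usage aside for the time being.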
