Importing previously annotated object-detection results into Label Studio for re-annotation

内卷焦虑sansan

Last modified 2024-12-13 16:52:58

Tags: object detection, yolo

First published 2024-12-13 16:44:43

Article link: https://blog.csdn.net/qq_64574122/article/details/144454309

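New labels added: importing previously annotated object-detection results into Label Studio for re-annotation

As the business grows, our set of labels keeps growing too. To avoid re-annotating all of the historical data from scratch, we can convert the existing annotations into the model-prediction format and import them into Label Studio. We then only have to add annotations for the new labels, which saves labor and keeps the dataset complete.

This article does not cover installing or starting Label Studio. It assumes Label Studio is already installed and running.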

1. Create a project

2. Create a local-files storage and upload the images to annotate into its directory

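The absolute local path field is the absolute path of the directory that holds the images to annotate for this project.

In my setup the path has an extra /opt/docker/ prefix because I mapped the directory into a container.

After creating the storage, copy the images to annotate into this directory. I copied four images here.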

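3. Convert the data format

Annotation data in YOLO format

The data I previously annotated in Label Studio is in YOLO format.
Each line represents one annotated box:

<class_id> <x_center> <y_center> <width> <height>

<class_id>: the class ID, starting from 0 (for example, 0 is the first class and 1 is the second class).

<x_center>: the x coordinate of the box center, normalized by the image width, in the range 0 to 1.

<y_center>: the y coordinate of the box center, normalized by the image height, in the range 0 to 1.

<width>: the width of the box, normalized by the image width, in the range 0 to 1.

<height>: the height of the box, normalized by the image height, in the range 0 to 1.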
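
To make the line layout above concrete, here is a minimal sketch that parses one label line into named fields. The names YoloBox and parse_yolo_line and the sample line are made up for this illustration and are not part of the original script.

```python
from typing import NamedTuple


class YoloBox(NamedTuple):
    class_id: int
    x_center: float
    y_center: float
    width: float
    height: float


def parse_yolo_line(line: str) -> YoloBox:
    """Parse one '<class_id> <x_center> <y_center> <width> <height>' line."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    # Some exporters write the class id as "5.0", so go through float first.
    class_id = int(float(parts[0]))
    x_center, y_center, width, height = map(float, parts[1:])
    for value in (x_center, y_center, width, height):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"value outside the 0..1 range: {value}")
    return YoloBox(class_id, x_center, y_center, width, height)


# Hypothetical sample line: class 5, centered horizontally, near the top.
box = parse_yolo_line("5 0.5 0.25 0.2 0.1")
print(box.class_id, box.x_center, box.y_center, box.width, box.height)
```

Rejecting lines with the wrong field count early makes a truncated label file fail loudly instead of producing a shifted box.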

The data format Label Studio accepts for importing model predictions

[{'predictions': [{'model_version': 'one',
    'score': 0.5,
    'result': [{'original_width': 238,
      'original_height': 400,
      'image_rotation': 0,
      'value': {'x': 67.87777351604542,
       'y': 91.97786998616884,
       'width': 8.833506458801875,
       'height': 5.809128630705339,
       'rotation': 0,
       'rectanglelabels': ['CE']},
      'id': '1',
      'from_name': 'label',
      'to_name': 'image',
      'type': 'rectanglelabels',
      'origin': 'manual'}]}],
  'data': {'image': 'https://mm.belccol.com/data/local-files/?d=label-studio/data/media/upload/8/0000001.png'}},
 {'predictions': [{'model_version': 'one',
    'score': 0.5,
    'result': [{'original_width': 1500,
      'original_height': 1125,
      'image_rotation': 0,
      'value': {'x': 15.663900414937757,
       'y': 8.99031811894882,
       'width': 15.14522821576763,
       'height': 7.883817427385886,
       'rotation': 0,
       'rectanglelabels': ['French recycling logo']},
      'id': '2',
      'from_name': 'label',
      'to_name': 'image',
      'type': 'rectanglelabels',
      'origin': 'manual'}]}],
  'data': {'image': 'https://mm.belccol.com/data/local-files/?d=label-studio/data/media/upload/8/01b6725a91146d270f0a0e7c6202f7cf.jpg'}},
 ]

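The data is a list in which each dictionary holds the prediction results for one image. It is printed here as a Python literal; the import file stores the same structure as JSON. predictions holds the predicted results and data holds the image reference. I use a URL here: the prefix https://mm.belccol.com/data/local-files/?d= is fixed, and the image path follows it.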
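
As a rough sketch of how such an image URL is put together, the helper below joins a host, the fixed /data/local-files/?d= part, and a relative image path. The function name local_file_url is hypothetical, and percent-encoding the path (for file names with spaces, for example) is my assumption rather than something the original script does.

```python
from urllib.parse import quote


def local_file_url(host: str, relative_path: str) -> str:
    """Build a local-files URL: <host>/data/local-files/?d=<relative path>."""
    # quote() percent-encodes spaces and similar characters but keeps "/" as-is.
    return f"{host.rstrip('/')}/data/local-files/?d={quote(relative_path)}"


url = local_file_url(
    "https://mm.belccol.com",
    "label-studio/data/media/upload/8/0000001.png",
)
print(url)
```

For plain ASCII file names like the ones in this article, the result is identical to simple string concatenation.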

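The value of predictions is itself a list. Each prediction carries model_version, score, and a result list, and each element of result describes one annotated box. The fields model_version, score, from_name, to_name, type, origin, rotation, and image_rotation can all be hard-coded, provided from_name and to_name match the tag names in the project's labeling configuration.

Each id must be unique and must be a string; otherwise the import runs into problems.

The values in rectanglelabels are label names, and original_width and original_height are the width and height of the image in pixels. In the value field, x, y, width, and height are the top-left x coordinate, the top-left y coordinate, the width, and the height of the box, all expressed as percentages of the image size (0 to 100) rather than pixels. So to convert a YOLO box into a Label Studio box, you can use the following code: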

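# x_center, y_center, width, height are the normalized YOLO values (0 to 1)
x = (x_center - width / 2) * 100   # left edge, in percent of the image width
y = (y_center - height / 2) * 100  # top edge, in percent of the image height
width = width * 100
height = height * 100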
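
The same conversion can be sketched as a small function with a worked example. The function names are mine, and the inverse function is only included so the result can be checked with a round trip.

```python
def yolo_to_label_studio(x_center, y_center, width, height):
    """Normalized YOLO box (center, size) -> Label Studio box (top-left, size) in percent."""
    return {
        "x": (x_center - width / 2) * 100,
        "y": (y_center - height / 2) * 100,
        "width": width * 100,
        "height": height * 100,
    }


def label_studio_to_yolo(value):
    """Inverse conversion: percent top-left box -> normalized center box."""
    width = value["width"] / 100
    height = value["height"] / 100
    return (value["x"] / 100 + width / 2, value["y"] / 100 + height / 2, width, height)


# A box centered in the image, a quarter of the width and half of the height.
box = yolo_to_label_studio(0.5, 0.5, 0.25, 0.5)
print(box)  # {'x': 37.5, 'y': 25.0, 'width': 25.0, 'height': 50.0}
print(label_studio_to_yolo(box))  # (0.5, 0.5, 0.25, 0.5)
```

Note that the pixel size of the image does not enter the conversion at all, because both formats are relative to the image dimensions.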

Below is the code I use to generate the complete import file:

import json
import os
import glob
from PIL import Image


# Label name -> YOLO class index, in the same order as the exported YOLO classes.
label_dict = {
    "3+": 0,
    "6+": 1,
    "ASTM": 2,
    "ButtonbatterywarningsignsTwo": 3,
    "Buttonbatterywarningsignsone": 4,
    "CE": 5,
    "CPC": 6,
    "DC": 7,
    "Recycling with code": 8,
    "DE WEEE recycling symbol": 9,
    "EC-REP": 10,
    "FCC": 11,
    "French recycling-Daidian": 12,
    "French recycling logo": 13
}

# YOLO class index -> label name.
label_reverse = {index: name for name, index in label_dict.items()}

model_version = "one"
# Fixed local-files URL prefix followed by the storage directory of this project.
root_path = "https://mm.belccol.com/data/local-files/?d=label-studio/data/media/upload/8/"


result_list = []
images_path = r"D:\work\code\ai_my\label_studio_test\images\*"
labels_path = r"D:\work\code\ai_my\label_studio_test\labels"
box_id = 1  # running counter so that every result id is unique
for file in glob.glob(images_path):
    print("file:", file)
    img_name = os.path.basename(file)      # file name with extension
    stem = os.path.splitext(img_name)[0]   # file name without extension
    with Image.open(file) as image:
        width, height = image.size
    labels_file = os.path.join(labels_path, stem + ".txt")
    with open(labels_file, "r") as f:
        lines = f.readlines()
    json_result = {"predictions": [{"model_version": model_version,
        "score": 0.5, "result": []}], "data": {"image": root_path + img_name}}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines in the label file
        label_index, x, y, w, h = map(float, line.split())
        label = label_reverse[int(label_index)]
        # YOLO center/size (0 to 1) -> Label Studio top-left/size (percent)
        x1 = (x - w / 2)*100
        y1 = (y - h / 2)*100
        w1 = w*100
        h1 = h*100

        sub_res = {
            "original_width": width,
            "original_height": height,
            "image_rotation": 0,
            "value": {
              "x": x1,
              "y": y1,
              "width": w1,
              "height": h1,
              "rotation": 0,
              "rectanglelabels": [
                label
              ]
            },
            "id": str(box_id),  # must be a string
            "from_name": "label",
            "to_name": "image",
            "type": "rectanglelabels",
            "origin": "manual"
          }
        json_result["predictions"][0]["result"].append(sub_res)
        box_id += 1
    result_list.append(json_result)


with open('./label_studio_import.json', 'w', encoding='utf-8') as f:
    json.dump(result_list, f, ensure_ascii=False, indent=4)
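
Before importing the file, it can help to check the two rules mentioned earlier: ids are unique strings, and boxes are percentages that stay inside the image. The sketch below is one possible check and not part of the original script. The name check_tasks, the tolerance EPS, and the sample data are my own, and checking ids across the whole file (rather than per image) simply mirrors the running counter used above.

```python
EPS = 1e-6  # tolerance for floating-point rounding at the image border


def check_tasks(tasks):
    """Return a list of human-readable problems found in the import tasks."""
    problems = []
    seen_ids = set()
    for task_index, task in enumerate(tasks):
        for prediction in task.get("predictions", []):
            for result in prediction.get("result", []):
                result_id = result.get("id")
                if not isinstance(result_id, str):
                    problems.append(f"task {task_index}: id {result_id!r} is not a string")
                elif result_id in seen_ids:
                    problems.append(f"task {task_index}: duplicate id {result_id!r}")
                else:
                    seen_ids.add(result_id)
                value = result.get("value", {})
                x, y = value.get("x", 0), value.get("y", 0)
                right = x + value.get("width", 0)
                bottom = y + value.get("height", 0)
                if x < -EPS or y < -EPS or right > 100 + EPS or bottom > 100 + EPS:
                    problems.append(f"task {task_index}: box {result_id!r} leaves the image")
    return problems


# Hypothetical sample: the second box reuses id "1" and sticks out on the right.
sample = [{
    "data": {"image": "example.png"},
    "predictions": [{"model_version": "one", "score": 0.5, "result": [
        {"id": "1", "value": {"x": 10.0, "y": 20.0, "width": 30.0, "height": 40.0}},
        {"id": "1", "value": {"x": 90.0, "y": 20.0, "width": 30.0, "height": 40.0}},
    ]}],
}]
for problem in check_tasks(sample):
    print(problem)
```

To check the generated file, load label_studio_import.json with json.load and pass the resulting list to check_tasks; an empty list means no problems were found.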