Deep Learning: First Learn to Build Your Own Dataset, a Guide to Using labelme

青鸠

Article link: https://blog.csdn.net/qq_35793394/article/details/106784122


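labelme is a powerful, open-source, cross-platform annotation tool that can produce datasets for both image segmentation and object detection, so this post gives a brief introduction to it. A disclaimer first: I am no expert, so experienced readers may want to skip this. You may also have noticed that many people use labelImg. It is similar to labelme but only supports object detection, so I will not cover it here; if you need it, have a look at labelImg on GitHub.
1. Installing labelme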

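Because Anaconda is powerful and convenient, and saves time and effort when installing tensorflow or pytorch, I will first describe how to install labelme inside Anaconda. For installing Anaconda itself, please refer to the linked blog post.
After installing Anaconda, go to Start > Anaconda Prompt and open it.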
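Next, create a virtual Python environment:
conda create -n labelme python=3.6
Here the labelme after -n is the name of the virtual environment, and python=3.6 specifies the Python version; 3.6 or later is recommended.
After creating the environment, be sure to activate it with:
conda activate labelme
After activation, the prompt starts with the environment name in parentheses. I named my environment labelme, so it shows (labelme). The environment is only active when this prefix is present.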
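Next, install pyqt, a dependency of labelme:
conda install pyqt
Now labelme itself can finally be installed:
pip install labelme
This can be very slow. In that case, use the Tsinghua or Douban mirror, and the command becomes:
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple labelme
Whenever you are prompted to choose yes or no, just enter yes or y.
After installation, labelme lives in the environment directory under Anaconda > envs > labelme.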
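To confirm that the installation worked before launching the GUI, a quick check from Python can help. The following is a minimal sketch using only the standard library. The helper name `is_installed` is made up for illustration, and it assumes the conda pyqt package exposes a module named `PyQt5`, which may differ in your setup.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found in the active environment."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "PyQt5" is an assumed module name for the pyqt dependency.
    for name in ("labelme", "PyQt5"):
        print(name, "found" if is_installed(name) else "missing")
```

Run it from the activated environment; a "missing" line usually means the package was installed into a different environment.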

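2. Using labelme
After installing labelme as described above, reopen Anaconda Prompt, run conda activate labelme so that the environment is active, then type labelme to open the GUI.
The toolbar on the far left should be self-explanatory; click around to see what each button does. Note that Open Dir opens the folder that contains your own images. To specify where the output goes, click File > Change Output Dir and browse to the path you want.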
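labelme saves one JSON file per annotated image in the output directory. As a quick check on the annotations, the sketch below counts how many shapes carry each label. It relies on the `shapes` list and the per-shape `label` key, which the conversion script in section 3.2 also reads. The function name `count_labels` and the directory name `data_annotated` are placeholders, not part of labelme.

```python
import json
import os
from collections import Counter


def count_labels(annotation_dir):
    """Count shapes per label across all labelme JSON files in a directory."""
    counts = Counter()
    for file_name in sorted(os.listdir(annotation_dir)):
        if not file_name.endswith(".json"):
            continue
        with open(os.path.join(annotation_dir, file_name), encoding="utf-8") as f:
            data = json.load(f)
        for shape in data.get("shapes", []):
            counts[shape["label"]] += 1
    return counts


if __name__ == "__main__":
    # "data_annotated" stands in for your own output directory.
    if os.path.isdir("data_annotated"):
        for label, n in sorted(count_labels("data_annotated").items()):
            print(label, n)
```

A misspelled label shows up here as an extra entry with a small count, which is easier to fix before conversion than after.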

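3. Post-processing
The first two steps are covered widely online. However, different models require different dataset formats, so many people get stuck once labeling is finished. labelme itself already provides tools for this; many beginners simply do not know about them, and after searching Baidu they find scripts everywhere with no idea which one fits their needs.
3.1 Converting to VOC format
In the labelme environment, under Anaconda > envs > labelme, change to the labelme/examples/semantic_segmentation/ directory (if your installation does not contain this directory, the examples folder is available in the labelme GitHub repository) and run: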

./labelme2voc.py labels.txt data_annotated data_dataset_voc

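Here labels.txt is the label file, data_annotated holds the annotated images and their JSON files, and data_dataset_voc is the directory the output is saved to. Separate the arguments with spaces. On Windows, run the script as python labelme2voc.py followed by the same arguments. The argument order can differ between labelme versions, so check python labelme2voc.py --help if the command is rejected.
The label file lists the labels you used while annotating in the first step.
Remember that its first two lines must be
__ignore__
_background_
otherwise the script reports an error when it runs.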
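Writing labels.txt by hand is error-prone once there are many labels. The sketch below collects the labels from the annotated JSON files and writes the file with the two reserved entries, `__ignore__` and `_background_`, on the first two lines. The function name `write_labels_file` is hypothetical, and sorting the remaining labels alphabetically is my own choice, not something labelme requires.

```python
import json
import os


def write_labels_file(annotation_dir, labels_path):
    """Collect labels from labelme JSON files and write them to a labels file."""
    reserved = ["__ignore__", "_background_"]
    labels = set()
    for file_name in os.listdir(annotation_dir):
        if file_name.endswith(".json"):
            with open(os.path.join(annotation_dir, file_name), encoding="utf-8") as f:
                for shape in json.load(f).get("shapes", []):
                    labels.add(shape["label"])
    # Reserved entries come first, then the annotated labels in sorted order.
    lines = reserved + sorted(labels - set(reserved))
    with open(labels_path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
    return lines


if __name__ == "__main__":
    # Directory and file names follow the command shown in section 3.1.
    if os.path.isdir("data_annotated"):
        print(write_labels_file("data_annotated", "labels.txt"))
```

The order of the lines determines the class values in the converted dataset, so keep the generated file fixed once training data has been produced from it.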
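3.2 Batch conversion to PNG files
Under Anaconda3\envs\labelme\Lib\site-packages\labelme\cli there is a file named json_to_dataset.py. That script converts only a single JSON file into a mask image, but with a small modification it can process files in batch. The complete code is as follows: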

import base64
import json
import os
import os.path as osp

import cv2
import numpy as np

from labelme.logger import logger
from labelme import utils


def change_one_json(json_file, output_file):
    """Convert one labelme JSON file into a single-channel label PNG."""
    # Build the output prefix: next to the JSON file by default,
    # otherwise output_file with its extension removed.
    if output_file is None:
        out_dir = osp.basename(json_file).replace('.', '_')
        out_dir = osp.join(osp.dirname(json_file), out_dir)
    else:
        out_dir = osp.splitext(output_file)[0]

    with open(json_file) as f:
        data = json.load(f)
    imageData = data.get('imageData')

    # If the image is not embedded in the JSON, read it from imagePath.
    if not imageData:
        imagePath = osp.join(osp.dirname(json_file), data['imagePath'])
        with open(imagePath, 'rb') as f:
            imageData = base64.b64encode(f.read()).decode('utf-8')
    img = utils.img_b64_to_arr(imageData)

    # Fixed label-to-value mapping for this dataset; edit it to match your labels.
    # 'stair' and 'gate' map to 0, which merges them into the background.
    label_name_to_value = {'_background_': 0, 'road': 1, 'stair': 0,
                           'corrider': 2, 'gate': 0, 'rubblish': 3}
    for shape in sorted(data['shapes'], key=lambda x: x['label']):
        label_name = shape['label']
        if label_name not in label_name_to_value:
            # Labels missing from the mapping get the next unused value.
            label_name_to_value[label_name] = max(label_name_to_value.values()) + 1
    lbl, _ = utils.shapes_to_label(
        img.shape, data['shapes'], label_name_to_value
    )

    # Several names can share one value; keep the first name for each value.
    label_names = [None] * (max(label_name_to_value.values()) + 1)
    for name, value in label_name_to_value.items():
        if label_names[value] is None:
            label_names[value] = name

    # Save the class values as an 8-bit PNG (assumes fewer than 256 classes).
    cv2.imwrite(out_dir + '_label.png', lbl.astype(np.uint8))

    # Write the label names into the label_names folder next to gtpic.
    names_prefix = out_dir.replace('gtpic', 'label_names', 1)
    with open(names_prefix + '_label_names.txt', 'w') as f:
        for lbl_name in label_names:
            f.write(lbl_name + '\n')

    logger.info('Saved to: {}'.format(out_dir))

if __name__ == "__main__":
    import shutil

    # Recreate the output directories from scratch on every run.
    create_list = ['transpic/gtpic/', 'transpic/img/', 'transpic/label_names/']
    for item in create_list:
        if os.path.exists(item):
            shutil.rmtree(item)
        os.makedirs(item)

    # Convert every JSON file found in the current directory.
    for json_file in os.listdir('.'):
        if json_file.endswith(".json"):
            # Swap only the extension, so a 'json' inside the file name is kept.
            stem = osp.splitext(json_file)[0]
            output_file = 'transpic/gtpic/' + stem + '.jpg'
            change_one_json(json_file, output_file)
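The label PNGs written by this script store small class values (0 to 3 with the mapping in the script), so they look almost black in an image viewer even when they are correct. One way to check them is to count pixels per value. The sketch below does this on a plain nested list so that it runs without extra packages; the function name `count_label_values` and the demo data are made up for illustration.

```python
from collections import Counter


def count_label_values(rows):
    """Count how many pixels carry each label value in a 2D label image."""
    counts = Counter()
    for row in rows:
        counts.update(row)
    return dict(counts)


if __name__ == "__main__":
    # A tiny 3x4 stand-in for a label image: 0 = background, 1 = road, 3 = rubblish.
    demo = [
        [0, 0, 1, 1],
        [0, 1, 1, 1],
        [0, 0, 3, 3],
    ]
    print(count_label_values(demo))
```

For a real file, something like `cv2.imread(path, cv2.IMREAD_GRAYSCALE).tolist()` should give a nested list in this shape. If only the value 0 appears, the labels in the JSON files probably do not match the names in the mapping.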