[Training DeepLabV3 with mmsegmentation] Loading and training on a custom dataset | Converting RLE encoding to mmsegmentation | Converting COCO to mmsegmentation

Author: 活成自己的样子啊

Last modified: 2022-08-11 22:23:47

First published: 2022-08-08 20:24:51

Article link: https://blog.csdn.net/m0_61139217/article/details/126228866

Preface

Download mmsegmentation:

GitHub - open-mmlab/mmsegmentation: OpenMMLab Semantic Segmentation Toolbox and Benchmark. https://github.com/open-mmlab/mmsegmentation

mmsegmentation documentation (MMSegmentation 0.27.0): https://mmsegmentation.readthedocs.io/zh_CN/latest/

Environment setup:

I trained and tested on both Windows 10 and Linux (CentOS 7).

Windows:

1 x RTX 2080 Ti

CUDA 11.4

torch 1.9.1

torchvision 0.10.1

mmcv-full 1.4.0

Linux:

3 x RTX 2080 Ti

CUDA 10.2

torch 1.7.0

torchvision 0.8.1

mmcv-full 1.6.1

Note: mmcv-full is the full build and mmcv is the lite build. Make sure the mmcv version matches your PyTorch and CUDA versions.

Installing MMCV (mmcv 1.6.1 documentation): https://mmcv.readthedocs.io/zh_CN/latest/get_started/installation.html
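Before training, it can help to confirm that the installed versions match the lists above. Here is a minimal sketch using only the standard library. The helper name installed_version is my own, and the package names are the pip distribution names listed above.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version of a pip package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("torch", "torchvision", "mmcv-full", "mmsegmentation"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```

If mmcv-full shows as missing while mmcv is present, the lite build is installed.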

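Building an mmsegmentation dataset

Differences and relationships between mask images, binarized images, and mmsegmentation annotation files:

•Mask image: a mask is a graphics operation that partially or fully hides parts of an object or element. Applying a mask to a graphic object works as if the object were painted onto the background through the mask: the parts outside the mask are fully or partially hidden, while the image inside the mask is unchanged.

•Binarized image: image binarization sets the gray value of every pixel to either 0 or 255, so the whole image becomes plainly black and white.

•mmsegmentation annotation file: the annotation format that SenseTime's mmsegmentation uses to load semantic segmentation datasets.

•RLE encoding: RLE (run-length encoding, also called run coding) is, for binary images, an encoding method that assigns codewords to the lengths of consecutive runs of black and white pixels.

(Source: Baidu Baike)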
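To make the binarization and run-length definitions concrete, here is a small sketch that binarizes a grayscale image with a fixed threshold and then run-length encodes one row. The threshold of 128 and the function names are illustrative choices, and plain Python lists stand in for image arrays.

```python
def binarize(gray, threshold=128):
    """Map every pixel to 0 or 255 depending on a fixed threshold."""
    return [[255 if px >= threshold else 0 for px in row] for row in gray]


def run_lengths(pixels):
    """Classic RLE: collapse a pixel sequence into (value, run_length) pairs."""
    runs = []
    for px in pixels:
        if runs and runs[-1][0] == px:
            runs[-1][1] += 1  # same value as the previous pixel: extend the run
        else:
            runs.append([px, 1])  # value changed: start a new run
    return [tuple(run) for run in runs]


gray = [
    [10, 20, 200, 210, 220, 30],
    [15, 25, 205, 215, 35, 40],
]
binary = binarize(gray)
print(binary[0])               # [0, 0, 255, 255, 255, 0]
print(run_lengths(binary[0]))  # [(0, 2), (255, 3), (0, 1)]
```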

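Annotation format required by SenseTime's mmsegmentation:

(1) Layout of the data folder:

Note: seg_map_suffix is the file suffix of the annotation images and img_suffix is the file suffix of the original images. Each annotation image is the mask of its original image; the two sets must match in count, and each mask must have the same height and width as its image.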
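The pairing rule in the note can be checked before training. The sketch below assumes one image directory and one annotation directory, and pairs files by swapping img_suffix for seg_map_suffix. The function name and the paths in the usage comment are hypothetical, and the shape check is left out so the example needs no image library.

```python
from pathlib import Path


def find_unpaired(img_dir, ann_dir, img_suffix=".jpg", seg_map_suffix=".png"):
    """Return (images without a mask, masks without an image) as sorted stems."""
    img_stems = {p.name[: -len(img_suffix)]
                 for p in Path(img_dir).iterdir() if p.name.endswith(img_suffix)}
    ann_stems = {p.name[: -len(seg_map_suffix)]
                 for p in Path(ann_dir).iterdir() if p.name.endswith(seg_map_suffix)}
    return sorted(img_stems - ann_stems), sorted(ann_stems - img_stems)


# Usage (hypothetical paths):
# missing_masks, orphan_masks = find_unpaired("data/images/train", "data/annotations/train")
```

Both returned lists should be empty for a dataset that satisfies the rule.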

(2) Notes on mask annotation images:

Note: taken from the official website.
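As far as I understand the official note, an annotation must be a single-channel image of shape (H, W) whose pixel values are class indices in the range [0, num_classes - 1]; by default, 255 is treated as the ignore index. Here is a small sketch of that check, with plain lists standing in for the mask and a hypothetical function name:

```python
def invalid_labels(mask, num_classes, ignore_index=255):
    """Return the sorted pixel values that are neither a class index nor ignore_index."""
    allowed = set(range(num_classes)) | {ignore_index}
    found = {value for row in mask for value in row}
    return sorted(found - allowed)


mask = [
    [0, 0, 1, 1],
    [0, 2, 2, 255],
    [0, 50, 50, 0],  # 50 is not a class index when num_classes is 3
]
print(invalid_labels(mask, num_classes=3))  # [50]
```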

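Converting a COCO dataset to mmsegmentation (requires pycocotools, or pycocotools-windows on Windows):

We convert a standard COCO dataset into mmsegmentation-style annotations. The idea is as follows: each COCO annotation is turned into an RLE-encoded binary mask, the RLE is decoded into a binary image, and each binary image is then tied to its category so that different pixel values represent different classes.

Code: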

import os
from pathlib import Path
import logging
import json
logger = logging.getLogger(__name__)

# @logger.catch(reraise=True)
def coco_to_mmsegmentation(
    annotations_file: str, output_annotations_file: str, output_masks_dir: str
):
    """Convert json in [segmentation format](https://gradiant.github.io/ai-dataset-template/supported_tasks/#segmentation) to txt in [mmsegmentation format](https://mmsegmentation.readthedocs.io/en/latest/tutorials/new_dataset.html#reorganize-dataset-to-existing-format).

    Args:
        annotations_file:
            path to json in [segmentation format](https://gradiant.github.io/ai-dataset-template/supported_tasks/#segmentation)
        output_annotations_file:
            path to write the txt in [mmsegmentation format](https://mmsegmentation.readthedocs.io/en/latest/tutorials/customize_datasets.html#customize-datasets-by-reorganizing-data)
        output_masks_dir:
            path where the masks generated from the annotations will be saved to.
            A single `{file_name}.png` mask will be generated for each image.
    """
    import cv2
    import numpy as np
    from pycocotools.coco import COCO

    # Create the output directories (including missing parents) if needed
    Path(output_annotations_file).parent.mkdir(parents=True, exist_ok=True)
    Path(output_masks_dir).mkdir(parents=True, exist_ok=True)

    logger.info(f"Loading annotations from {annotations_file}")
    with open(annotations_file) as f:
        annotations = json.load(f)

    logger.info(f"Saving annotations to {output_annotations_file}")
    with open(output_annotations_file, "w") as f:
        for image in annotations["images"]:
            # Image path without its extension
            filename = Path(image["file_name"]).parent / Path(image["file_name"]).stem
            # Write one extension-less image path per line to the txt file
            f.write(str(filename))
            f.write("\n")

    logger.info(f"Saving masks to {output_masks_dir}")
    coco_annotations = COCO(annotations_file)
    for image_id, image_data in coco_annotations.imgs.items():

        filename = image_data["file_name"]

        anns_ids = coco_annotations.getAnnIds(imgIds=image_id)  # one image can have several annotations
        image_annotations = coco_annotations.loadAnns(anns_ids)  # load all annotations of this image

        logger.info(f"Creating output mask for {filename}")
        # All zeros (pure black): a single-channel 2-D array
        output_mask = np.zeros(
            (image_data["height"], image_data["width"]), dtype=np.uint8
        )
        # Go through every annotation that belongs to this image
        for image_annotation in image_annotations:
            # print(type(image_annotation))  # dict
            # print(image_annotation.keys())  # dict_keys(['id', 'image_id', 'category_id', 'iscrowd', 'area', 'bbox', 'segmentation', 'width', 'height'])
            category_id = image_annotation["category_id"]  # category id of this annotation
            # print(type(category_id))
            try:
                category_mask = coco_annotations.annToMask(image_annotation)
                # print(category_mask)
            except Exception as e:
                logger.warning(e)
                logger.warning(f"Skipping {image_annotation}")
                print('Conversion failed for this annotation ---------------------------------')
                continue
            category_mask *= category_id  # 0/1 mask times id (ids start at 1): every pixel of this annotation gets its class id
            category_mask *= output_mask == 0  # keep only pixels that are still 0 in output_mask; already labeled pixels are zeroed here
            output_mask += category_mask  # merge into the combined mask

        output_filename = Path(output_masks_dir) / Path(filename).with_suffix(".png")
        output_filename.parent.mkdir(parents=True, exist_ok=True)

        logger.info(f"Writing mask to {output_filename}")
        cv2.imwrite(str(output_filename), output_mask)

if __name__ == "__main__":
    coco_to_mmsegmentation(r"steel_coco.json", 'steel_coco.txt', 'mask_ann')

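Note: the COCO-to-mmsegmentation conversion still relies on RLE encoding under the hood. If you already have RLE-encoded binary annotations for your images, you can convert them directly.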
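The three in-place operations at the end of the annotation loop implement a "first annotation wins" merge: a pixel that already has a class keeps it, and later annotations only fill pixels that are still 0. Here is a dependency-free sketch of the same rule, with nested lists in place of NumPy arrays and made-up masks:

```python
def merge_first_wins(output_mask, binary_mask, category_id):
    """Write category_id where binary_mask is 1 and output_mask is still 0 (in place)."""
    for y, row in enumerate(binary_mask):
        for x, bit in enumerate(row):
            if bit and output_mask[y][x] == 0:
                output_mask[y][x] = category_id
    return output_mask


output = [[0, 0, 0, 0] for _ in range(2)]
merge_first_wins(output, [[1, 1, 0, 0], [1, 1, 0, 0]], category_id=1)
merge_first_wins(output, [[0, 1, 1, 0], [0, 1, 1, 0]], category_id=2)  # overlaps column 1
print(output)  # [[1, 1, 2, 0], [1, 1, 2, 0]]
```

Without the "still 0" condition, the overlapping column would hold 1 + 2 = 3, which is a class id that neither annotation has.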

Converting RLE encoding to mmsegmentation-format annotations:

Here we do not use the third-party pycocotools library and instead work directly on the RLE format.

Code:

import os

import cv2
import numpy as np


def rle_decode(mask_rle: str = '', shape: tuple = (1400, 2100)):
    '''
    Decode rle encoded mask.

    :param mask_rle: run-length as string formatted (start length)
    :param shape: (height, width) of array to return
    Returns numpy array, 1 - mask, 0 - background
    '''
    s = mask_rle.split()  # split on whitespace into alternating start/length tokens
    # print("-----------------------------------------------------------")
    # print("s[0:][::2]=", s[0:][::2])  # 1-based start positions of the runs
    # # ['1', '13']
    # print("s[1:][::2]=", s[1:][::2])  # run lengths: number of consecutive mask pixels from each start
    # # ['2', '2']

    starts, lengths = [np.asarray(x, dtype=int) for x in (s[0:][::2], s[1:][::2])]
    # print("initial starts=", starts)  # 1-based start positions of the runs
    # print("lengths=", lengths)
    starts -= 1  # convert 1-based starts to 0-based indices
    ends = starts + lengths
    # print("ends=", ends)
    img = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    for lo, hi in zip(starts, ends):  # fill every run with 1
        img[lo:hi] = 1
    return img.reshape(shape, order='F')  # pixels are numbered column by column


if __name__ == '__main__':
    import csv
    from PIL import Image

    os.makedirs('mask_ann3', exist_ok=True)

    # Expected CSV layout: column 0 = image file name, column 1 = class id, last column = RLE string
    with open(r'train.csv', newline='') as fp:
        fcsv = csv.reader(fp)

        file_mm_dict = {}  # image stem -> merged label mask
        for step, each_line in enumerate(fcsv):
            if step == 0:  # skip the header row
                continue

            category_id = int(each_line[1])
            stem = each_line[0][:-4]  # file name without its 4-character extension
            # Open the PNG of the same name under mask_ann only to read the image size
            img_name = os.path.join('mask_ann', stem + '.png')
            image = Image.open(img_name)

            if stem not in file_mm_dict:
                file_mm_dict[stem] = np.zeros((image.height, image.width), dtype=np.uint8)

            # print(image.size)
            result = rle_decode(each_line[-1], (image.height, image.width))
            # 0/1 mask times id (ids start at 1). The factor 50 only makes the classes visible
            # when the PNG is viewed; masks used for training need pixel values equal to the
            # class index, so drop the factor for training data.
            result *= category_id * 50
            result *= file_mm_dict[stem] == 0  # keep only pixels that are still 0 in the merged mask
            file_mm_dict[stem] += result  # merge into the combined mask

        for key in file_mm_dict.keys():
            cv2.imwrite(r'mask_ann3/' + key + '.png', file_mm_dict[key])
            print(key + '.png' + ' converted')
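For checking a conversion, it is handy to have the inverse of rle_decode. The sketch below encodes a binary mask into the same "start length" string, with 1-based starts and pixels numbered column by column (matching order='F' in rle_decode). It is written in plain Python with hypothetical function names, and it includes a list-based decoder so the round trip can be verified without NumPy.

```python
def rle_encode(mask):
    """Encode a 0/1 mask (list of rows) as 'start length' pairs, column-major, 1-based."""
    height, width = len(mask), len(mask[0])
    flat = [mask[y][x] for x in range(width) for y in range(height)]  # column by column
    tokens, run_start = [], None
    for i, px in enumerate(flat + [0]):  # trailing 0 closes a run that reaches the end
        if px and run_start is None:
            run_start = i
        elif not px and run_start is not None:
            tokens += [str(run_start + 1), str(i - run_start)]
            run_start = None
    return " ".join(tokens)


def rle_decode_lists(mask_rle, shape):
    """List-based counterpart of rle_decode: returns a 0/1 mask as a list of rows."""
    height, width = shape
    flat = [0] * (height * width)
    nums = [int(t) for t in mask_rle.split()]
    for start, length in zip(nums[0::2], nums[1::2]):
        for i in range(start - 1, start - 1 + length):
            flat[i] = 1
    return [[flat[x * height + y] for x in range(width)] for y in range(height)]


mask = [
    [1, 0, 0],
    [1, 0, 1],
    [0, 0, 1],
]
encoded = rle_encode(mask)
print(encoded)  # 1 2 8 2
```

Read column by column, the mask flattens to 1 1 0 0 0 0 0 1 1, so the two runs start at positions 1 and 8 and each has length 2.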

Model training

Naming convention for config files:
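As I read the MMSegmentation 0.x documentation, a config name lists the method, the backbone, the crop size, the number of GPUs times the batch size per GPU, the schedule, and the dataset, separated by underscores (some names carry extra optional fields). The sketch below splits the config name used in the training command in this post under that assumption. The field labels are my own, and names with more or fewer fields would need extra handling.

```python
def parse_config_name(name):
    """Split a 6-field config file name into labeled parts (assumed field order)."""
    stem = name[:-3] if name.endswith(".py") else name
    fields = stem.split("_")
    if len(fields) != 6:
        raise ValueError(f"expected 6 underscore-separated fields, got {len(fields)}")
    labels = ("method", "backbone", "crop_size", "gpus_x_batch_per_gpu",
              "schedule", "dataset")
    return dict(zip(labels, fields))


info = parse_config_name("deeplabv3_r50-d8_512x512_4x4_80k_coco-stuff164k.py")
for key, value in info.items():
    print(f"{key}: {value}")
```

Read this way, the name describes DeepLabV3 with a ResNet-50 backbone at output stride 8 (r50-d8), 512x512 training crops, 4 GPUs with 4 images each, and 80k iterations on COCO-Stuff 164k.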

Single machine, single GPU:

The plain training command is enough:

 python .\tools\train.py configs/deeplabv3/deeplabv3_r50-d8_512x512_4x4_80k_coco-stuff164k.py --work-dir test_80k --gpus 1
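The command above trains on COCO-Stuff as configured. To train on the masks produced earlier, one option is a small config that inherits from it and swaps in the custom dataset. The fragment below is a sketch, not the config I used: the file name, data paths, and class names are hypothetical, and it assumes the MMSegmentation 0.x config layout, where CustomDataset accepts img_dir, ann_dir, img_suffix, seg_map_suffix, and classes.

```python
# configs/deeplabv3/deeplabv3_r50-d8_512x512_steel.py  (hypothetical file name)
_base_ = './deeplabv3_r50-d8_512x512_4x4_80k_coco-stuff164k.py'

# Hypothetical classes: index 0 is background, 1..4 are the class ids written into the masks
classes = ('background', 'class1', 'class2', 'class3', 'class4')

# Both heads must predict as many classes as the masks contain
model = dict(
    decode_head=dict(num_classes=len(classes)),
    auxiliary_head=dict(num_classes=len(classes)),
)

# Shared dataset settings; the paths are placeholders for your own layout
dataset = dict(
    type='CustomDataset',
    data_root='data/steel',
    img_suffix='.jpg',
    seg_map_suffix='.png',
    classes=classes,
)

data = dict(
    train=dict(img_dir='images/train', ann_dir='annotations/train', **dataset),
    val=dict(img_dir='images/val', ann_dir='annotations/val', **dataset),
    test=dict(img_dir='images/val', ann_dir='annotations/val', **dataset),
)
```

Such a file would be passed to tools/train.py in place of the COCO-Stuff config.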

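Single machine, multiple GPUs:

(1) On Windows, launch with -m torch.distributed.launch and set the --launcher and --gpu-id arguments.

(2) On Linux, train with dist_train.sh.

Multiple machines, multiple GPUs:

On Linux, train with slurm_train.sh.

Note: I ran into unbalanced load across GPUs during multi-GPU training, so I do not go into detail here.

See: Train a model (MMSegmentation 0.27.0 documentation)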
