nnUNetv2 series: converting a custom 2D instance segmentation dataset

This mainly follows the official source file nnUNet/nnunetv2/dataset_conversion/Dataset120_RoadSegmentation.py, with a few unnecessary operations commented out. Dataset download link: massachusetts-roads-dataset

Important note:
nnU-Net only works with file formats that use lossless (or no) compression! Because the file format is defined for the whole dataset (rather than separately for images and segmentations, which may become possible in the future), we have to make sure there are no compression artifacts that corrupt the segmentation maps. So no .jpg and the like!
Supported 2D file types include .png, .bmp and .tif.
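
To see why lossy formats are ruled out, the sketch below (hypothetical file names seg.png and seg.jpg) writes a toy mask containing the gray values used by this dataset in both formats and compares the decoded values: PNG reproduces them exactly, while JPEG introduces new values around label boundaries.

import numpy as np
from skimage import io

# toy mask with the raw gray values used by this dataset: 0, 128, 255
seg = np.zeros((64, 64), dtype=np.uint8)
seg[16:48, 16:48] = 128
seg[24:40, 24:40] = 255

io.imsave("seg.png", seg, check_contrast=False)              # lossless
io.imsave("seg.jpg", seg, check_contrast=False, quality=90)  # lossy

print(np.unique(io.imread("seg.png")))  # [  0 128 255] -> values preserved exactly
print(np.unique(io.imread("seg.jpg")))  # many extra gray values near edges -> unusable as labels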

Original dataset directory structure

This shows the directory structure of the massachusetts-roads-dataset.
The testing directory is not used during training; by default, the training and validation sets are split from the training directory.

./datasets/road_segmentation_ideal/
├── testing/
│   ├── input/
│   │   ├── img-10.png
│   │   ├── img-11.png
│   │   └── ...
│   └── output/
│       ├── img-10.png
│       ├── img-11.png
│       └── ...
└── training/
    ├── input/
    │   ├── img-1000.png
    │   ├── img-1001.png
    │   └── ...
    └── output/
        ├── img-1000.png
        ├── img-1002.png
        └── ...
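
As the conversion script below notes, not every training image necessarily has a mask, which is why it iterates over the files in output/. A quick sanity check like the following sketch (assuming the layout above) lists any training images without a corresponding mask before converting:

import os

train_input = "./datasets/road_segmentation_ideal/training/input"
train_output = "./datasets/road_segmentation_ideal/training/output"

images = {f for f in os.listdir(train_input) if f.endswith(".png")}
masks = {f for f in os.listdir(train_output) if f.endswith(".png")}
print("training images without a mask:", sorted(images - masks))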

Converted dataset directory structure

nnUNet_raw/Dataset120_RoadSegmentation
├── dataset.json
├── imagesTr
│   ├── img-2_0000.png
│   ├── img-7_0000.png
│   └── ...
├── imagesTs  # optional
│   ├── img-1_0000.png
│   ├── img-2_0000.png
│   └── ...
├── labelsTr
│   ├── img-2.png
│   ├── img-7.png
│   └── ...
└── labelsTs
    ├── img-1.png
    ├── img-2.png
    └── ...
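
Note the naming convention: files in imagesTr and imagesTs carry a four-digit suffix (_0000 marks the first, and here only, input image file per case), while label files in labelsTr and labelsTs use the bare case identifier. A minimal sketch of how these names are built (case_id is a hypothetical example):

case_id = "img-2"                             # hypothetical case identifier
channel = 0                                   # this dataset has one image file per case
image_name = f"{case_id}_{channel:04d}.png"   # -> "img-2_0000.png" in imagesTr
label_name = f"{case_id}.png"                 # -> "img-2.png" in labelsTr
print(image_name, label_name)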

Conversion code example

Shown here is the conversion code for a three-class (background included) instance segmentation dataset.

import multiprocessing
import shutil

from batchgenerators.utilities.file_and_folder_operations import (
    join,
    maybe_mkdir_p,
    subfiles,
)

from nnunetv2.dataset_conversion.generate_dataset_json import (
    generate_dataset_json,
)
from nnunetv2.paths import nnUNet_raw
from skimage import io
# from acvl_utils.morphology.morphology_helper import generic_filter_components
# from scipy.ndimage import binary_fill_holes


def load_and_covnert_case(
    input_image: str,
    input_seg: str,
    output_image: str,
    output_seg: str,
    min_component_size: int = 50,
):
    # remap the mask's gray values to consecutive integer labels expected by nnU-Net:
    # 128 -> 1 (iris) and 255 -> 2 (sclera); background stays 0
    seg = io.imread(input_seg)
    seg[seg == 128] = 1
    seg[seg == 255] = 2
    # image = io.imread(input_image)
    # image = image.sum(2)
    # mask = image == (3 * 255)
    # # the dataset has large white areas in which road segmentations can exist but no image information is available.
    # # Remove the road label in these areas
    # mask = generic_filter_components(
    #     mask,
    #     filter_fn=lambda ids, sizes: [
    #         i for j, i in enumerate(ids) if sizes[j] > min_component_size
    #     ],
    # )
    # mask = binary_fill_holes(mask)
    # seg[mask] = 0
    # write the remapped label map losslessly and copy the raw image unchanged
    io.imsave(output_seg, seg, check_contrast=False)
    shutil.copy(input_image, output_image)


if __name__ == "__main__":
    # path to the extracted custom dataset (the original script used the archive from
    # https://www.kaggle.com/datasets/insaff/massachusetts-roads-dataset?resource=download)
    source = "/home/bio/family/segmenation/nnUNet/datasets/eye_sclera_iris_segmentation"

    dataset_name = "Dataset500_ScleraIrisSegmentation"

    imagestr = join(nnUNet_raw, dataset_name, "imagesTr")
    imagests = join(nnUNet_raw, dataset_name, "imagesTs")
    labelstr = join(nnUNet_raw, dataset_name, "labelsTr")
    labelsts = join(nnUNet_raw, dataset_name, "labelsTs")
    maybe_mkdir_p(imagestr)
    maybe_mkdir_p(imagests)
    maybe_mkdir_p(labelstr)
    maybe_mkdir_p(labelsts)

    train_source = join(source, "training")
    test_source = join(source, "testing")

    with multiprocessing.get_context("spawn").Pool(8) as p:
        # not all training images have a segmentation
        valid_ids = subfiles(
            join(train_source, "output"), join=False, suffix="png"
        )
        num_train = len(valid_ids)
        r = []
        for v in valid_ids:
            r.append(
                p.starmap_async(
                    load_and_covnert_case,
                    (
                        (
                            join(train_source, "input", v),
                            join(train_source, "output", v),
                            join(imagestr, v[:-4] + "_0000.png"),
                            join(labelstr, v),
                            50,
                        ),
                    ),
                )
            )

        # test set
        valid_ids = subfiles(
            join(test_source, "output"), join=False, suffix="png"
        )
        for v in valid_ids:
            r.append(
                p.starmap_async(
                    load_and_covnert_case,
                    (
                        (
                            join(test_source, "input", v),
                            join(test_source, "output", v),
                            join(imagests, v[:-4] + "_0000.png"),
                            join(labelsts, v),
                            50,
                        ),
                    ),
                )
            )
        _ = [i.get() for i in r]

    # write dataset.json: channel names, label-name -> integer mapping,
    # number of training cases and the file ending
    generate_dataset_json(
        join(nnUNet_raw, dataset_name),
        {0: "R", 1: "G", 2: "B"},
        {"background": 0, "iris": 1, "sclera": 2},
        num_train,
        ".png",
        dataset_name=dataset_name,
    )
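
After running the script, a quick sanity check helps catch conversion mistakes before training. The sketch below assumes the nnUNet_raw environment variable is set and uses the dataset name from the script above; it verifies that image and label counts match and that the labels contain only the values 0, 1 and 2. Preprocessing then typically follows with nnUNetv2_plan_and_preprocess -d 500 --verify_dataset_integrity.

import numpy as np
from batchgenerators.utilities.file_and_folder_operations import join, subfiles
from nnunetv2.paths import nnUNet_raw
from skimage import io

dataset_dir = join(nnUNet_raw, "Dataset500_ScleraIrisSegmentation")
images = subfiles(join(dataset_dir, "imagesTr"), suffix=".png", join=False)
labels = subfiles(join(dataset_dir, "labelsTr"), suffix=".png", join=False)
assert len(images) == len(labels), "every training image needs exactly one label file"

# spot-check one label map: it must contain only values declared in dataset.json
values = np.unique(io.imread(join(dataset_dir, "labelsTr", labels[0])))
assert set(values).issubset({0, 1, 2}), f"unexpected label values: {values}"
print(f"{len(images)} training cases, label values of the first case: {values}")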
