A Clear Guide to Training nnUNetv1 on Your Own Dataset

This article only covers the steps to reproduce nnUNetv1 on 2D images; for implementation details, please read the original paper and the code!

paper:

https://www.nature.com/articles/s41592-020-01008-z

github:

https://github.com/MIC-DKFZ/nnUNet

Reproduction steps:

1. Clone the code and install the dependencies:

git clone https://github.com/MIC-DKFZ/nnUNet.git  # clone the code
cd nnUNet  # enter the repo directory
conda create -n myenv python=3.9  # note: nnUNetv2 requires python>=3.9
conda activate myenv
pip install nnunet
pip install -e .  # do not forget the trailing dot
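
After installing, a quick import check (optional, just a minimal sanity sketch) confirms the package is visible in the new environment; run it with python inside the activated environment:

# optional sanity check: the import should succeed
import nnunet
print(nnunet.__file__)  # should point into your environment or the cloned repo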

2. Create a folder named nnUNetFrame inside the nnUNet directory; its structure is shown below:

(figure: nnUNetFrame folder structure)

3. Create the data folders

Change into the nnUNetFrame folder and create a DATASET folder. Inside DATASET, create the folders nnUNet_preprocessed, nnUNet_raw, and nnUNet_trained_models; inside nnUNet_raw, create nnUNet_cropped_data and nnUNet_raw_data. The resulting structure is shown below (a small script that builds the whole tree follows the figure):
(figure: DATASET folder structure)
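
If you prefer not to create the folders by hand, a minimal Python sketch like the following builds the whole tree (the repo path is a placeholder you need to adjust):

# creates the DATASET folder tree described above; REPO is a placeholder path
import os

REPO = "/path/to/nnUNet"  # replace with your local clone
dataset = os.path.join(REPO, "nnUNetFrame", "DATASET")
for sub in ("nnUNet_preprocessed",
            os.path.join("nnUNet_raw", "nnUNet_cropped_data"),
            os.path.join("nnUNet_raw", "nnUNet_raw_data"),
            "nnUNet_trained_models"):
    os.makedirs(os.path.join(dataset, sub), exist_ok=True)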

4. Taking Linux as an example, open the .bashrc file and append the paths to nnUNet_raw, nnUNet_preprocessed, and nnUNet_trained_models at the end, in the following format:

Note: '../' must be replaced with your local path!
export nnUNet_raw_data_base="../nnUNet/nnUNetFrame/DATASET/nnUNet_raw"
export nnUNet_preprocessed="../nnUNet/nnUNetFrame/DATASET/nnUNet_preprocessed"
export RESULTS_FOLDER="../nnUNet/nnUNetFrame/DATASET/nnUNet_trained_models"

Then save and close .bashrc, and run the following in the directory containing .bashrc (normally your home directory):

source .bashrc
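
To confirm the variables are actually picked up (open a new terminal first), a quick check from Python is enough; this is a plain sanity check, nothing nnU-Net-specific:

# prints the three paths nnU-Net will use; None means the variable is not set
import os
for var in ("nnUNet_raw_data_base", "nnUNet_preprocessed", "RESULTS_FOLDER"):
    print(var, "=", os.environ.get(var))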

5. Convert the data to nii.gz format and generate the corresponding dataset.json file:

(1) Arrange the raw data in the following layout:
(figure: raw data folder layout)
training is the training set and testing is the test set; input holds the images and output holds the labels. An image and its label must share the same file name.

(2) Under the nnUNet_raw_data folder, create a new folder named Task01_XXX. The 01 can be replaced with any number, and XXX is a task name of your own choosing.

(3) The conversion code is as follows:

import numpy as np
from batchgenerators.utilities.file_and_folder_operations import *
from nnunet.dataset_conversion.utils import generate_dataset_json
from nnunet.paths import nnUNet_raw_data, preprocessing_output_dir
from nnunet.utilities.file_conversions import convert_2d_image_to_nifti

if __name__ == '__main__':
    # nnUNet_raw_data = '/path/to/DATASET/nnUNet_raw/nnUNet_raw_data'  # optional: set manually if the environment variables from step 4 are not picked up
    """
    nnU-Net was originally built for 3D images. It is also strongest when applied to 3D segmentation problems because a 
    large proportion of its design choices were built with 3D in mind. Also note that many 2D segmentation problems, 
    especially in the non-biomedical domain, may benefit from pretrained network architectures which nnU-Net does not
    support.
    Still, there is certainly a need for an out of the box segmentation solution for 2D segmentation problems. And 
    also on 2D segmentation tasks nnU-Net can perform extremely well! We have, for example, won a 2D task in the cell 
    tracking challenge with nnU-Net (see our Nature Methods paper) and we have also successfully applied nnU-Net to 
    histopathological segmentation problems. 
    Working with 2D data in nnU-Net requires a small workaround in the creation of the dataset. Essentially, all images 
    must be converted to pseudo 3D images (so an image with shape (X, Y) needs to be converted to an image with shape 
    (1, X, Y)). The resulting image must be saved in nifti format. Hereby it is important to set the spacing of the 
    first axis (the one with shape 1) to a value larger than the others. If you are working with niftis anyways, then 
    doing this should be easy for you. This example here is intended for demonstrating how nnU-Net can be used with 
    'regular' 2D images. We selected the massachusetts road segmentation dataset for this because it can be obtained 
    easily, it comes with a good amount of training cases but is still not too large to be difficult to handle.
    """

    # download dataset from https://www.kaggle.com/insaff/massachusetts-roads-dataset
    # extract the zip file, then set the following path according to your system:
    base = '/path/to/your/data'  # local data path: the parent directory that contains the training and testing folders
    # this folder should have the training and testing subfolders

    # now start the conversion to nnU-Net:
    task_name = 'Task01_XXX'  # task name, choose your own (must match the folder name from step (2))
    target_base = join(nnUNet_raw_data, task_name)
    target_imagesTr = join(target_base, "imagesTr")
    target_imagesTs = join(target_base, "imagesTs")
    target_labelsTs = join(target_base, "labelsTs")
    target_labelsTr = join(target_base, "labelsTr")

    maybe_mkdir_p(target_imagesTr)
    maybe_mkdir_p(target_labelsTs)
    maybe_mkdir_p(target_imagesTs)
    maybe_mkdir_p(target_labelsTr)

    # convert the training examples. Not all training images have labels, so we just take the cases for which there are
    # labels
    labels_dir_tr = join(base, 'training', 'output')
    images_dir_tr = join(base, 'training', 'input')
    training_cases = subfiles(labels_dir_tr, suffix='.png', join=False)
    for t in training_cases:
        unique_name = t[:-4]  # just the filename with the extension cropped away, so img-2.png becomes img-2 as unique_name
        input_segmentation_file = join(labels_dir_tr, t)
        input_image_file = join(images_dir_tr, t)

        output_image_file = join(target_imagesTr, unique_name)  # do not specify a file ending! This will be done for you
        output_seg_file = join(target_labelsTr, unique_name)  # do not specify a file ending! This will be done for you

        # this utility will convert 2d images that can be read by skimage.io.imread to nifti. You don't need to do anything.
        # if this throws an error for your images, please just look at the code for this function and adapt it to your needs
        convert_2d_image_to_nifti(input_image_file, output_image_file, is_seg=False)

        # the labels are stored as 0: background, 255: road. We need to convert the 255 to 1 because nnU-Net expects
        # the labels to be consecutive integers. This can be achieved with setting a transform
        convert_2d_image_to_nifti(input_segmentation_file, output_seg_file, is_seg=True,
                                  transform=lambda x: (x == 255).astype(int))

    # now do the same for the test set
    labels_dir_ts = join(base, 'testing', 'output')
    images_dir_ts = join(base, 'testing', 'input')
    testing_cases = subfiles(labels_dir_ts, suffix='.png', join=False)
    for ts in testing_cases:
        unique_name = ts[:-4]
        input_segmentation_file = join(labels_dir_ts, ts)
        input_image_file = join(images_dir_ts, ts)

        output_image_file = join(target_imagesTs, unique_name)
        output_seg_file = join(target_labelsTs, unique_name)

        convert_2d_image_to_nifti(input_image_file, output_image_file, is_seg=False)
        convert_2d_image_to_nifti(input_segmentation_file, output_seg_file, is_seg=True,
                                  transform=lambda x: (x == 255).astype(int))

    # finally we can call the utility for generating a dataset.json
    # note: ('Red', 'Green', 'Blue') are the modality names for RGB input; adjust the label dict to your own classes.
    # With the 255 -> 1 transform above only labels 0 and 1 are actually produced, so drop 'class2' for a binary task.
    generate_dataset_json(join(target_base, 'dataset.json'), target_imagesTr, target_imagesTs, ('Red', 'Green', 'Blue'),
                          labels={0: 'background', 1: 'class1', 2: 'class2'}, dataset_name=task_name, license='hands off!')

    """
    once this is completed, you can use the dataset like any other nnU-Net dataset. Note that since this is a 2D
    dataset there is no need to run preprocessing for 3D U-Nets. You should therefore run the 
    `nnUNet_plan_and_preprocess` command like this:
    
    > nnUNet_plan_and_preprocess -t 120 -pl3d None
    (here 120 is the task id of the original road-segmentation example; use your own task id instead, e.g. 1 for Task001_XXX)
    
    once that is completed, you can run the trainings as follows:
    > nnUNet_train 2d nnUNetTrainerV2 120 FOLD
    
    (where FOLD is 0, 1, 2, 3 or 4 - 5-fold cross validation)
    
    there is no need to run nnUNet_find_best_configuration because there is only one model to choose from.
    Note that without running nnUNet_find_best_configuration, nnU-Net will not have determined a postprocessing
    for the whole cross-validation. Spoiler: it will determine not to run postprocessing anyways. If you are using
    a different 2D dataset, you can make nnU-Net determine the postprocessing by using the
    `nnUNet_determine_postprocessing` command
    """

6. Run the dataset conversion:

nnUNet_convert_decathlon_task -i /absolute/path/to/Task01_XXX  # absolute path to the Task01_XXX folder created in step 5

After it finishes, a folder named Task001_XXX is generated alongside Task01_XXX, as shown below:
(figure: generated Task001_XXX folder)
Note: if you are unsure about the raw file format, convert a case back to png after the conversion and check that it looks correct; this prevents malformed files from causing errors in the later steps. A quick check is sketched below.
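
For example, a minimal check of one converted case could look like this (nibabel is also used in step 10; the file names below are placeholders and depend on your own data):

# load one converted image/label pair and inspect shape, spacing, and label values
import numpy as np
import nibabel as nib

img = nib.load("/path/to/Task001_XXX/imagesTr/img-2_0000.nii.gz")  # placeholder path
seg = nib.load("/path/to/Task001_XXX/labelsTr/img-2.nii.gz")       # placeholder path
print("image shape:", img.shape, "spacing:", img.header.get_zooms())
print("label values:", np.unique(np.asanyarray(seg.dataobj)))      # should be consecutive integers starting at 0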

7. Data preprocessing

nnUNet_plan_and_preprocess -t 1 --verify_dataset_integrity
# 1 is the task id, i.e. Task001

After this command finishes, a folder named Task001_XXX is generated under nnUNet_cropped_data; its directory structure looks like this:
(figure: nnUNet_cropped_data folder structure)

8. Training commands, run them in order:

CUDA_VISIBLE_DEVICES=1 nnUNet_train 2d nnUNetTrainerV2 Task001_XXX 0  --npz
CUDA_VISIBLE_DEVICES=1 nnUNet_train 2d nnUNetTrainerV2 Task001_XXX 1  --npz
CUDA_VISIBLE_DEVICES=1 nnUNet_train 2d nnUNetTrainerV2 Task001_XXX 2  --npz
CUDA_VISIBLE_DEVICES=1 nnUNet_train 2d nnUNetTrainerV2 Task001_XXX 3  --npz
CUDA_VISIBLE_DEVICES=1 nnUNet_train 2d nnUNetTrainerV2 Task001_XXX 4  --npz

'CUDA_VISIBLE_DEVICES=1' pins training to a specific GPU (here GPU index 1)
'2d' selects the 2D U-Net configuration
'Task001_XXX' is the task identifier (the folder generated in step 6)
'0, 1, 2, 3, 4' are the folds of the 5-fold cross-validation
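
If you would rather launch all five folds from a single script instead of typing each command, a minimal sketch (assuming nnUNet_train is on the PATH of the activated environment and Task001_XXX is your task) could look like this:

# runs the five cross-validation folds one after another on GPU 1
import os
import subprocess

env = dict(os.environ, CUDA_VISIBLE_DEVICES="1")  # pin the GPU, same as the commands above
for fold in range(5):
    subprocess.run(
        ["nnUNet_train", "2d", "nnUNetTrainerV2", "Task001_XXX", str(fold), "--npz"],
        check=True, env=env,
    )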

9. Test the model:

After the five folds of cross-validation have finished, the best configuration can be determined with the following command:

nnUNet_find_best_configuration -m 2d -t 001 --strict
# 001 is the task id

The following files are then generated under nnUNet_trained_models/nnUNet/ensembles/Task001_XXX:
(figure: files generated under ensembles/Task001_XXX)
The txt file among them contains the prediction command:

nnUNet_predict -i FOLDER_WITH_TEST_CASES -o OUTPUT_FOLDER_MODEL1 -tr nnUNetTrainerV2 -ctr nnUNetTrainerV2CascadeFullRes -m 2d -p nnUNetPlansv2.1 -t Task001_XXX
# FOLDER_WITH_TEST_CASES: input folder with the test images (the imagesTs folder created in step 5 can typically be used
#                         directly, since its files already carry the _0000/_0001/... modality suffixes that nnUNet_predict expects)
# OUTPUT_FOLDER_MODEL1: output folder for the predictions
# Task001_XXX: the task to predict
Replace the parameters above with your own paths and task name, then run the command.

10. The prediction results are saved to the output folder as nii.gz files. To convert them to png, use the code below:

import os
import nibabel as nib
import numpy as np
from PIL import Image

def convert_nii_to_png(input_folder, output_folder):
    # make sure the output folder exists
    os.makedirs(output_folder, exist_ok=True)

    # iterate over all files in the input folder
    for filename in os.listdir(input_folder):
        if filename.endswith('.nii.gz'):
            # build the full file path and read the NIfTI file
            file_path = os.path.join(input_folder, filename)
            nii_image = nib.load(file_path)
            image_data = nii_image.get_fdata()

            # take the middle slice along the last axis (for nnU-Net 2D pseudo-3D
            # outputs this axis usually has length 1, so this is the only slice)
            slice_idx = image_data.shape[2] // 2
            slice_data = image_data[:, :, slice_idx]

            # rescale to an 8-bit image; guard against constant slices (e.g. all-background predictions)
            value_range = np.max(slice_data) - np.min(slice_data)
            if value_range > 0:
                slice_normalized = (slice_data - np.min(slice_data)) / value_range
            else:
                slice_normalized = np.zeros_like(slice_data)
            image_8bit = (slice_normalized * 255).astype(np.uint8)
            image = Image.fromarray(image_8bit)

            # save the image
            output_filename = filename.replace('.nii.gz', '.png')
            image.save(os.path.join(output_folder, output_filename))
            print(f"Converted {filename} to {output_filename}")

input_folder = ''   # path to the folder with the .nii.gz predictions
output_folder = ''  # path where the .png files will be written
convert_nii_to_png(input_folder, output_folder)

11. Done!

