nnUNet: Ubuntu environment setup and training with 2D images

Shuai@

Last modified: 2022-07-29 12:16:08

First published: 2022-07-26 14:59:59

Article link: https://blog.csdn.net/weixin_37707670/article/details/125990327

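nnUNet reference links
 Step-by-step tutorial: training and testing nnUNet on 2D images
 No code needed: train your own medical image segmentation model with nnUNet in four commands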

Installing and configuring the nnUNet environment

Create a Python virtual environment

First create a Python 3.7 environment named nnunet:

conda create -n nnunet python=3.7

Then install PyTorch; the latest version is recommended.
PyTorch website: https://pytorch.org/

Install PyTorch

conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch

Once the command finishes, the installation is complete.
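A quick way to confirm the install is to ask PyTorch for its version and whether it can see a GPU. The following is a minimal sketch that is not part of the original setup; the helper name describe_torch_install is made up for illustration, and it relies only on torch.__version__, torch.cuda.is_available() and torch.cuda.get_device_name().

```python
import importlib.util


def describe_torch_install():
    """Return a one-line summary of the PyTorch install in the current environment."""
    # find_spec lets us report a missing package instead of crashing on import
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"
    import torch

    if torch.cuda.is_available():
        return f"torch {torch.__version__}, CUDA device: {torch.cuda.get_device_name(0)}"
    return f"torch {torch.__version__}, CPU only (CUDA not available)"


print(describe_torch_install())
```

Run it inside the activated nnunet environment. If it reports CPU only, the cudatoolkit version probably does not match the installed GPU driver.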

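Install nnUNet

There are two ways to install it: clone the repository with git and install it locally, or install it with pip. The commands used in the rest of this post (nnUNet_plan_and_preprocess, nnUNet_train, nnUNet_predict) belong to nnU-Net v1, which is what the nnunet pip package provides.
Option 1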

git clone https://github.com/MIC-DKFZ/nnUNet.git
cd nnUNet
pip install -e .

Option 2

pip install nnunet

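nnUNet is now installed.

Setting up paths for nnUNet

nnU-Net relies on environment variables to know where raw data, preprocessed data and trained model weights are stored. To use the full functionality of nnU-Net, the following three environment variables must be set:

nnUNet_raw_data_base: This is where nnU-Net finds the raw data and stores the cropped data. The folder located at nnUNet_raw_data_base must have at least the subfolder nnUNet_raw_data, which in turn contains one subfolder for each task. It is the responsibility of the user to bring the raw data into the appropriate format - nnU-Net will take care of the rest ;-) For more information on the required raw data format, see the dataset conversion documentation in the nnU-Net repository.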

Example tree structure:

nnUNet_raw_data_base/nnUNet_raw_data/Task002_Heart
├── dataset.json
├── imagesTr
│   ├── la_003_0000.nii.gz
│   ├── la_004_0000.nii.gz
│   ├── ...
├── imagesTs
│   ├── la_001_0000.nii.gz
│   ├── la_002_0000.nii.gz
│   ├── ...
└── labelsTr
    ├── la_003.nii.gz
    ├── la_004.nii.gz
    ├── ...
nnUNet_raw_data_base/nnUNet_raw_data/Task005_Prostate/
├── dataset.json
├── imagesTr
│   ├── prostate_00_0000.nii.gz
│   ├── prostate_00_0001.nii.gz
│   ├── ...
├── imagesTs
│   ├── prostate_03_0000.nii.gz
│   ├── prostate_03_0001.nii.gz
│   ├── ...
└── labelsTr
    ├── prostate_00.nii.gz
    ├── prostate_01.nii.gz
    ├── ...

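nnUNet_preprocessed: This is the folder where the preprocessed data is saved. The data is also read from this folder during training, so it is important that it sits on a drive with low access latency and high throughput (a regular SATA or NVMe SSD is sufficient).
RESULTS_FOLDER: This specifies where nnU-Net saves the model weights. If pretrained models are downloaded, this is where they are saved.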
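The naming rule visible in the tree above (each label la_003.nii.gz needs one image la_003_0000.nii.gz per modality) is easy to get wrong, so here is a small sketch for checking a task folder by hand. It is not nnU-Net code: check_task_folder and its num_modalities parameter are hypothetical names, and nnUNet_plan_and_preprocess with --verify_dataset_integrity does a far more thorough check.

```python
import os


def check_task_folder(task_dir, num_modalities=1):
    """Return a list of human-readable problems found in one TaskXXX folder."""
    problems = []
    if not os.path.isfile(os.path.join(task_dir, "dataset.json")):
        problems.append("dataset.json is missing")

    images_dir = os.path.join(task_dir, "imagesTr")
    labels_dir = os.path.join(task_dir, "labelsTr")
    missing_dirs = [d for d in (images_dir, labels_dir) if not os.path.isdir(d)]
    for d in missing_dirs:
        problems.append(f"folder {os.path.basename(d)} is missing")
    if missing_dirs:
        return problems

    images = set(os.listdir(images_dir))
    for label_file in sorted(os.listdir(labels_dir)):
        if not label_file.endswith(".nii.gz"):
            continue
        case = label_file[: -len(".nii.gz")]
        # every case needs one image file per modality: case_0000, case_0001, ...
        for modality in range(num_modalities):
            expected = f"{case}_{modality:04d}.nii.gz"
            if expected not in images:
                problems.append(f"imagesTr/{expected} is missing for label {label_file}")
    return problems
```

For the Task005_Prostate example, which has two modalities per case, the call would be check_task_folder(path, num_modalities=2); an empty list means the naming is consistent.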

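How to set environment variables

(nnU-Net was developed for Ubuntu/Linux. The following guide is intended for that operating system and will not work on others. We do not provide support for other operating systems!)

There are several ways to do this. The most common one is to set the paths in your .bashrc file, which is located in your home directory. For me, this file is located at /home/fabian/.bashrc. You can open it with any text editor of your choice. If you do not see the file, that may be because it is hidden by default. You can run ls -al /home/fabian to make sure you see it. In rare cases it may not exist, and you can simply create it with touch /home/fabian/.bashrc.

Once the file is open in a text editor, add the following lines to the bottom: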

export nnUNet_raw_data_base="/media/fabian/nnUNet_raw_data_base"
export nnUNet_preprocessed="/media/fabian/nnUNet_preprocessed"
export RESULTS_FOLDER="/media/fabian/nnUNet_trained_models"

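(Adapt the paths to your system, of course, and remember that nnUNet_preprocessed should be located on an SSD!)

Then save and exit. To be safe, reload .bashrc by running source /home/fabian/.bashrc. Reloading is only needed in terminal sessions that were already open before you saved the changes; any new terminal you open will have the paths set. You can verify that the paths are set correctly by typing echo $RESULTS_FOLDER etc.; it should print the correct folder.

Another way to set these paths

The method above sets the paths permanently on your system (until you delete the lines from .bashrc). If you only want to set them temporarily, you can run the export commands in your terminal: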

export nnUNet_raw_data_base="/media/fabian/nnUNet_raw_data_base"
export nnUNet_preprocessed="/media/fabian/nnUNet_preprocessed"
export RESULTS_FOLDER="/media/fabian/nnUNet_trained_models"

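This sets the paths for the current terminal session only (the variables are lost when you close the terminal and have to be set again every time).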
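Besides echo, the three variables can be checked in one go from Python. The sketch below only reads the environment and tests whether each folder exists; check_nnunet_paths is a hypothetical helper name, not something nnU-Net provides.

```python
import os

REQUIRED_VARS = ("nnUNet_raw_data_base", "nnUNet_preprocessed", "RESULTS_FOLDER")


def check_nnunet_paths(env=None):
    """Map each required variable name to a short status string."""
    env = os.environ if env is None else env
    report = {}
    for name in REQUIRED_VARS:
        value = env.get(name)
        if not value:
            report[name] = "not set"
        elif not os.path.isdir(value):
            report[name] = f"set to {value}, but that folder does not exist"
        else:
            report[name] = f"ok ({value})"
    return report


if __name__ == "__main__":
    for name, status in check_nnunet_paths().items():
        print(f"{name}: {status}")
```

Run it in the same terminal session in which you plan to start nnU-Net, because variables set with export are visible only there.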

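Data preparation

Our 2D data has to be converted into the 3D format nnUNet expects. The script is adapted from the Task120 example shipped with nnUNet, which converts 2D images to 3D (link: 2D-to-3D conversion example).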
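Before the full script, the "pseudo 3D" idea can be illustrated with plain NumPy. This is a sketch of the concept only, not the code of convert_2d_image_to_nifti; the function name split_into_pseudo_3d is hypothetical. It assumes a colour image is stored as (X, Y, channels) and turns each channel into its own (1, X, Y) volume, which is why the script later declares three modalities ('Red', 'Green', 'Blue').

```python
import numpy as np


def split_into_pseudo_3d(image):
    """Turn an (X, Y) or (X, Y, C) image into a list of C volumes of shape (1, X, Y)."""
    image = np.asarray(image)
    if image.ndim == 2:
        # greyscale: treat it as a single channel
        image = image[:, :, None]
    if image.ndim != 3:
        raise ValueError(f"expected a 2D image with optional channels, got shape {image.shape}")
    # one volume per channel, each with a dummy first axis of length 1
    return [image[None, :, :, c] for c in range(image.shape[2])]


rgb = np.zeros((512, 384, 3), dtype=np.uint8)
volumes = split_into_pseudo_3d(rgb)
print(len(volumes), volumes[0].shape)
```

When such a volume is written to NIfTI, the spacing of the dummy axis should be larger than that of the two in-plane axes. The conversion script that follows delegates all of this to nnU-Net's own helper.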

import numpy as np
from batchgenerators.utilities.file_and_folder_operations import *
from nnunet.dataset_conversion.utils import generate_dataset_json
from nnunet.paths import nnUNet_raw_data, preprocessing_output_dir
from nnunet.utilities.file_conversions import convert_2d_image_to_nifti


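# Optional label mapping from grey values to class indices. The 0 -> 3 step has to run first:
# if it ran last, the pixels that were just set to 0 (from 255) would be turned into 3 as well.
# def transform(x):
#     x[x==0]   = 3
#     x[x==255] = 0
#     x[x==160] = 1
#     x[x==80]  = 2
#     return x.astype(int)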

if __name__ == '__main__':
    """
    nnU-Net was originally built for 3D images. It is also strongest when applied to 3D segmentation problems because a 
    large proportion of its design choices were built with 3D in mind. Also note that many 2D segmentation problems, 
    especially in the non-biomedical domain, may benefit from pretrained network architectures which nnU-Net does not
    support.
    Still, there is certainly a need for an out of the box segmentation solution for 2D segmentation problems. And 
    also on 2D segmentation tasks nnU-Net can perform extremely well! We have, for example, won a 2D task in the cell
    tracking challenge with nnU-Net (see our Nature Methods paper) and we have also successfully applied nnU-Net to
    histopathological segmentation problems.
    Working with 2D data in nnU-Net requires a small workaround in the creation of the dataset. Essentially, all images
    must be converted to pseudo 3D images (so an image with shape (X, Y) needs to be converted to an image with shape
    (1, X, Y)). The resulting image must be saved in nifti format. Hereby it is important to set the spacing of the
    first axis (the one with shape 1) to a value larger than the others. If you are working with niftis anyways, then 
    doing this should be easy for you. This example here is intended for demonstrating how nnU-Net can be used with 
    'regular' 2D images. We selected the massachusetts road segmentation dataset for this because it can be obtained 
    easily, it comes with a good amount of training cases but is still not too large to be difficult to handle.
    """

    # the Task120 example this script is adapted from used the Massachusetts roads dataset
    # (https://www.kaggle.com/insaff/massachusetts-roads-dataset); here base points at my own data instead.
    # set the following path according to your system:
    base = '/home/lus/myProject/codes/A_project/mmsegmentation/data/GOSAL_2022'
    # this folder should have the subfolders images/{training,validation} and annotations/{training,validation}

    # now start the conversion to nnU-Net:
    task_name = 'Task001_OCTSeg'
    target_base = join(nnUNet_raw_data, task_name)
    target_imagesTr = join(target_base, "imagesTr")
    target_imagesTs = join(target_base, "imagesTs")
    target_labelsTs = join(target_base, "labelsTs")
    target_labelsTr = join(target_base, "labelsTr")

    maybe_mkdir_p(target_imagesTr)
    maybe_mkdir_p(target_labelsTs)
    maybe_mkdir_p(target_imagesTs)
    maybe_mkdir_p(target_labelsTr)

    # convert the training examples. Not all training images have labels, so we just take the cases for which there are
    # labels
    labels_dir_tr = join(base, 'annotations','training')
    images_dir_tr = join(base, 'images','training')
    training_cases = subfiles(labels_dir_tr, suffix='.png', join=False)
    for t in training_cases:
        unique_name = t[:-4]  # just the filename with the extension cropped away, so img-2.png becomes img-2 as unique_name
        input_segmentation_file = join(labels_dir_tr, t)
        input_image_file = join(images_dir_tr, t)

        output_image_file = join(target_imagesTr, unique_name)  # do not specify a file ending! This will be done for you
        output_seg_file = join(target_labelsTr, unique_name)  # do not specify a file ending! This will be done for you

        # this utility will convert 2d images that can be read by skimage.io.imread to nifti. You don't need to do anything.
        # if this throws an error for your images, please just look at the code for this function and adapt it to your needs
        convert_2d_image_to_nifti(input_image_file, output_image_file, is_seg=False)

        # in the original road example the labels were stored as 0: background, 255: road, and the transform mapped
        # 255 to 1 because nnU-Net expects the labels to be consecutive integers. Here the masks are assumed to
        # already contain 0, 1, 2, 3, so the transform only casts the dtype
        convert_2d_image_to_nifti(input_segmentation_file, output_seg_file, is_seg=True,
                                  transform=lambda x: (x).astype(int))

    # now do the same for the test set
    labels_dir_ts = join(base, 'annotations', 'validation')
    images_dir_ts = join(base, 'images', 'validation')
    testing_cases = subfiles(labels_dir_ts, suffix='.png', join=False)
    for ts in testing_cases:
        unique_name = ts[:-4]
        input_segmentation_file = join(labels_dir_ts, ts)
        input_image_file = join(images_dir_ts, ts)

        output_image_file = join(target_imagesTs, unique_name)
        output_seg_file = join(target_labelsTs, unique_name)

        convert_2d_image_to_nifti(input_image_file, output_image_file, is_seg=False)
        convert_2d_image_to_nifti(input_segmentation_file, output_seg_file, is_seg=True,
                                  transform=lambda x: (x).astype(int))

    # finally we can call the utility for generating a dataset.json
    generate_dataset_json(join(target_base, 'dataset.json'), target_imagesTr, target_imagesTs, ('Red', 'Green', 'Blue'),
                          labels={0: 'background', 1: 'choroid',2:"GCIPL",3:"RNFL"}, dataset_name=task_name, license='hands off!')

    """
    once this is completed, you can use the dataset like any other nnU-Net dataset. Note that since this is a 2D
    dataset there is no need to run preprocessing for 3D U-Nets. You should therefore run the
    `nnUNet_plan_and_preprocess` command like this (1 is the ID of Task001_OCTSeg; the original example used 120):

    > nnUNet_plan_and_preprocess -t 1 -pl3d None

    once that is completed, you can run the trainings as follows:
    > nnUNet_train 2d nnUNetTrainerV2 1 FOLD

    (where fold is again 0, 1, 2, 3 and 4 - 5-fold cross validation)

    there is no need to run nnUNet_find_best_configuration because there is only one model to choose from.
    Note that without running nnUNet_find_best_configuration, nnU-Net will not have determined a postprocessing
    for the whole cross-validation. You can make nnU-Net determine the postprocessing by using the
    `nnUNet_determine_postprocessing` command
    """

The following parts of the script above need to be adapted:

Location of the raw data and the task name

    # set the following path according to your system:
    base = '/home/lus/myProject/codes/A_project/mmsegmentation/data/GOSAL_2022'
    # this folder should have the subfolders images/{training,validation} and annotations/{training,validation}

    # now start the conversion to nnU-Net:
    task_name = 'Task001_OCTSeg'

Subpaths of the training set and of the test/validation set

    # convert the training examples. Not all training images have labels, so we just take the cases for which there are
    # labels
    labels_dir_tr = join(base, 'annotations','training')
    images_dir_tr = join(base, 'images','training')
    training_cases = subfiles(labels_dir_tr, suffix='.png', join=False)

    # now do the same for the test set
    labels_dir_ts = join(base, 'annotations', 'validation')
    images_dir_ts = join(base, 'images', 'validation')
    testing_cases = subfiles(labels_dir_ts, suffix='.png', join=False)

Class information

generate_dataset_json(join(target_base, 'dataset.json'), target_imagesTr, target_imagesTs, ('Red', 'Green', 'Blue'),
                          labels={0: 'background', 1: 'choroid',2:"GCIPL",3:"RNFL"}, dataset_name=task_name, license='hands off!')
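To make clear what the modalities tuple and the labels dict end up describing, here is a rough sketch of the core of a dataset.json for this task. It is written from my understanding of the nnU-Net v1 format and is not the output of generate_dataset_json: the real file contains additional fields (description, licence and so on), the function name sketch_dataset_json is made up, and the case names img_0001 etc. are placeholders.

```python
import json


def sketch_dataset_json(task_name, train_cases, test_cases, modalities, labels):
    """Build a dict that mirrors the main fields of an nnU-Net v1 dataset.json."""
    return {
        "name": task_name,
        # keys are strings: "0", "1", ... in channel order
        "modality": {str(i): m for i, m in enumerate(modalities)},
        "labels": {str(k): v for k, v in labels.items()},
        "numTraining": len(train_cases),
        "numTest": len(test_cases),
        # image paths carry no _0000 modality suffix here
        "training": [
            {"image": f"./imagesTr/{c}.nii.gz", "label": f"./labelsTr/{c}.nii.gz"}
            for c in train_cases
        ],
        "test": [f"./imagesTs/{c}.nii.gz" for c in test_cases],
    }


example = sketch_dataset_json(
    "Task001_OCTSeg",
    train_cases=["img_0001", "img_0002"],  # placeholder case names
    test_cases=["img_0101"],
    modalities=("Red", "Green", "Blue"),
    labels={0: "background", 1: "choroid", 2: "GCIPL", 3: "RNFL"},
)
print(json.dumps(example, indent=2))
```

Comparing this with the dataset.json that the conversion script actually writes is a quick check that the class names and the number of cases are what you expect.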

Label value transform
You can write your own class-mapping function and pass it as transform; the lambda here is that transform function.

        convert_2d_image_to_nifti(input_segmentation_file, output_seg_file, is_seg=True,
                                  transform=lambda x: (x).astype(int))
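If your masks store the classes as grey values (the commented-out transform in the script suggests 255, 160, 80 and 0 for this dataset), a lookup table maps them in one step and does not depend on the order of assignments. This is a sketch under that assumption; the mapping GREY_TO_CLASS and the function name grey_to_class are illustrative and need to be adapted to your own labels.

```python
import numpy as np

# assumed grey value -> class index mapping; adapt to your own masks
GREY_TO_CLASS = {255: 0, 160: 1, 80: 2, 0: 3}


def grey_to_class(mask):
    """Map grey values to consecutive class indices without in-place reassignment."""
    mask = np.asarray(mask)
    unknown = set(np.unique(mask).tolist()) - set(GREY_TO_CLASS)
    if unknown:
        raise ValueError(f"unexpected grey values in mask: {sorted(unknown)}")
    lut = np.zeros(256, dtype=np.uint8)
    for grey, class_index in GREY_TO_CLASS.items():
        lut[grey] = class_index
    # fancy indexing looks up every pixel at once; the input is left untouched
    return lut[mask].astype(int)


print(grey_to_class(np.array([[255, 160], [80, 0]], dtype=np.uint8)))
```

It can be passed directly as transform=grey_to_class in the convert_2d_image_to_nifti call for the segmentation files. The ValueError also catches masks with stray values, for example from lossy compression or anti-aliased edges.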

Data conversion and preprocessing

export nnUNet_raw_data_base="/home/lus/Project/datasets/nnUNet/nnUNet_raw_data_base"
export nnUNet_preprocessed="/home/lus/Project/datasets/nnUNet/nnUNet_preprocessed"
export RESULTS_FOLDER="/home/lus/Project/datasets/nnUNet/nnUNet_trained_models"
python nnunet/dataset_conversion/Task001_OCT.py
nnUNet_plan_and_preprocess -t "001" --verify_dataset_integrity

python nnunet/dataset_conversion/Task001_OCT.py converts the data into the format nnUNet requires.

nnUNet_plan_and_preprocess -t "001" --verify_dataset_integrity runs the data preprocessing.

Training

The final training shell script

export nnUNet_raw_data_base="/home/lus/Project/datasets/nnUNet/nnUNet_raw_data_base"
export nnUNet_preprocessed="/home/lus/Project/datasets/nnUNet/nnUNet_preprocessed"
export RESULTS_FOLDER="/home/lus/Project/datasets/nnUNet/nnUNet_trained_models"
# python nnunet/dataset_conversion/Task001_OCT.py
# nnUNet_plan_and_preprocess -t "001" --verify_dataset_integrity
export CUDA_VISIBLE_DEVICES=1
nnUNet_train 2d nnUNetTrainerV2 'Task001_OCTSeg' 0 --npz
nnUNet_train 2d nnUNetTrainerV2 'Task001_OCTSeg' 1 --npz
nnUNet_train 2d nnUNetTrainerV2 'Task001_OCTSeg' 2 --npz
nnUNet_train 2d nnUNetTrainerV2 'Task001_OCTSeg' 3 --npz
nnUNet_train 2d nnUNetTrainerV2 'Task001_OCTSeg' 4 --npz

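The export lines set the data paths; since I did not write them into .bashrc, they have to be declared every time.
export CUDA_VISIBLE_DEVICES selects which GPU is used.

The five folds of the cross-validation are trained here one after another.

If training is interrupted, resume from the checkpoint

nnUNet saves a checkpoint every 50 epochs, so appending -c to the training command continues training from the last saved checkpoint.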

nnUNet_train 2d nnUNetTrainerV2 'Task001_OCTSeg' 0 --npz -c

Remember to set the paths first!
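The five training commands differ only in the fold number, so they can also be generated. The sketch below builds the argument lists and prints them; build_train_commands is a hypothetical helper, and the flags are just the ones used above (--npz, and -c for resuming).

```python
def build_train_commands(task, folds=range(5), resume=False,
                         network="2d", trainer="nnUNetTrainerV2"):
    """Return one nnUNet_train argument list per fold."""
    commands = []
    for fold in folds:
        cmd = ["nnUNet_train", network, trainer, task, str(fold), "--npz"]
        if resume:
            cmd.append("-c")  # continue from the last saved checkpoint
        commands.append(cmd)
    return commands


for cmd in build_train_commands("Task001_OCTSeg"):
    print(" ".join(cmd))
    # to actually launch the training instead of printing:
    # import subprocess; subprocess.run(cmd, check=True)
```

Calling build_train_commands("Task001_OCTSeg", folds=[2], resume=True) gives the command for resuming only fold 2 after an interruption.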

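Testing

Reference link for testing

Identifying the best U-Net configuration

Once all models are trained, use the following command to automatically determine which U-Net configuration to use for test set prediction: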

nnUNet_find_best_configuration -m 2d 3d_fullres 3d_lowres 3d_cascade_fullres -t XXX

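(All 5 folds need to be completed for every specified configuration!)

On datasets for which the cascade was not configured, use -m 2d 3d_fullres instead. If you only want to explore some subset of the configurations, you can specify it with the -m option. Additional options are available (use -h for help).

Running inference

Remember that the data located in the input folder must follow the data format for inference described in the nnU-Net documentation.

nnUNet_find_best_configuration prints a string to the terminal with the inference commands you need to use. The easiest way to run inference is to simply use these commands.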

nnUNet_find_best_configuration -m 2d  -t 001

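If you wish to manually specify the configuration used for inference, use the following commands:

For each of the desired configurations, run: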

nnUNet_predict -i INPUT_FOLDER -o OUTPUT_FOLDER -t TASK_NAME_OR_ID -m CONFIGURATION --save_npz

export nnUNet_raw_data_base="/home/lus/Project/datasets/nnUNet/nnUNet_raw_data_base"
export nnUNet_preprocessed="/home/lus/Project/datasets/nnUNet/nnUNet_preprocessed"
export RESULTS_FOLDER="/home/lus/Project/datasets/nnUNet/nnUNet_trained_models"
nnUNet_predict -i /home/lus/Project/datasets/nnUNet/nnUNet_raw_data_base/nnUNet_raw_data/Task001_OCTSeg/imagesTs -o /home/lus/Project/datasets/nnUNet/output -t 001 -m 2d -f 1

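Only specify --save_npz if you intend to use ensembling. --save_npz makes the command save the softmax probabilities alongside the predicted segmentation masks, which requires a lot of disk space.

Please select a separate OUTPUT_FOLDER for each configuration!

If you wish to run ensembling, you can ensemble the predictions from several configurations with the following command: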

nnUNet_ensemble -f FOLDER1 FOLDER2 ... -o OUTPUT_FOLDER -pp POSTPROCESSING_FILE

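You can specify an arbitrary number of folders, but remember that each folder needs to contain the npz files generated by nnUNet_predict. For ensembling you can also specify a file that tells the command how to postprocess. These files are created when running nnUNet_find_best_configuration and are located in the respective trained model directory (RESULTS_FOLDER/nnUNet/CONFIGURATION/TaskXXX_MYTASK/TRAINER_CLASS_NAME__PLANS_FILE_IDENTIFIER/postprocessing.json or RESULTS_FOLDER/nnUNet/ensemble/TaskXXX_MYTASK/ensemble_X__Y__Z--X__Y__Z/postprocessing.json). You can also choose not to provide a file (simply omit -pp), in which case nnU-Net will not run postprocessing.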

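Note that by default, inference is done with all available folds. We very strongly recommend using all 5 folds, so all 5 folds must have been trained before running inference. The list of available folds nnU-Net found is printed at the start of inference.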
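Before starting inference it can be useful to check which folds have actually finished training. The sketch below looks for a final checkpoint file in each fold folder. It assumes that every fold lives in a subfolder fold_X of the trained model directory and that a finished fold contains model_final_checkpoint.model, which is what I understand nnU-Net v1 to write; please compare with your own RESULTS_FOLDER. The function name folds_with_final_checkpoint is hypothetical.

```python
import os


def folds_with_final_checkpoint(model_dir, folds=range(5),
                                checkpoint_name="model_final_checkpoint.model"):
    """Return the fold numbers whose folder contains the final checkpoint file."""
    finished = []
    for fold in folds:
        # assumed layout: <model_dir>/fold_<n>/<checkpoint_name>
        if os.path.isfile(os.path.join(model_dir, f"fold_{fold}", checkpoint_name)):
            finished.append(fold)
    return finished
```

Here model_dir would be the trained model directory of the form RESULTS_FOLDER/nnUNet/2d/Task001_OCTSeg/TRAINER_CLASS_NAME__PLANS_FILE_IDENTIFIER. If the returned list is shorter than [0, 1, 2, 3, 4], some folds still need to be trained or resumed with -c.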

Converting the 3D predictions back to 2D images

3DDataProcessTo2D.py

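import os

import cv2
import numpy as np
import SimpleITK as sitk
from tqdm import tqdm

# grey value written to the PNG for each predicted class index
CLASS_TO_GREY = {0: 255, 1: 160, 2: 80, 3: 0}


def convert(img_dir, output_dir):
    """Write every pseudo-3D .nii.gz prediction in img_dir as a 2D PNG in output_dir."""
    os.makedirs(output_dir, exist_ok=True)
    img_list = [i for i in os.listdir(img_dir) if i.endswith(".nii.gz")]
    for name in tqdm(img_list, desc="convert"):
        volume = sitk.GetArrayFromImage(sitk.ReadImage(os.path.join(img_dir, name)))
        labels = volume[0]  # drop the dummy first axis: (1, X, Y) -> (X, Y)
        # fill a separate output array so that no assignment can overwrite an earlier one
        image = np.zeros(labels.shape, dtype=np.uint8)
        for class_index, grey in CLASS_TO_GREY.items():
            image[labels == class_index] = grey
        # strip only the .nii.gz suffix, so file names containing dots stay intact
        cv2.imwrite(os.path.join(output_dir, name[:-len(".nii.gz")] + ".png"), image)


if __name__ == "__main__":
    img_dir = "/home/lus/Project/datasets/nnUNet/output"
    output_dir = "/home/lus/Project/datasets/nnUNet/output_convert"
    convert(img_dir, output_dir)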