├── …
Within each dataset folder, the following structure is expected:
```
Dataset001_BrainTumour/
├── dataset.json
├── imagesTr
├── imagesTs  # optional
└── labelsTr
```
When adding your custom dataset, take a look at the [dataset\_conversion](../nnunetv2/dataset_conversion) folder and
pick an ID that is not already taken. IDs 001-010 are reserved for the Medical Segmentation Decathlon.
* **imagesTr** contains the images belonging to the training cases. nnU-Net will perform pipeline configuration, training with
cross-validation, as well as finding postprocessing and the best ensemble using this data.
* **imagesTs** (optional) contains the images belonging to the test cases. nnU-Net does not use them! This folder is
simply a convenient place to store these images and is a remnant of the Medical Segmentation Decathlon folder structure.
* **labelsTr** contains the ground truth segmentation maps for the training cases.
* **dataset.json** contains metadata of the dataset.
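If you assemble datasets programmatically, this top-level layout can be scaffolded with a few lines of standard-library Python. This is a minimal sketch, not part of nnU-Net; `Dataset999_Example` is a placeholder name, and in a real setup the `nnUNet_raw` path comes from the environment variable of the same name:

```python
from pathlib import Path

def make_dataset_skeleton(nnunet_raw: str, dataset_name: str) -> Path:
    """Create imagesTr, labelsTr and the optional imagesTs folders."""
    base = Path(nnunet_raw) / dataset_name
    for sub in ("imagesTr", "imagesTs", "labelsTr"):  # imagesTs is optional
        (base / sub).mkdir(parents=True, exist_ok=True)
    return base

base = make_dataset_skeleton("nnUNet_raw", "Dataset999_Example")
print(sorted(p.name for p in base.iterdir()))
# → ['imagesTr', 'imagesTs', 'labelsTr']
```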
The scheme introduced [above](#what-do-training-cases-look-like) results in the following folder structure. Below is
an example for the first dataset of the MSD: BrainTumour. This dataset has four input channels: FLAIR (0000),
T1w (0001), T1gd (0002) and T2w (0003). Note that the imagesTs folder is optional and does not have to be present.
```
nnUNet_raw/Dataset001_BrainTumour/
├── dataset.json
├── imagesTr
│   ├── BRATS_001_0000.nii.gz
│   ├── BRATS_001_0001.nii.gz
│   ├── BRATS_001_0002.nii.gz
│   ├── BRATS_001_0003.nii.gz
│   ├── BRATS_002_0000.nii.gz
│   ├── BRATS_002_0001.nii.gz
│   ├── BRATS_002_0002.nii.gz
│   ├── BRATS_002_0003.nii.gz
│   ├── …
├── imagesTs
│   ├── BRATS_485_0000.nii.gz
│   ├── BRATS_485_0001.nii.gz
│   ├── BRATS_485_0002.nii.gz
│   ├── BRATS_485_0003.nii.gz
│   ├── BRATS_486_0000.nii.gz
│   ├── BRATS_486_0001.nii.gz
│   ├── BRATS_486_0002.nii.gz
│   ├── BRATS_486_0003.nii.gz
│   ├── …
└── labelsTr
    ├── BRATS_001.nii.gz
    ├── BRATS_002.nii.gz
    ├── …
```
Here is another example of the second dataset of the MSD, which has only one input channel:
```
nnUNet_raw/Dataset002_Heart/
├── dataset.json
├── imagesTr
│   ├── la_003_0000.nii.gz
│   ├── la_004_0000.nii.gz
│   ├── …
├── imagesTs
│   ├── la_001_0000.nii.gz
│   ├── la_002_0000.nii.gz
│   ├── …
└── labelsTr
    ├── la_003.nii.gz
    ├── la_004.nii.gz
    ├── …
```
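The `{CASE_IDENTIFIER}_{XXXX}{FILE_ENDING}` naming scheme is easy to check mechanically. Here is a minimal pure-Python sketch (the helper function is ours, not part of nnU-Net) that groups image files by case identifier and collects their channel suffixes, using file names from the BrainTumour example above:

```python
from collections import defaultdict

def group_by_case(image_filenames, file_ending=".nii.gz"):
    """Map case identifier -> sorted 4-digit channel suffixes."""
    cases = defaultdict(list)
    for fn in image_filenames:
        stem = fn[: -len(file_ending)]            # strip the file ending
        case_id, channel = stem[:-5], stem[-4:]   # "_XXXX" carries the channel
        cases[case_id].append(channel)
    return {case: sorted(channels) for case, channels in cases.items()}

files = ["BRATS_001_0000.nii.gz", "BRATS_001_0001.nii.gz",
         "BRATS_001_0002.nii.gz", "BRATS_001_0003.nii.gz"]
print(group_by_case(files))
# → {'BRATS_001': ['0000', '0001', '0002', '0003']}
```

Each case should end up with exactly one entry per channel declared in the dataset.json.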
Remember: For each training case, all images must have the same geometry to ensure that their pixel arrays are aligned. Also
make sure that all your data is co-registered!
See also [dataset format inference](dataset_format_inference.md)!!
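A quick way to verify the geometry requirement is to compare spacing, origin and direction across the channels of a case. Below is a minimal sketch working on plain tuples; in practice you would obtain these values from your image reader (e.g. SimpleITK's `GetSpacing()`, `GetOrigin()` and `GetDirection()`):

```python
import math

def same_geometry(geo_a, geo_b, tol=1e-5):
    """Compare two (spacing, origin, direction) triples within a tolerance."""
    return all(
        len(a) == len(b)
        and all(math.isclose(x, y, abs_tol=tol) for x, y in zip(a, b))
        for a, b in zip(geo_a, geo_b)
    )

identity = (1, 0, 0, 0, 1, 0, 0, 0, 1)
flair = ((1.0, 1.0, 1.0), (0.0, 0.0, 0.0), identity)
t1w = ((1.0, 1.0, 1.0), (0.0, 0.0, 0.0), identity)
print(same_geometry(flair, t1w))  # → True
```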
### dataset.json
The dataset.json contains metadata that nnU-Net needs for training. We have greatly reduced the number of required
fields since version 1!
Here is what the dataset.json should look like, using Dataset005\_Prostate from the MSD as an example:

```
{
    "channel_names": {  # formerly modalities
        "0": "T2",
        "1": "ADC"
    },
    "labels": {  # THIS IS DIFFERENT NOW!
        "background": 0,
        "PZ": 1,
        "TZ": 2
    },
    "numTraining": 32,
    "file_ending": ".nii.gz",
    "overwrite_image_reader_writer": "SimpleITKIO"  # optional! If not provided nnU-Net will automatically determine the ReaderWriter
}
```
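Such a file can be written with the standard library alone. Here is a minimal sketch reproducing the Prostate values above; note that the `#` comments in the listing are annotations for this document and must not appear in the actual JSON file:

```python
import json

dataset = {
    "channel_names": {"0": "T2", "1": "ADC"},
    "labels": {"background": 0, "PZ": 1, "TZ": 2},
    "numTraining": 32,
    "file_ending": ".nii.gz",
    # "overwrite_image_reader_writer": "SimpleITKIO",  # optional field
}
with open("dataset.json", "w") as f:
    json.dump(dataset, f, indent=4)
```

For real datasets, prefer the `generate_dataset_json` utility mentioned further down.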
The channel\_names determine the normalization used by nnU-Net. If a channel is marked as 'CT', then a global
normalization based on the intensities in the foreground pixels will be used. If it is something else, per-channel
z-scoring will be used. Refer to the methods section in [our paper]( )
for more details. nnU-Net v2 introduces a few more normalization schemes to
choose from and allows you to define your own, see [here](explanation_normalization.md) for more information.
Important changes relative to nnU-Net v1:
* "modality" is now called "channel\_names" to remove the strong bias towards medical images
* labels are structured differently (name -> int instead of int -> name). This was needed to support [region-based training](region_based_training.md)
* "file\_ending" was added to support different input file types
* "overwrite\_image\_reader\_writer" is optional! It can be used to specify a certain (custom) ReaderWriter class that should
be used with this dataset. If not provided, nnU-Net will automatically determine the ReaderWriter
* "regions\_class\_order" is only used in [region-based training](region_based_training.md)
There is a utility with which you can generate the dataset.json automatically. You can find it
[here](../nnunetv2/dataset_conversion/generate_dataset_json.py).
See our examples in [dataset\_conversion](../nnunetv2/dataset_conversion) for how to use it. And read its documentation!
### How to use nnU-Net v1 Tasks
If you are migrating from the old nnU-Net, convert your existing datasets with `nnUNetv2_convert_old_nnUNet_dataset`!
Example for migrating a nnU-Net v1 Task:
```
nnUNetv2_convert_old_nnUNet_dataset /media/isensee/raw_data/nnUNet_raw_data_base/nnUNet_raw_data/Task027_ACDC Dataset027_ACDC
```
Use `nnUNetv2_convert_old_nnUNet_dataset -h` for detailed usage instructions.
### How to use decathlon datasets
See [convert_msd_dataset.md](convert_msd_dataset.md)
### How to use 2D data with nnU-Net
2D is now natively supported (yay!). See [here](#supported-file-formats) as well as the example dataset in this
[script](../nnunetv2/dataset_conversion/Dataset120_RoadSegmentation.py).
### How to update an existing dataset
When updating a dataset it is best practice to remove the preprocessed data in `nnUNet_preprocessed/DatasetXXX_NAME`
to ensure a fresh start. Then replace the data in `nnUNet_raw` and rerun `nnUNetv2_plan_and_preprocess`.