医疗图像分割指标

医疗图像其中两种图像格式:MRI(Magnetic Resonance Imaging,磁共振成像)、CT(Computed Tomography,计算机断层),常存成 .nii.gz 格式。都是 3D 的 H × W × L H \times W \times L H×W×L,但不是 RGB,而是类似 grey-scale 图只有一个 channel,如 CT 的值可能是 Hu 值,取值范围可从负几千到正几千,参考 [1]。segmentation label 也相应是 3D 的。
med-planes
这里记录几种分割指标,参考 [2,3],可调 medpy 的包,其文档也有指标列表。记有 C + 1 个类,0 固定是 background,(某个数据的)prediction 和 label Y ^ , Y ∈ { 0 , … , C } H × W × L \hat{Y},Y\in \{0, \dots, C\}^{H \times W \times L} Y^,Y{0,,C}H×W×L,只留下第 c 类的 binary prediction 和 label 为 B ^ c , B c ∈ { 0 , 1 } H × W × L \hat{B}^c,B^c \in \{0, 1\}^{H \times W \times L} B^c,Bc{0,1}H×W×L ∣ ⋅ ∣ |\cdot| 表示非零元素个数, B ^ c , B c \hat{B}^c,B^c B^c,Bc 的 surface / boundray 记为 S ^ c , S c \hat{S}^c, S^c S^c,Sc,是其表面 voxels 的三维索引向量(下标)的集合。

Dice Coefficient

即 F1-scoure[2]。第 c 类的 dice 系数: D C c = 2 ∣ B ^ c ∩ B c ∣ ∣ B ^ c ∣ + ∣ B c ∣ DC_c=\frac{2|\hat{B}^c \cap B^c|}{|\hat{B}^c| + |B^c|} DCc=B^c+Bc2∣B^cBc 调包:medpy.metric.binary.dc

IoU

即 Jaccard 系数[2]。第 c 类的 IoU: I o U c = ∣ B ^ c ∩ B c ∣ ∣ B ^ c ∪ B c ∣ IoU_c = \frac{|\hat{B}^c \cap B^c|}{|\hat{B}^c \cup B^c|} IoUc=B^cBcB^cBc 调包:medpy.metric.binary.jc。由 [3],IoU 与 dice 系数有确定的转换关系,即等价,所以两者应该用其中一种就行。C 类的 IoU 取平均就是 mIoU: m I o U = ∑ c = 0 C I o U c C + 1 mIoU=\frac{\sum_{c=0}^{C}IoU_c}{C + 1} mIoU=C+1c=0CIoUc

  • (2023.11.23)代码可选包不包含 background。
  • (2023.10.17)[6,7] 都是算上 background,此处跟它们。
  • (2023.9.28)不确定算 mIoU 时要不要加上 background。

Frequency Weighted IoU

[5] 用到,但其加权系数的好像写错了,应参照 [6,7] 的公式。而 [5] 本身的实现 与 [6-8] 一致,是对的。公式应为: f w I o U c = ∣ B c ∣ H W L I o U c f w I o U = ∑ c = 0 C f w I o U c \begin{aligned} fwIoU_c &= \frac{|B^c|}{HWL}IoU_c \\ fwIoU &= \sum_{c=0}^{C}fwIoU_c \end{aligned} fwIoUcfwIoU=HWLBcIoUc=c=0CfwIoUc 注意此时只求和,不再除以 C + 1。另 [6] 有一份 tensorflow 版。

Sensitivity

即 recall,也叫 TPR(True Positive Rate),非对称。第 c 类的 sensitivity: R c = ∣ B ^ c ∩ B c ∣ ∣ B c ∣ R_c=\frac{|\hat{B}^c \cap B^c|}{|B^c|} Rc=BcB^cBc 调包:medpy.metric.binary.sensitivitymedpy.metric.binary.recall

Specificity

也叫 selectivity、TNR(True Negative Rate),是 sensitivity 的补。第 c 类的 specifity: S P c = ∣ ( 1 − B ^ c ) ∩ ( 1 − B c ) ∣ ∣ 1 − B c ∣ SP_c = \frac{|(1-\hat{B}^c) \cap (1-B^c)|}{|1-B^c|} SPc=∣1Bc(1B^c)(1Bc) 调包:medpy.metric.binary.specificity

(Pixel) Accuracy

[5] 用到 a c c = 1 ( Y ^ = Y ) H W L acc=\frac{1(\hat{Y} = Y)}{HWL} acc=HWL1(Y^=Y)

Average (Symmetric) Surface Distance

由 [4],第 c 类的 average surface distance: A S D ( S ^ c , S c ) = ∑ p ∈ S ^ c d s ( p , S c ) / ∣ S ^ c ∣ ≠ A S D ( S c , S ^ c ) ASD(\hat{S}^c, S^c) = \sum_{p \in \hat{S}^c} d_s(p,S^c) \big/ |\hat{S}^c| \ne ASD(S^c, \hat{S}^c) ASD(S^c,Sc)=pS^cds(p,Sc)/S^c=ASD(Sc,S^c) 其中 d s ( ⋅ , ⋅ ) d_s(\cdot,\cdot) ds(,) 是点到点集(此处是表面 S)距离: d s ( p , S ) = min ⁡ q ∈ S d ( p , q ) d_s(p,S)=\min_{q \in S} d(p,q) ds(p,S)=qSmind(p,q) p , q ∈ { 1 , … , W } × { 1 , … , H } × { 1 , … , L } p,q \in \{1,\dots,W\} \times \{1,\dots,H\} \times \{1,\dots,L\} p,q{1,,W}×{1,,H}×{1,,L} 是三维索引向量(下标), d ( ⋅ , ⋅ ) d(\cdot,\cdot) d(,) 是欧氏距离。就是对 S ^ \hat{S} S^ 中属于此 object 表面的每一个 voxel,都算一下它到 S 的距离,然后取平均。调包:medpy.metric.binary.asd

ASD 不对称,一般会用 average symmetric surface distance: A S S D ( S ^ c , S c ) = A S D ( S ^ c , S c ) + A S D ( S c , S ^ c ) 2 = A S S D ( S c , S ^ c ) ASSD(\hat{S}^c, S^c) = \frac{ASD(\hat{S}^c, S^c) + ASD(S^c, \hat{S}^c)}{2} = ASSD(S^c, \hat{S}^c) ASSD(S^c,Sc)=2ASD(S^c,Sc)+ASD(Sc,S^c)=ASSD(Sc,S^c) 调包:medpy.metric.binary.assd。有些文章也会将 ASSD 简叫成 ASD,感觉可以无脑只用 ASSD。

Hausdorff Distance

由 [2,4],第 c 类的 Hausdorff 距离: H D ( S ^ c , S c ) = max ⁡ { sup ⁡ p ∈ S ^ c d s ( p , S c ) , sup ⁡ p ∈ S c d s ( p , S ^ c ) } HD(\hat{S}^c,S^c)=\max\left\{ \sup_{p \in \hat{S}^c} d_s(p, S^c), \sup_{p \in S^c} d_s(p, \hat{S}^c) \right\} HD(S^c,Sc)=max{pS^csupds(p,Sc),pScsupds(p,S^c)} 调包:medpy.metric.binary.hd

Calibration

从公式看,有些指标在遇见特殊情况(全零 prediction、全零 label、全一 label)可能会有除零问题,一例见下文 example: error from distance metrics 小节。现测试 medpy.metric.binary 各 API 对这些突殊情况作何处理,以便校正不合理之处。

import numpy as np
import medpy.metric.binary as mmb

np.bool = np.bool_ # 新的 numpy 已无 np.bool

p0 = np.array([0, 0, 0, 0]) # all 0
p1 = np.array([1, 1, 1, 1]) # all 1
p  = np.array([0, 1, 1, 0]) # both 0 & 1
y0 = p0
y1 = p1
y  = 1 - p

print("\tdice")
print(mmb.dc(p0, y0), mmb.dc(p0, y), mmb.dc(p, y0))

print("\tiou")
try:
    print(mmb.jc(p0, y0))
except ZeroDivisionError as e:
    print("iou(p0, y0):", e)
print(mmb.jc(p0, y), mmb.jc(p, y0))

print("\tsensitivity")
print(mmb.sensitivity(p0, y0), mmb.sensitivity(p0, y), mmb.sensitivity(p, y0))

print("\tspecificity")
print(mmb.specificity(p1, y1), mmb.specificity(p1, y), mmb.specificity(p, y1))

# ASD 不对称!此处只依上文公式调包,即 pred 在前、label 在后
print("\tasd")
try:
    print(mmb.asd(p0, y0))
except RuntimeError as e:
    print("asd(p0, y0):", e)
try:
    print(mmb.asd(p0, y))
except RuntimeError as e:
    print("asd(p0, y):", e)
try:
    print(mmb.asd(p, y0))
except RuntimeError as e:
    print("asd(p, y0):", e)

print("\tassd")
try:
    print(mmb.assd(p0, y0))
except RuntimeError as e:
    print("assd(p0, y0):", e)
try:
    print(mmb.assd(p0, y))
except RuntimeError as e:
    print("assd(p0, y):", e)
try:
    print(mmb.assd(p, y0))
except RuntimeError as e:
    print("assd(p, y0):", e)

print("\thd")
try:
    print(mmb.hd(p0, y0))
except RuntimeError as e:
    print("hd(p0, y0):", e)
try:
    print(mmb.hd(p0, y))
except RuntimeError as e:
    print("hd(p0, y):", e)
try:
    print(mmb.hd(p, y0))
except RuntimeError as e:
    print("hd(p, y0):", e)

print("\thd95")
try:
    print(mmb.hd95(p0, y0))
except RuntimeError as e:
    print("hd95(p0, y0):", e)
try:
    print(mmb.hd95(p0, y))
except RuntimeError as e:
    print("hd95(p0, y):", e)
try:
    print(mmb.hd95(p, y0))
except RuntimeError as e:
    print("hd95(p, y0):", e)

输出:

        dice
0.0 0.0 0.0
        iou
iou(p0, y0): float division by zero
0.0 0.0
        sensitivity
0.0 0.0 0.0
        specificity
0.0 0.0 0.0
        asd
asd(p0, y0): The first supplied array does not contain any binary object.
asd(p0, y): The first supplied array does not contain any binary object.
asd(p, y0): The second supplied array does not contain any binary object.
        assd
assd(p0, y0): The first supplied array does not contain any binary object.
assd(p0, y): The first supplied array does not contain any binary object.
assd(p, y0): The second supplied array does not contain any binary object.
        hd
hd(p0, y0): The first supplied array does not contain any binary object.
hd(p0, y): The first supplied array does not contain any binary object.
hd(p, y0): The second supplied array does not contain any binary object.
        hd95
hd95(p0, y0): The first supplied array does not contain any binary object.
hd95(p0, y): The first supplied array does not contain any binary object.
hd95(p, y0): The second supplied array does not contain any binary object.

可见 medpy.metric.binary 的处理分两类:

  • 置零:dice、sensitivity、specificity
  • 报错:iou、asd、assd、hd、hd95

分类讨论:

  • 当 prediction 与 label 一致时,即使是全零或全一,也应该视为预测正确,赋最优值,即 dice、iou、sensitivity、specificity 为 1,asd、assd、hd、hd95 为 0,这点 medpy.metric.binary 没处理好;
  • sensitivity 当 label 为空、prediction 非空时,应当为 0,medpy.metric.binary 的处理是对的;
  • asd、assd、hd、hd95 四个距离指标当 prediction 和 label 一方为空、一方非空时,没有标准解法,可能这是 medpy.metric.binary 报错、MONAI 只抛 warning 不处理 的原因。本文选择置 NaN,求平均时忽略不计,但同时记录 NaN 的个数。

example: error from distance metrics

  • monai 1.3.0

算 assd 时,如果 y_true 为空,会报错:

Traceback (most recent call last):
  File "/home/tom/codes/monai-tutorials/2d_segmentation/torch/mmwhs_train.py", line 268, in <module>
    main(tempdir)
  File "/home/tom/codes/monai-tutorials/2d_segmentation/torch/mmwhs_train.py", line 246, in main
    _res = evaluate(val_outputs, val_labels, 4+1)
  File "/home/tom/codes/monai-tutorials/2d_segmentation/torch/mmwhs_train.py", line 66, in my_evaluate
    res[metr].append(fn(B_pred_c, B_c))
  File "/share/tom/miniconda3/envs/cu116_pt1131/lib/python3.9/site-packages/medpy/metric/binary.py", line 453, in assd
    assd = numpy.mean( (asd(result, reference, voxelspacing, connectivity), asd(reference, result, voxelspacing, connectivity)) )
  File "/share/tom/miniconda3/envs/cu116_pt1131/lib/python3.9/site-packages/medpy/metric/binary.py", line 561, in asd
    sds = __surface_distances(result, reference, voxelspacing, connectivity)
  File "/share/tom/miniconda3/envs/cu116_pt1131/lib/python3.9/site-packages/medpy/metric/binary.py", line 1215, in __surface_distances
    raise RuntimeError('The second supplied array does not contain any binary object.')
RuntimeError: The second supplied array does not contain any binary object.

其它 distance metrics(asd、hd、hd95)只要 prediction 和 label 其中一方为空也都会遇到。好像没有标准解法:

  • [11] 中 eval_UDA.py 是报错就将 assd 置 1;
  • MONAI[12] 只抛 warning 不处理;
  • [13] 在 prediction、ground-truth 皆空时置零,只其中一方空时置 128(而 128 恰好是 [13] 中设定的 crop patch size,不知与此有无关系);
  • [14] 有置零、373.128664、NaN(即忽略) 的方案。

Code

  • (2024.1.18)重构代码,写成类以便调用,后文对拍测试亦相应更新。
  • (2023.11.23)改了代码,现跟 MONAI[12] 在 2D 数据对拍,(基本)一致。
    distance metrics 的问题仍未解决,似乎没有标准做法,相关讨论见 [14-16]。现在 Y_predY_true 皆空时,学 [13] 置零;而只其中一方为空时,[13] 置 128、[14] 有置 0373.128664、NaN 的,不知道什么鬼,此处选置 NaN,即忽略。
    MONAI 的 distance metrics 默认忽略 background,此处强制忽略;而其它 metrics 就可选忽不忽略。
  • (2023.11.21)asd、assd、hd、hd95 等 distance metrics 有潜在报错,见后文,未完全修复
  • 分割模型应该一般 predict 的都是 (C+1)-way softmax prediction,即 Y ^ \hat{Y} Y^
import packaging.version
import numpy as np
import medpy.metric.binary as mmb

# https://stackoverflow.com/questions/74893742/how-to-solve-attributeerror-module-numpy-has-no-attribute-bool
if packaging.version.parse(np.__version__) >= packaging.version.parse('1.24'):
    if hasattr(np, 'bool_'):
        np.bool = np.bool_
    else:
        np.bool = bool

class SegEvaluator:
    """segmentation evaluation based on medpy.metric.binary
    It will calibrate some results from medpy.metric.binary (see __call__).
    It will also record the number of NaN in distance-based metrics that are
    caused by empty prediction or label.
    Usage:
        ```python
        # 2D image evaluation
        evaluator = SegEvaluator(N_CLASSES, IGNORE_BACKGROUND)
        for images, labels in loader:
            predictions = model(images)         # [bs, c, h, w]
            predictions = predictions.argmax(1) # -> [bs, h, w]
            for pred, lab in zip(predictions, labels):
                evaluator(pred, lab) # eval each image
        print(evaluator.reduce())
        evaluator.reset()
        ```
    """

    METRICS = {
        "dice": mmb.dc,
        "iou": mmb.jc,
        "accuracy": lambda _B1, _B2: (_B1 == _B2).sum() / _B1.size,
        "sensitivity": mmb.sensitivity,
        "specificity": mmb.specificity,
        "hd": mmb.hd,
        "assd": mmb.assd,
        "hd95": mmb.hd95,
        "asd": mmb.asd
    }

    def __init__(self, n_classes, ignore_bg=False, select=[]):
        """
        Input:
            n_classes: int, assuming 0 is background
            ignore_bg: bool, ignore background or not in reduction
            select: List[str], list of name of metrics of interest,
                i.e. will only evaluate on these selected metrics if provided.
                Should be a subset of supported metrics (see METRICS).
        """
        self.n_classes = n_classes
        self.ignore_bg = ignore_bg

        self.metrics = {}
        _no_select = len(select) == 0
        for m, f in SegEvaluator.METRICS.items():
            if _no_select or m in select:
                self.metrics[m] = f

        self.reset()

    def reset(self):
        # records:
        #  - records[metr][c][i] = <metr> score of i-th datum on c-th class, or
        #  - records[metr][c] = # of NaN caused by empty pred/label
        self.records = {}
        for metr in self.metrics:
            # self.records[metr] = [[]] * self.n_classes # wrong
            self.records[metr] = [[] for _ in range(self.n_classes)]
        for metr in ("hd", "assd", "hd95", "asd"):
            if metr in self.metrics:
                self.records[f"empty_gt_{metr}"] = [0] * self.n_classes
                self.records[f"empty_pred_{metr}"] = [0] * self.n_classes

    def __call__(self, Y_pred, Y_true):
        """evaluates 1 prediction
        Input:
            Y_pred: numpy.ndarray, typically [H, W] or [H, W, L], predicted class ID
            Y_true: same as `Y_pred`, label, ground-truth class ID
        """
        for c in range(self.n_classes):
            B_pred_c = (Y_pred == c).astype(np.int64)
            B_c      = (Y_true == c).astype(np.int64)
            pred_l0, pred_inv_l0, gt_l0, gt_inv_l0 = B_pred_c.sum(), (1 - B_pred_c).sum(), B_c.sum(), (1 - B_c).sum()
            for metr, fn in self.metrics.items():
                is_distance_metr = metr in ("hd", "assd", "hd95", "asd")
                if 0 == c and (self.ignore_bg or is_distance_metr):
                    # always ignore bg for distance metrics
                    a = np.nan
                elif 0 == gt_l0 and 0 == pred_l0 and metr in ("dice", "iou", "sensitivity"):
                    a = 1
                elif 0 == gt_inv_l0 and 0 == pred_inv_l0 and "specificity" == metr:
                    a = 1
                elif is_distance_metr and pred_l0 * gt_l0 == 0: # at least one party is all 0
                    if 0 == pred_l0 and 0 == gt_l0: # both are all 0
                        # nips23a&d, xmed-lab/GenericSSL
                        a = 0
                    else: # only one party is all 0
                        a = np.nan
                        if 0 == pred_l0:
                            self.records[f"empty_pred_{metr}"][c] += 1
                        else: # 0 == gt_l0
                            self.records[f"empty_gt_{metr}"][c] += 1
                else: # normal cases or that medpy can solve well
                    # try:
                    a = fn(B_pred_c, B_c)
                    # except:
                    #     a = np.nan

                self.records[metr][c].append(a)

    def reduce(self, prec=4):
        """calculate class-wise & overall average
        Input:
            prec: int, decimal precision
        Output:
            res: dict
                - res[<metr>]: float, overall average
                - res[<metr>_clswise]: List[float], class-wise average of each class
                - res[empty_pred|gt_<metr>]: int, overall #NaN caused by empty pred/label
                - res[empty_pred|gt_<metr>_clswise]: List[int], class-wise #NaN
        """
        res = {}
        for metr in self.records:
            if metr.startswith("empty_"):
                res[metr+"_clswise"] = self.records[metr]
                res[metr] = int(np.sum(self.records[metr]))
            else:
                CxN = np.asarray(self.records[metr], dtype=np.float32)
                nans = np.isnan(CxN)
                CxN[nans] = 0
                not_nans = ~nans

                # class-wise average
                cls_n = not_nans.sum(1) # [c]
                # cls_avg = np.where(cls_n > 0, CxN.sum(1) / cls_n, 0)
                _cls_n_denom = cls_n.copy()
                _cls_n_denom[0 == _cls_n_denom] = 1 # to avoid warning though not necessary
                cls_avg = np.where(cls_n > 0, CxN.sum(1) / _cls_n_denom, 0)
                res[f"{metr}_clswise"] = np.round(cls_avg, prec).tolist()

                # overall average
                ins_cls_n = not_nans.sum(0) # [n]
                # ins_avg = np.where(ins_cls_n > 0, CxN.sum(0) / ins_cls_n, 0)
                _ins_cls_n_denom = ins_cls_n.copy()
                _ins_cls_n_denom[0 == _ins_cls_n_denom] = 1 # to avoid warning though not necessary
                ins_avg = np.where(ins_cls_n > 0, CxN.sum(0) / _ins_cls_n_denom, 0)
                ins_n = (ins_cls_n > 0).sum()
                avg = ins_avg.sum() / ins_n if ins_n > 0 else 0
                res[metr] = float(np.round(avg, prec))

        return res

compare to MONAI

  • MedPy 0.4.0
  • monai 1.3.0

2D

  • 加背景共 5 类
import os, os.path as osp
import numpy as np
import medpy.metric.binary as mmb
from collections import defaultdict
import torch
from monai.networks.utils import one_hot
from monai.metrics import DiceMetric, MeanIoU, GeneralizedDiceScore, ConfusionMatrixMetric, HausdorffDistanceMetric, SurfaceDistanceMetric


class SegEvaluator:
    """同上文"""
    pass


print("自制 pred、label,保证非空以对拍 asd、assd、hd、hd95")
nc = 4+1
gen_data_size = [100, 40*(nc-1)]
gen_data_list = []
for i in range(30):
    pred = np.zeros(gen_data_size, dtype=np.uint8)
    label = pred.copy()
    for ic in range(1, nc):
        fill_v = np.concatenate([np.zeros([70*40], dtype=np.uint8), ic * np.ones([30*40], dtype=np.uint8)])
        np.random.shuffle(fill_v)
        pred[:, (ic-1)*40: ic*40] = fill_v.reshape(100, 40)
        if i > 2: # let the 1st 3 be perfect predictions
            np.random.shuffle(fill_v)
        label[:, (ic-1)*40: ic*40] = fill_v.reshape(100, 40)
    gen_data_list.append((pred, label))
print(len(gen_data_list)) # 30


print("MONAI overall")
monai_metrics = {
    "dice": DiceMetric(include_background=True, reduction="mean", get_not_nans=False, ignore_empty=True, num_classes=4+1),
    "iou": MeanIoU(include_background=True, reduction="mean", get_not_nans=False, ignore_empty=True),
    "sensitivity": ConfusionMatrixMetric(include_background=True, metric_name='sensitivity', reduction="mean", compute_sample=True, get_not_nans=False),
    "specificity": ConfusionMatrixMetric(include_background=True, metric_name='specificity', reduction="mean", compute_sample=True, get_not_nans=False),
    "accuracy": ConfusionMatrixMetric(include_background=True, metric_name='accuracy', reduction="mean", get_not_nans=False),
    # 下面 distance metrics 忽略 background
    "hd": HausdorffDistanceMetric(include_background=False, reduction="mean", get_not_nans=False),
    "hd95": HausdorffDistanceMetric(include_background=False, percentile=95, reduction="mean", get_not_nans=False),
    "assd": SurfaceDistanceMetric(include_background=False, symmetric=True, reduction="mean", get_not_nans=False),
    "asd": SurfaceDistanceMetric(include_background=False, symmetric=False, reduction="mean", get_not_nans=False),
}

print("MONAI class-wise")
monai_metrics_cls = {
    "dice": DiceMetric(include_background=True, reduction="mean_batch", get_not_nans=False, ignore_empty=True, num_classes=4+1),
    "iou": MeanIoU(include_background=True, reduction="mean_batch", get_not_nans=False, ignore_empty=True),
    "sensitivity": ConfusionMatrixMetric(include_background=True, metric_name='sensitivity', reduction="mean_batch", compute_sample=True, get_not_nans=False),
    "specificity": ConfusionMatrixMetric(include_background=True, metric_name='specificity', reduction="mean_batch", compute_sample=True, get_not_nans=False),
    "accuracy": ConfusionMatrixMetric(include_background=True, metric_name='accuracy', reduction="mean_batch", get_not_nans=False),
    # 下面 distance metrics 忽略 background
    "hd": HausdorffDistanceMetric(include_background=False, reduction="mean_batch", get_not_nans=False),
    "hd95": HausdorffDistanceMetric(include_background=False, percentile=95, reduction="mean_batch", get_not_nans=False),
    "assd": SurfaceDistanceMetric(include_background=False, symmetric=True, reduction="mean_batch", get_not_nans=False),
    "asd": SurfaceDistanceMetric(include_background=False, symmetric=False, reduction="mean_batch", get_not_nans=False),
}

print("medpy")
seg_eval = SegEvaluator(4+1, False)


print("2D seg,一张张测")
for pred, label in gen_data_list:
    pred = pred.astype(np.int32)
    label = label.astype(np.int32)
    # print(pred.shape, pred.dtype, np.unique(pred), label.shape, label.dtype, np.unique(label))

    seg_eval(pred, label)

    # MONAI 要求 [B, C, H, W],且有些指标要求 one-hot
    pred = one_hot(torch.from_numpy(pred).unsqueeze(0).unsqueeze(0), num_classes=4+1, dim=1)
    label = one_hot(torch.from_numpy(label).unsqueeze(0).unsqueeze(0), num_classes=4+1, dim=1)
    for _m in monai_metrics:
        monai_metrics[_m](y_pred=pred, y=label)
        monai_metrics_cls[_m](y_pred=pred, y=label)


print("monai")
for _m in monai_metrics:
    print('\t', _m)
    _agg = monai_metrics[_m].aggregate()#.item()
    _agg_cls = monai_metrics_cls[_m].aggregate()
    # print(type(_agg_cls))
    if isinstance(_agg, list):
        assert len(_agg) == 1
        _agg = _agg[0]
    if isinstance(_agg_cls, list):
        assert len(_agg_cls) == 1
        _agg_cls = _agg_cls[0]
    _agg = round(_agg.item(), 6)
    _agg_cls = [round(_x, 6) for _x in _agg_cls.tolist()]
    # monai_metrics[_m].reset()
    print("class-wise:", _agg_cls)
    print("overall:", _agg)


print("medpy")
medpy_res = {k: v for k, v in seg_eval.reduce(6).items()
	if not k.startswith("empty_")} # 虑掉那些 NaN 记数
for metr in medpy_res:
    if not metr.endswith("_clswise"):
        print('\t', metr)
        print("class-wise:", medpy_res[metr+"_clswise"])
        print("overall:", medpy_res[metr])

输出

monai
	 dice
class-wise: [0.819524, 0.275542, 0.280583, 0.279583, 0.276667]
overall: 0.38638
	 iou
class-wise: [0.699342, 0.197282, 0.200425, 0.199778, 0.197985]
overall: 0.298962
	 sensitivity
class-wise: [0.819524, 0.275542, 0.280583, 0.279583, 0.276667]
overall: 0.38638
	 specificity
class-wise: [0.278094, 0.961871, 0.962136, 0.962083, 0.96193]
overall: 0.825223
	 accuracy
class-wise: [0.711238, 0.927554, 0.928058, 0.927958, 0.927667]
overall: 0.884495
	 hd
class-wise: [3.283388, 3.363868, 3.308639, 3.316445]
overall: 3.318085
	 hd95
class-wise: [2.012461, 2.004592, 2.012461, 2.012461]
overall: 2.010494
	 assd
class-wise: [0.945859, 0.939, 0.945332, 0.943413]
overall: 0.943401
	 asd
class-wise: [0.943933, 0.938989, 0.944227, 0.941334]
overall: 0.942121


medpy
	 dice
class-wise: [0.819523, 0.275542, 0.280583, 0.279583, 0.276667]
overall: 0.38638
	 iou
class-wise: [0.699342, 0.197282, 0.200425, 0.199778, 0.197985]
overall: 0.298962
	 sensitivity
class-wise: [0.819523, 0.275542, 0.280583, 0.279583, 0.276667]
overall: 0.38638
	 specificity
class-wise: [0.278094, 0.961871, 0.962136, 0.962083, 0.96193]
overall: 0.825223
	 accuracy
class-wise: [0.711238, 0.927554, 0.928058, 0.927958, 0.927667]
overall: 0.884495
	 hd
class-wise: [0.0, 3.283388, 3.363868, 3.308639, 3.316445]
overall: 3.318085
	 hd95
class-wise: [0.0, 2.004592, 2.004592, 2.005379, 2.004592]
overall: 2.004789
	 assd
class-wise: [0.0, 0.945857, 0.939, 0.945332, 0.94341]
overall: 0.9434
	 asd
class-wise: [0.0, 0.943933, 0.938989, 0.944227, 0.941334]
overall: 0.942121

结论:hd95 同,其它一致。对比 medpy 的实现MONAI 的实现,是因为它们取 95% 分位数的方式不同:

  • medpy 是取两个方向的所有 voxel 的 surface distance 的 95% 分位数;
  • MONAI 是两个方向的 surface diatance 各取 95%分位数,再取两者最大值。

3D

  • 代码接上节
print("2D 辑成 3D")
gen_data_3d_list = []
for i in range(0, len(gen_data_list), 10):
    pred3d = np.concatenate([x[0][:, :, np.newaxis] for x in gen_data_list[i: i+10]], axis=2)
    label3d = np.concatenate([x[1][:, :, np.newaxis] for x in gen_data_list[i: i+10]], axis=2)
    print(pred3d.shape, label3d.shape) # (100, 160, 10) (100, 160, 10)
    gen_data_3d_list.append((pred3d, label3d))
print(len(gen_data_3d_list)) # 3


print("重置 metrics 记录")
seg_eval.reset()
for _m in monai_metrics:
    monai_metrics[_m].reset()
    monai_metrics_cls[_m].reset()


print("3D,逐 volume 测")
for pred, label in gen_data_3d_list:
    pred = pred.astype(np.int32)
    label = label.astype(np.int32)
    # print(pred.shape, pred.dtype, np.unique(pred), label.shape, label.dtype, np.unique(label))

    seg_eval(pred, label)

    pred = one_hot(torch.from_numpy(pred).unsqueeze(0).unsqueeze(0), num_classes=4+1, dim=1)
    label = one_hot(torch.from_numpy(label).unsqueeze(0).unsqueeze(0), num_classes=4+1, dim=1)
    for _m in monai_metrics:
        monai_metrics[_m](y_pred=pred, y=label)
        monai_metrics_cls[_m](y_pred=pred, y=label)


print("monai") # 同前面 2D
for _m in monai_metrics:
    print('\t', _m)
    _agg = monai_metrics[_m].aggregate()#.item()
    _agg_cls = monai_metrics_cls[_m].aggregate()
    # print(type(_agg_cls))
    if isinstance(_agg, list):
        assert len(_agg) == 1
        _agg = _agg[0]
    if isinstance(_agg_cls, list):
        assert len(_agg_cls) == 1
        _agg_cls = _agg_cls[0]

    _agg = round(_agg.item(), 6)
    _agg_cls = [round(_x, 6) for _x in _agg_cls.tolist()]
    # monai_metrics[_m].reset()
    print("class-wise:", _agg_cls)
    print("overall:", _agg)


print("medpy")
medpy_res = {k: v for k, v in seg_eval.reduce(6).items()
	if not k.startswith("empty_")} # 虑掉那些 NaN 记数
for metr in medpy_res:
    if not metr.endswith("_clswise"):
        print('\t', metr)
        print("class-wise:", medpy_res[metr+"_clswise"])
        print("overall:", medpy_res[metr])

输出

monai
	 dice
class-wise: [0.819523, 0.275542, 0.280583, 0.279583, 0.276667]
overall: 0.38638
	 iou
class-wise: [0.695224, 0.165041, 0.168134, 0.168165, 0.165894]
overall: 0.272492
	 sensitivity
class-wise: [0.819523, 0.275542, 0.280583, 0.279583, 0.276667]
overall: 0.38638
	 specificity
class-wise: [0.278094, 0.961871, 0.962136, 0.962083, 0.96193]
overall: 0.825223
	 accuracy
class-wise: [0.711237, 0.927554, 0.928058, 0.927958, 0.927667]
overall: 0.884495
	 hd
class-wise: [2.307209, 2.44949, 2.575802, 2.44949]
overall: 2.445498
	 hd95
class-wise: [1.414214, 1.414214, 1.414214, 1.414214]
overall: 1.414214
	 assd
class-wise: [0.815341, 0.80984, 0.813648, 0.815224]
overall: 0.813514
	 asd
class-wise: [0.812973, 0.809779, 0.813435, 0.814375]
overall: 0.812641


medpy
	 dice
class-wise: [0.819523, 0.275542, 0.280583, 0.279583, 0.276667]
overall: 0.38638
	 iou
class-wise: [0.695224, 0.165041, 0.168134, 0.168165, 0.165894]
overall: 0.272492
	 sensitivity
class-wise: [0.819523, 0.275542, 0.280583, 0.279583, 0.276667]
overall: 0.38638
	 specificity
class-wise: [0.278094, 0.961871, 0.962136, 0.962083, 0.96193]
overall: 0.825223
	 accuracy
class-wise: [0.711238, 0.927554, 0.928058, 0.927958, 0.927667]
overall: 0.884495
	 hd
class-wise: [0.0, 2.307209, 2.44949, 2.575802, 2.44949]
overall: 2.445498
	 hd95
class-wise: [0.0, 1.414213, 1.414213, 1.414213, 1.414213]
overall: 1.414214
	 assd
class-wise: [0.0, 0.815341, 0.80984, 0.813648, 0.815224]
overall: 0.813514
	 asd
class-wise: [0.0, 0.812973, 0.809779, 0.813435, 0.814375]
overall: 0.812641

结论:

  • 跟 MONAI 一致,连 hd95 都一致;
  • 同样的数据,辑成 3D 测跟 2D 式逐 slice 测结果同(符合直觉,因为求平均的方式不同)。

References

  1. 亨氏单位
  2. 常用的医学图像分割评价指标
  3. 医学图像分割常用指标及代码(pytorch)
  4. (MIA 2022) Rethinking adversarial domain adaptation: Orthogonal decomposition for unsupervised domain adaptation in medical image segmentation - paper, github
  5. (ICML 2018) CyCADA: Cycle-Consistent Adversarial Domain Adaptation - paper, code
  6. 评价指标
  7. 语义分割之评价指标
  8. jfzhang95/pytorch-deeplab-xception
  9. xxxzhi/Freq weight IoU
  10. (arXiv 21) Weighted Intersection over Union (wIoU): A New Evaluation Metric for Image Segmentation - paper
  11. TFboys-lzz/MPSCL
  12. Project-MONAI/MONAI
  13. xmed-lab/GenericSSL
  14. Evaluation metric #39
  15. test_model指标计算 #5
  16. 性能咨询 #4
  17. python生成列表坑
  • 1
    点赞
  • 4
    收藏
    觉得还不错? 一键收藏
  • 2
    评论
医学图像分割医疗领域中非常重要的任务之一,可以用于医学影像分析、辅助诊断等方面。而V-Net模型是一种非常流行的医学图像分割模型,其采用了3D卷积神经网络(CNN)进行图像分割。在使用V-Net模型进行医学图像分割时,需要进行一些参数的调整,以获得更好的分割效果。本篇文章将介绍使用V-Net模型进行医学图像分割时的参数调整方法。 一、数据预处理 在使用V-Net模型进行医学图像分割时,首先需要进行数据预处理。数据预处理是一项非常重要的任务,可以使得模型对数据的理解更加准确。数据预处理的步骤如下: 1. 数据归一化 归一化是一项非常重要的任务,可以使得数据的分布更加均匀,从而更有利于模型的训练。在使用V-Net模型进行医学图像分割时,通常采用Z-score标准化方法进行数据归一化。 2. 数据增强 数据增强是一项非常重要的任务,可以增加数据的多样性,从而提高模型的泛化能力。在使用V-Net模型进行医学图像分割时,通常采用随机旋转、随机缩放等方法进行数据增强。 二、模型参数调整 在进行数据预处理后,需要对模型参数进行调整。模型参数调整是一项非常重要的任务,可以使得模型对数据的理解更加准确,从而提高分割效果。模型参数调整的步骤如下: 1. 学习率 学习率是控制模型训练速度的重要参数。在使用V-Net模型进行医学图像分割时,通常采用较小的学习率进行模型训练。 2. Batch Size Batch Size是控制每次迭代所使用的样本数的重要参数。在使用V-Net模型进行医学图像分割时,通常采用较小的Batch Size进行模型训练。 3. Dropout Dropout是一种正则化方法,可以防止过拟合。在使用V-Net模型进行医学图像分割时,通常采用较小的Dropout值进行模型训练。 4. 损失函数 损失函数是评价模型性能的重要指标。在使用V-Net模型进行医学图像分割时,通常采用交叉熵损失函数进行模型训练。 三、模型训练与评价 在进行模型参数调整后,需要进行模型训练和评价。模型训练和评价是一项非常重要的任务,可以评估模型的性能。模型训练和评价的步骤如下: 1. 训练集和测试集的划分 训练集和测试集的划分是一项非常重要的任务,可以使得模型对未知数据的泛化能力更强。在使用V-Net模型进行医学图像分割时,通常采用80%的数据作为训练集,20%的数据作为测试集。 2. 模型训练 模型训练是一项非常重要的任务,可以使得模型对数据的理解更加准确,从而提高分割效果。在使用V-Net模型进行医学图像分割时,通常采用Adam优化器进行模型训练。 3. 模型评价 模型评价是一项非常重要的任务,可以评估模型的性能。在使用V-Net模型进行医学图像分割时,通常采用Dice系数进行模型评价。 四、总结 在本文中,我们介绍了使用V-Net模型进行医学图像分割时的参数调整方法。通过对数据预处理、模型参数调整、模型训练和评价等方面进行调整,可以获得更好的分割效果。在实际应用中,需要根据实际情况进行参数的调整,以获得更好的效果。

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 2
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值