J&F: Evaluation Metrics for Video Object Segmentation (VOS)

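This post briefly summarizes the two most important evaluation metrics for the VOS task, J and F (short for the Jaccard index and the F-score). J measures the IoU between the predicted mask and the ground truth (gt); F measures how well the boundary of the predicted mask matches the boundary of the gt. Each is described below.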

Jaccard

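Computing J is straightforward: it is simply the IoU between the predicted mask and the gt mask, expressed as a ratio. The numerator is the intersection of the foreground regions of the two masks, and the denominator is their union. The implementation is as follows: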

import numpy as np

def db_eval_iou(annotation, segmentation):
    """Compute region similarity as the Jaccard index.

    Arguments:
        annotation   (ndarray): binary annotation   map.
        segmentation (ndarray): binary segmentation map.
    Return:
        jaccard (float): region similarity
    """
    # np.bool was removed in NumPy 1.24; the builtin bool is equivalent here.
    annotation   = annotation.astype(bool)
    segmentation = segmentation.astype(bool)

    # Both masks empty: treat as a perfect match and avoid dividing 0 by 0.
    if np.isclose(np.sum(annotation), 0) and np.isclose(np.sum(segmentation), 0):
        return 1
    else:
        return np.sum((annotation & segmentation)) / \
                np.sum((annotation | segmentation), dtype=np.float32)
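
To make the ratio concrete, here is a small worked example on a toy 4x4 grid. It is only a sketch: the helper name `jaccard` and the two toy masks are made up for illustration and are not part of the original listing.

```python
import numpy as np

def jaccard(annotation, segmentation):
    """IoU of two binary masks (hypothetical helper for this example)."""
    annotation = np.asarray(annotation, dtype=bool)
    segmentation = np.asarray(segmentation, dtype=bool)
    union = np.logical_or(annotation, segmentation).sum()
    if union == 0:
        # Both masks empty: counted as a perfect match.
        return 1.0
    inter = np.logical_and(annotation, segmentation).sum()
    return float(inter) / float(union)

# gt: a 2x2 square in the top-left corner (4 foreground pixels).
gt = np.zeros((4, 4), dtype=bool)
gt[0:2, 0:2] = True

# pred: the same square shifted one column to the right (4 foreground pixels).
pred = np.zeros((4, 4), dtype=bool)
pred[0:2, 1:3] = True

# Intersection = 2 pixels (column 1), union = 6 pixels, so J = 2/6.
j = jaccard(gt, pred)
print(round(j, 4))  # 0.3333
```

A one-pixel shift of a small object already drops J to 1/3, which shows how sensitive the region measure is for small masks.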

F-score

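The F-score evaluates whether the boundary of the predicted mask lines up with the boundary of the gt mask. First, the boundary pixels of both the predicted mask and the gt mask are extracted: pixels on the boundary are set to True and all other pixels to False. The F-score is then defined as: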

$$F=\frac{2PR}{P+R}$$

P denotes precision and R denotes recall. They are computed as follows:

$$P=\frac{TP}{TP+FP}$$

$$R=\frac{TP}{TP+FN}$$

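For P, the denominator is the total number of boundary pixels in the predicted mask, and the numerator is the number of those predicted boundary pixels that really belong to the gt boundary. For example, suppose the predicted mask has 100 boundary pixels but only 70 of them are present in the gt boundary (true positives); the precision is then 70%. How is this 70 determined, that is, how do we count how many predicted positives are true positives? The code takes the gt boundary after a binary_dilation operation (which appears to add tolerance to small misalignments), multiplies the predicted boundary element-wise with this dilated gt boundary, and sums the result to get the number of true positives.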

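Similarly, for R the denominator is the total number of boundary pixels in the gt mask, and the numerator is the number of actual positives that were predicted. For example, if the gt boundary has 100 pixels, of which 70 are predicted as positive and 30 are wrongly predicted as negative, the recall is 70%. In the code, the predicted boundary is first passed through binary_dilation, the gt boundary is then multiplied element-wise with this dilated predicted boundary, and the sum gives the number of true positives.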
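
As a quick check of how the two quantities combine, the illustrative counts from the two examples above (70 of 100 predicted boundary pixels matched, 70 of 100 gt boundary pixels matched) can be plugged into the definitions:

$$P=\frac{70}{100}=0.7,\qquad R=\frac{70}{100}=0.7,\qquad F=\frac{2\cdot 0.7\cdot 0.7}{0.7+0.7}=0.7$$

F is the harmonic mean of P and R, so it equals them when they coincide and is pulled toward the smaller of the two otherwise.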

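The description above is rather dense. In short, precision P starts from the prediction and asks how many of the pixels predicted as boundary really are boundary pixels (with the gt as reference). Recall R starts from the annotated gt and asks how many of its N boundary pixels were actually recovered by the prediction (with the predicted mask as reference).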
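
The dilate-then-multiply matching can be tried on a toy example. The sketch below makes several assumptions that differ from the real metric: the boundary maps are written by hand, both are non-empty, and the tolerance is a 3x3 square neighbourhood implemented in plain NumPy rather than a disk from skimage. The names `dilate` and `boundary_prf` are hypothetical.

```python
import numpy as np

def dilate(mask, radius=1):
    """Dilate a boolean mask with a (2r+1)x(2r+1) square (assumed tolerance shape)."""
    h, w = mask.shape
    padded = np.pad(mask, radius, mode="constant", constant_values=False)
    out = np.zeros_like(mask)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def boundary_prf(pred_b, gt_b, radius=1):
    """Precision, recall and F for two non-empty boundary maps."""
    pred_b = pred_b.astype(bool)
    gt_b = gt_b.astype(bool)
    # Predicted boundary pixels that land within the tolerance of the gt boundary.
    tp_pred = np.sum(pred_b & dilate(gt_b, radius))
    # gt boundary pixels that land within the tolerance of the predicted boundary.
    tp_gt = np.sum(gt_b & dilate(pred_b, radius))
    precision = tp_pred / float(pred_b.sum())
    recall = tp_gt / float(gt_b.sum())
    denom = precision + recall
    f = 0.0 if denom == 0 else 2 * precision * recall / denom
    return precision, recall, f

gt_b = np.zeros((6, 7), dtype=bool)
gt_b[2, 1:6] = True        # 5 gt boundary pixels on row 2

pred_b = np.zeros((6, 7), dtype=bool)
pred_b[3, 1:4] = True      # 3 pixels, one row below the gt boundary
pred_b[0, 6] = True        # 1 stray pixel far from the gt boundary

exact_overlap = int(np.sum(pred_b & gt_b))   # 0: nothing matches without tolerance
p, r, f = boundary_prf(pred_b, gt_b, radius=1)
print(exact_overlap, p, r, round(f, 4))      # 0 0.75 0.8 0.7742
```

Without dilation the two boundaries do not overlap at all; with a one-pixel tolerance, 3 of the 4 predicted pixels and 4 of the 5 gt pixels are matched, which is why P and R use different true-positive counts.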

The metric is computed as follows:

import numpy as np
from skimage.morphology import binary_dilation, disk

def db_eval_boundary(foreground_mask, gt_mask, bound_th=0.008):
    """
    Compute the boundary F-measure for a single frame.
    Calculates precision/recall for boundaries between foreground_mask and
    gt_mask using morphological operators to speed it up.

    Arguments:
        foreground_mask (ndarray): binary segmentation image.
        gt_mask         (ndarray): binary annotated image.
        bound_th        (float):   matching tolerance; values < 1 are a
                                   fraction of the image diagonal, values
                                   >= 1 are a radius in pixels.

    Returns:
        F (float): boundaries F-measure
    """
    assert np.atleast_3d(foreground_mask).shape[2] == 1

    bound_pix = bound_th if bound_th >= 1 else \
            np.ceil(bound_th*np.linalg.norm(foreground_mask.shape))

    # Get the pixel boundaries of both masks
    fg_boundary = seg2bmap(foreground_mask)
    gt_boundary = seg2bmap(gt_mask)

    # Dilate each boundary so that matches within bound_pix pixels still count
    fg_dil = binary_dilation(fg_boundary, disk(bound_pix))
    gt_dil = binary_dilation(gt_boundary, disk(bound_pix))

    # Boundary pixels of one mask that fall inside the dilated other boundary
    gt_match = gt_boundary * fg_dil
    fg_match = fg_boundary * gt_dil

    # Number of boundary pixels in each mask
    n_fg     = np.sum(fg_boundary)
    n_gt     = np.sum(gt_boundary)

    # Compute precision and recall, handling empty boundaries explicitly
    if n_fg == 0 and n_gt > 0:
        precision = 1
        recall = 0
    elif n_fg > 0 and n_gt == 0:
        precision = 0
        recall = 1
    elif n_fg == 0 and n_gt == 0:
        precision = 1
        recall = 1
    else:
        precision = np.sum(fg_match)/float(n_fg)
        recall    = np.sum(gt_match)/float(n_gt)

    # Compute F measure
    if precision + recall == 0:
        F = 0
    else:
        F = 2*precision*recall/(precision+recall)

    return F

def seg2bmap(seg, width=None, height=None):
    """
    From a segmentation, compute a binary boundary map with 1 pixel wide
    boundaries.  The boundary pixels are offset by 1/2 pixel towards the
    origin from the actual segment boundary.

    Arguments:
        seg     : Segments labeled from 1..k.
        width   : Width of desired bmap  <= seg.shape[1]
        height  : Height of desired bmap <= seg.shape[0]

    Returns:
        bmap (ndarray): Binary boundary map.

     David Martin <dmartin@eecs.berkeley.edu>
     January 2003
    """

    # np.bool was removed in NumPy 1.24; any nonzero label becomes True.
    seg = seg.astype(bool)

    assert np.atleast_3d(seg).shape[2] == 1

    width  = seg.shape[1] if width  is None else width
    height = seg.shape[0] if height is None else height

    h, w = seg.shape[:2]

    ar1 = float(width) / float(height)
    ar2 = float(w) / float(h)

    # `or` is required here: with `|` the expression parses as a chained comparison.
    assert not (width > w or height > h or abs(ar1 - ar2) > 0.01), \
            "Can't convert %dx%d seg to %dx%d bmap." % (w, h, width, height)

    e  = np.zeros_like(seg)
    s  = np.zeros_like(seg)
    se = np.zeros_like(seg)

    # Copies of seg shifted so each pixel sees its east, south, south-east neighbour
    e[:,:-1]    = seg[:,1:]
    s[:-1,:]    = seg[1:,:]
    se[:-1,:-1] = seg[1:,1:]

    # A pixel is on the boundary if it differs from any of those neighbours
    b        = (seg^e) | (seg^s) | (seg^se)
    b[-1,:]  = seg[-1,:]^e[-1,:]
    b[:,-1]  = seg[:,-1]^s[:,-1]
    b[-1,-1] = 0

    if w == width and h == height:
        bmap = b
    else:
        # Map 0-based (y, x) onto the smaller grid (assumed intent of the
        # original 1-based formula).
        bmap = np.zeros((height, width), dtype=bool)
        for x in range(w):
            for y in range(h):
                if b[y,x]:
                    j = (y * height) // h
                    i = (x * width) // w
                    bmap[j,i] = 1

    return bmap
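
To see what the neighbour-XOR step in seg2bmap produces, the sketch below applies the same logic to a 4x4 toy mask. It covers only the same-resolution case, and the name `boundary_map` is hypothetical.

```python
import numpy as np

def boundary_map(seg):
    """Same-resolution boundary extraction, mirroring the XOR logic of seg2bmap."""
    seg = np.asarray(seg, dtype=bool)
    e = np.zeros_like(seg)
    s = np.zeros_like(seg)
    se = np.zeros_like(seg)
    e[:, :-1] = seg[:, 1:]        # east neighbour
    s[:-1, :] = seg[1:, :]        # south neighbour
    se[:-1, :-1] = seg[1:, 1:]    # south-east neighbour
    # Boundary: the pixel differs from at least one of the three neighbours.
    b = (seg ^ e) | (seg ^ s) | (seg ^ se)
    # The last row and column lack some neighbours and are handled separately.
    b[-1, :] = seg[-1, :] ^ e[-1, :]
    b[:, -1] = seg[:, -1] ^ s[:, -1]
    b[-1, -1] = False
    return b

seg = np.zeros((4, 4), dtype=bool)
seg[1:3, 1:3] = True              # a 2x2 square in the middle

bmap = boundary_map(seg)
print(bmap.astype(int))
# [[1 1 1 0]
#  [1 0 1 0]
#  [1 1 1 0]
#  [0 0 0 0]]
```

The result is a ring around the square's top-left corner rather than around its center, which matches the docstring's note that boundary pixels are offset by half a pixel toward the origin.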

Some details of the code are still unclear to me, but the overall approach is as described above.
