CV面试知识点总结--计算机视觉

GlassySky0816

已于 2022-05-06 06:29:50 修改

阅读量2k

点赞数

分类专栏：面经总结文章标签：计算机视觉面试总结

于 2022-05-04 22:56:48 首次发布

本文链接：https://blog.csdn.net/qq_38784098/article/details/124579557

版权

面经总结专栏收录该内容

3 篇文章 0 订阅

订阅专栏

交并比(Intersection over Union, IoU)

代码

import numpy as np

def get_IoU(pred_bbox, gt_bbox):
    """
    return iou score between pred / gt bboxes
    :param pred_bbox: predict bbox coordinate
    :param gt_bbox: ground truth bbox coordinate
    :return: iou score
    """

    # bbox should be valid, actually we should add more judgements, just ignore here...
    # assert ((abs(pred_bbox[2] - pred_bbox[0]) > 0) and
    #         (abs(pred_bbox[3] - pred_bbox[1]) > 0))
    # assert ((abs(gt_bbox[2] - gt_bbox[0]) > 0) and
    #         (abs(gt_bbox[3] - gt_bbox[1]) > 0))

    # -----0---- get coordinates of inters
    ixmin = max(pred_bbox[0], gt_bbox[0])
    iymin = max(pred_bbox[1], gt_bbox[1])
    ixmax = min(pred_bbox[2], gt_bbox[2])
    iymax = min(pred_bbox[3], gt_bbox[3])
    iw = np.maximum(ixmax - ixmin + 1., 0.)
    ih = np.maximum(iymax - iymin + 1., 0.)

    # -----1----- intersection
    inters = iw * ih

    # -----2----- union, uni = S1 + S2 - inters
    uni = ((pred_bbox[2] - pred_bbox[0] + 1.) * (pred_bbox[3] - pred_bbox[1] + 1.) +
           (gt_bbox[2] - gt_bbox[0] + 1.) * (gt_bbox[3] - gt_bbox[1] + 1.) -
           inters)

    # -----3----- iou
    overlaps = inters / uni

    return overlaps


def get_max_IoU(pred_bboxes, gt_bbox):
    """
    given 1 gt bbox, >1 pred bboxes, return max iou score for the given gt bbox and pred_bboxes
    :param pred_bbox: predict bboxes coordinates, we need to find the max iou score with gt bbox for these pred bboxes
    :param gt_bbox: ground truth bbox coordinate
    :return: max iou score
    """

    # bbox should be valid, actually we should add more judgements, just ignore here...
    # assert ((abs(gt_bbox[2] - gt_bbox[0]) > 0) and
    #         (abs(gt_bbox[3] - gt_bbox[1]) > 0))

    if pred_bboxes.shape[0] > 0:
        # -----0---- get coordinates of inters, but with multiple predict bboxes
        ixmin = np.maximum(pred_bboxes[:, 0], gt_bbox[0])
        iymin = np.maximum(pred_bboxes[:, 1], gt_bbox[1])
        ixmax = np.minimum(pred_bboxes[:, 2], gt_bbox[2])
        iymax = np.minimum(pred_bboxes[:, 3], gt_bbox[3])
        iw = np.maximum(ixmax - ixmin + 1., 0.)
        ih = np.maximum(iymax - iymin + 1., 0.)

        # -----1----- intersection
        inters = iw * ih

        # -----2----- union, uni = S1 + S2 - inters
        uni = ((gt_bbox[2] - gt_bbox[0] + 1.) * (gt_bbox[3] - gt_bbox[1] + 1.) +
               (pred_bboxes[:, 2] - pred_bboxes[:, 0] + 1.) * (pred_bboxes[:, 3] - pred_bboxes[:, 1] + 1.) -
               inters)

        # -----3----- iou, get max score and max iou index
        overlaps = inters / uni
        ovmax = np.max(overlaps)
        jmax = np.argmax(overlaps)

    return overlaps, ovmax, jmax

if __name__ == "__main__":

    # test1
    pred_bbox = np.array([50, 50, 90, 100])   # top-left: <50, 50>, bottom-down: <90, 100>, <x-axis, y-axis>
    gt_bbox = np.array([70, 80, 120, 150])
    print (get_IoU(pred_bbox, gt_bbox))
    
    # test2
    pred_bboxes = np.array([[15, 18, 47, 60],
                          [50, 50, 90, 100],
                          [70, 80, 120, 145],
                          [130, 160, 250, 280],
                          [25.6, 66.1, 113.3, 147.8]])
    gt_bbox = np.array([70, 80, 120, 150])
    print (get_max_IoU(pred_bboxes, gt_bbox))

均交并比(Mean Intersection over Union, mIOU)

每个类别上计算IoU值（即真正样本数量/（真正样本数量+假负样本数量+假正样本数量)）

代码

class Evaluator(object):
    def __init__(self, num_class):
        self.num_class = num_class
        self.confusion_matrix = np.zeros((self.num_class,)*2)#21*21的矩阵,行代表ground truth类别,列代表preds的类别,值代表
 
    '''
    正确的像素占总像素的比例
    '''
    def Pixel_Accuracy(self):
        Acc = np.diag(self.confusion_matrix).sum() / self.confusion_matrix.sum()
        return Acc
 
    '''
    分别计算每个类分类正确的概率
    '''
    def Pixel_Accuracy_Class(self):
        Acc = np.diag(self.confusion_matrix) / self.confusion_matrix.sum(axis=1)
        Acc = np.nanmean(Acc)
        return Acc
 
    '''
    Mean Intersection over Union(MIoU，均交并比)：为语义分割的标准度量。其计算两个集合的交集和并集之比.
    在语义分割的问题中，这两个集合为真实值（ground truth）和预测值（predicted segmentation）。
    这个比例可以变形为正真数（intersection）比上真正、假负、假正（并集）之和。在每个类上计算IoU，之后平均。
    
    对于21个类别,分别求IOU:
        例如,对于类别1的IOU定义如下:
            (1)统计在ground truth中属于类别1的像素数
            (2)统计在预测结果中每个类别1的像素数
                (1) + (2)就是二者的并集像素数(类比于两块区域的面积加和, 注:二者交集部分的面积加重复了)
                再减去二者的交集(既在ground truth集合中又在预测结果集合中的像素),得到的就是二者的并集(所有跟类别1有关系的像素:包括TP,FP,FN)
        扩展提示:
            TP(真正): 预测正确, 预测结果是正类, 真实是正类  
            FP(假正): 预测错误, 预测结果是正类, 真实是负类
            FN(假负): 预测错误, 预测结果是负类, 真实是正类
            
            TN(真负): 预测正确, 预测结果是负类, 真实是负类   #跟类别1无关,所以不包含在并集中
            (本例中, 正类:是类别1, 负类:不是类别1)
                
    mIoU:
        对于每个类别计算出的IoU求和取平均
    
    '''
    def Mean_Intersection_over_Union(self):
        MIoU = np.diag(self.confusion_matrix) / (
                    np.sum(self.confusion_matrix, axis=1) + np.sum(self.confusion_matrix, axis=0) -
                    np.diag(self.confusion_matrix))
        MIoU = np.nanmean(MIoU) #跳过0值求mean,shape:[21]
        return MIoU
 
    def Class_IOU(self):
        MIoU = np.diag(self.confusion_matrix) / (
                    np.sum(self.confusion_matrix, axis=1) + np.sum(self.confusion_matrix, axis=0) -
                    np.diag(self.confusion_matrix))
        return MIoU
 
    def Frequency_Weighted_Intersection_over_Union(self):
        freq = np.sum(self.confusion_matrix, axis=1) / np.sum(self.confusion_matrix)
        iu = np.diag(self.confusion_matrix) / (
                    np.sum(self.confusion_matrix, axis=1) + np.sum(self.confusion_matrix, axis=0) -
                    np.diag(self.confusion_matrix))
 
        FWIoU = (freq[freq > 0] * iu[freq > 0]).sum()
        return FWIoU
 
 
    '''
    参数的传入:
        evaluator = Evaluate(4)           #只需传入类别数4
        evaluator.add_batch(target, preb) #target:[batch_size, 512, 512]    ,    preb:[batch_size, 512, 512]
        在add_batch中统计这个epoch中所有图片的预测结果和ground truth的对应情况, 累计成confusion矩阵(便于之后求mean)
    
    
    参数列表对应:
        gt_image: target  图片的真实标签            [batch_size, 512, 512]
        per_image: preb   网络生成的图片的预测标签   [batch_size, 512, 512]
    
    parameters:
        mask: ground truth中所有正确(值在[0, classe_num])的像素label的mask---为了保证ground truth中的标签值都在合理的范围[0, 20]
        label: 为了计算混淆矩阵, 混淆矩阵中一共有num_class*num_class个数, 所以label中的数值也是在0与num_class**2之间. [batch_size, 512, 512]
        cout(reshape): 记录了每个类别对应的像素个数,行代表真实类别,列代表预测的类别,count矩阵中(x, y)位置的元素代表该张图片中真实类别为x,被预测为y的像素个数
        np.bincount: https://blog.csdn.net/xlinsist/article/details/51346523
        confusion_matrix: 对角线上的值的和代表分类正确的像素点个数(preb与target一致),对角线之外的其他值的和代表所有分类错误的像素的个数
    '''
    # 计算混淆矩阵
    def _generate_matrix(self, gt_image, pre_image):
        mask = (gt_image >= 0) & (gt_image < self.num_class)#ground truth中所有正确(值在[0, classe_num])的像素label的mask
        
        label = self.num_class * gt_image[mask].astype('int') + pre_image[mask] 
        # np.bincount计算了从0到n**2-1这n**2个数中每个数出现的次数，返回值形状(n, n)
        count = np.bincount(label, minlength=self.num_class**2)
        confusion_matrix = count.reshape(self.num_class, self.num_class)#21 * 21(for pascal)
        return confusion_matrix
 
 
 
# --------------------------------------------------------------------------------
 
    def add_batch(self, gt_image, pre_image):
        assert gt_image.shape == pre_image.shape
        tmp = self._generate_matrix(gt_image, pre_image)
        #矩阵相加是各个元素对应相加,即21*21的矩阵进行pixel-wise加和
        self.confusion_matrix += self._generate_matrix(gt_image, pre_image)
 
 
    def reset(self):
        self.confusion_matrix = np.zeros((self.num_class,) * 2)

其中, 利用计算出来的混淆矩阵计算MIoU的方法如下:

混淆矩阵的每一行再加上每一列，最后减去对角线上的值

'''
confusion_matrix是一个[num_classes,num_classes]的矩阵,
confusion_matrix矩阵中(x, y)位置的元素代表该张图片中真实类别为x,被预测为y的像素个数
第一个MIoU: [bach_size, 类别数]:对角线/(混淆矩阵各行加和+各列加和-对角线)
第二个MIoU: [1, 类别数]
'''
def Mean_Intersection_over_Union(self):
    MIoU = np.diag(self.confusion_matrix) / (
           np.sum(self.confusion_matrix, axis=1) +         
           np.sum(self.confusion_matrix, axis=0) -
           np.diag(self.confusion_matrix))
    MIoU = np.nanmean(MIoU) #跳过0值求mean,shape:[1, 21]
    return MIoU

GlassySky0816

关注

0
点赞
踩
2

收藏

觉得还不错? 一键收藏
打赏
0
评论
CV面试知识点总结--计算机视觉

交并比(Intersection over Union, IoU)代码import numpy as npdef get_IoU(pred_bbox, gt_bbox): """ return iou score between pred / gt bboxes :param pred_bbox: predict bbox coordinate :param gt_bbox: ground truth bbox coordinate :retur
复制链接

扫一扫