YOLOv3 Summary (Part 1)

I recently went back over the YOLO papers and the YOLOv3 source code and noticed a few things I had missed before, so here is a short summary of that work.

  • The PyTorch source for v3; this is the version I read:

ayooshkathuria/YOLO_v3_tutorial_from_scratch

  • And this is a walkthrough that explains that source code:

How to implement a YOLO (v3) object detector from scratch in PyTorch

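Compared with the v1 and v2 papers, the v3 paper is, I have to say, a bit dizzying if you don't read the source code alongside it...
As for the code, I find it easiest to understand in several parts: parsing the cfg file, building the model, designing the inputs and outputs, the loss function, and NMS. Note that this version does not include the loss part.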
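To make the first of those parts concrete, here is a minimal sketch of parsing a Darknet-style cfg into a list of dicts, one per `[section]`. It is my own illustration, not the tutorial's code: the function name `parse_cfg_text` and the sample cfg text are assumptions, and it reads from a string rather than a file so that it stays self-contained.

```python
def parse_cfg_text(text):
    """Parse Darknet-style cfg text into a list of dicts, one per [section]."""
    blocks = []
    block = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            # a new section starts; remember its type, e.g. "convolutional"
            block = {"type": line[1:-1].strip()}
            blocks.append(block)
        elif block is not None and "=" in line:
            # key=value pairs belong to the most recent section
            key, value = line.split("=", 1)
            block[key.strip()] = value.strip()
    return blocks


# Hypothetical cfg fragment, used only for illustration
sample_cfg = """
[net]
width=416
height=416

# a conv layer followed by batch norm
[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky
"""

blocks = parse_cfg_text(sample_cfg)
for b in blocks:
    print(b["type"], b)
```

All values stay as strings here; the model-building step would convert them (for example `int(block["filters"])`) when it creates each layer.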

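  • One point worth noting: if a convolutional layer is followed by a BN layer, the convolutional layer does not need a bias.

Without a bias, the BN layer computes:
$$\frac{x_i-\bar{x}}{\sqrt{D(x)}}$$
With a bias, the BN layer computes:
$$\frac{x_i+b-(\bar{x}+b)}{\sqrt{D(x)}}$$
The b cancels in the numerator, and adding a constant does not change the variance D(x), so the two results are identical.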
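The cancellation can be checked numerically. The sketch below is a plain-Python illustration rather than PyTorch code: `batch_norm` is a hypothetical helper that only does the normalization step (no learned scale or shift), and the sample values and bias are made up.

```python
import math


def batch_norm(values, eps=1e-5):
    """Normalize values to zero mean and unit variance (no learned scale/shift)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


conv_out = [0.5, -1.2, 3.0, 2.2]  # pretend conv outputs for one channel
bias = 0.7                        # hypothetical conv bias for that channel

without_bias = batch_norm(conv_out)
with_bias = batch_norm([v + bias for v in conv_out])

# Largest element-wise difference; only floating-point rounding should remain
max_diff = max(abs(a - b) for a, b in zip(without_bias, with_bias))
print(max_diff)
```

This is why a conv layer that feeds into BN is usually created without a bias: the bias would add parameters without changing the output.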

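  • The rest is fairly straightforward, so this time I'll only record the NMS-related part; I have written up NMS itself in an earlier note.
import torch

# unique() and bbox_iou() are helper functions from the same tutorial code (not shown here)
def write_results(prediction, confidence, num_classes, nms_conf = 0.4):
    # Zero out every prediction whose objectness score is not above the confidence threshold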
    conf_mask = (prediction[:,:,4] > confidence).float().unsqueeze(2)
    prediction = prediction*conf_mask
    
    # Convert the (center_x, center_y, W, H) output into the box's top-left and bottom-right corner coordinates
    box_corner = prediction.new(prediction.shape)
    box_corner[:,:,0] = (prediction[:,:,0] - prediction[:,:,2]/2)
    box_corner[:,:,1] = (prediction[:,:,1] - prediction[:,:,3]/2)
    box_corner[:,:,2] = (prediction[:,:,0] + prediction[:,:,2]/2) 
    box_corner[:,:,3] = (prediction[:,:,1] + prediction[:,:,3]/2)
    prediction[:,:,:4] = box_corner[:,:,:4]
    
    batch_size = prediction.size(0)

    write = False
    
    # Process each image in the batch in turn
    for ind in range(batch_size):
        image_pred = prediction[ind]          #image Tensor

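        # torch.max returns two values: the maximum and its index.
        # Here max_conf is the best class score and max_conf_score is the index of that class.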
        max_conf, max_conf_score = torch.max(image_pred[:,5:5+ num_classes], 1)
        max_conf = max_conf.float().unsqueeze(1)
        max_conf_score = max_conf_score.float().unsqueeze(1)
        seq = (image_pred[:,:5], max_conf, max_conf_score)
        image_pred = torch.cat(seq, 1)
        
        # Then drop the rows that were zeroed out earlier
        non_zero_ind =  (torch.nonzero(image_pred[:,4]))
        try:
            image_pred_ = image_pred[non_zero_ind.squeeze(),:].view(-1,7)
        except:
            continue
        
        if image_pred_.shape[0] == 0:
            continue        
  
        #Get the various classes detected in the image
        # Collect every class predicted in this image, then run NMS class by class
        img_classes = unique(image_pred_[:,-1])  # -1 index holds the class index
        
        
        for cls in img_classes:
            #perform NMS

            #get the detections with one particular class
            cls_mask = image_pred_*(image_pred_[:,-1] == cls).float().unsqueeze(1)
            class_mask_ind = torch.nonzero(cls_mask[:,-2]).squeeze()
            image_pred_class = image_pred_[class_mask_ind].view(-1,7)
            
            #sort the detections such that the entry with the maximum objectness
            #confidence is at the top
            # Sort by confidence and process the boxes starting from the highest
            conf_sort_index = torch.sort(image_pred_class[:,4], descending = True )[1]
            image_pred_class = image_pred_class[conf_sort_index]
            idx = image_pred_class.size(0)   #Number of detections
            
            for i in range(idx):
                #Get the IOUs of all boxes that come after the one we are looking at 
                #in the loop
                try:
                    ious = bbox_iou(image_pred_class[i].unsqueeze(0), image_pred_class[i+1:])
                except ValueError:
                    break
            
                except IndexError:
                    break
            
                #Zero out all the detections that have IoU >= threshold
                # (boxes overlapping the current one too much are set to 0, then removed below)
                iou_mask = (ious < nms_conf).float().unsqueeze(1)
                image_pred_class[i+1:] *= iou_mask       
            
                #Keep only the non-zero entries (this removes the suppressed boxes)
                non_zero_ind = torch.nonzero(image_pred_class[:,4]).squeeze()
                image_pred_class = image_pred_class[non_zero_ind].view(-1,7)
                
            batch_ind = image_pred_class.new(image_pred_class.size(0), 1).fill_(ind)      #Repeat the batch_id for as many detections of the class cls in the image
            seq = batch_ind, image_pred_class
            
            # Finally, concatenate all the outputs
            if not write:
                output = torch.cat(seq,1)
                write = True
            else:
                out = torch.cat(seq,1)
                output = torch.cat((output,out))

    # If no detection survived, output was never created; return 0 in that case
    if not write:
        return 0
    return output
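
Stripped of the tensor bookkeeping, the function above does three things: it converts center-format boxes to corners, computes IoU, and runs greedy NMS per class. The following is a dependency-free sketch of those steps in plain Python. The names `center_to_corners`, `iou`, and `nms_per_class` are my own, the detections are made-up values, and the IoU here uses plain continuous coordinates, so it may differ in small details from the tutorial's `bbox_iou`.

```python
def center_to_corners(cx, cy, w, h):
    """Convert (center_x, center_y, width, height) to (x1, y1, x2, y2)."""
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def iou(a, b):
    """Intersection over union of two corner-format boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms_per_class(detections, iou_thresh=0.4):
    """Greedy NMS run separately for each class.

    detections: list of (box, score, cls) tuples with corner-format boxes.
    """
    kept = []
    for cls in sorted({d[2] for d in detections}):
        # highest score first, as in the sorted loop above
        candidates = sorted(
            (d for d in detections if d[2] == cls),
            key=lambda d: d[1],
            reverse=True,
        )
        while candidates:
            best = candidates.pop(0)
            kept.append(best)
            # keep only boxes that overlap the chosen box by less than the threshold
            candidates = [d for d in candidates if iou(best[0], d[0]) < iou_thresh]
    return kept


# Made-up detections: two overlapping boxes of class 0, one distant box of
# class 0, and one box of class 1 that overlaps the first two
detections = [
    (center_to_corners(50, 50, 40, 40), 0.9, 0),
    (center_to_corners(52, 52, 40, 40), 0.8, 0),
    (center_to_corners(150, 150, 40, 40), 0.7, 0),
    (center_to_corners(51, 51, 40, 40), 0.6, 1),
]

kept = nms_per_class(detections)
for box, score, cls in kept:
    print(cls, score, box)
```

The second class-0 box is suppressed because it overlaps the higher-scoring one, while the class-1 box survives even though it overlaps both, because suppression only happens within a class.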