I recently revisited the YOLO series papers and the YOLOv3 source code, and noticed a few things I had missed before, so here is a short summary of this recent pass.
- The PyTorch source for v3 that I read is this version:
ayooshkathuria/YOLO_v3_tutorial_from_scratch
- And this is an accompanying explanation of that source:
How to implement a YOLO (v3) object detector from scratch in PyTorch
Compared with the v1 and v2 papers, the v3 paper is honestly a bit dizzying if you don't also read the source.
For the code, I roughly break it into several parts to understand: parsing the cfg file, building the model, designing the inputs and outputs, the loss function, and NMS. Note that this version does not include the loss part.
- One detail worth noting: if a convolutional layer is followed by a BN layer, the convolutional layer can omit its bias.
Without the bias, the BN layer computes:

$$\frac{x_i-\bar{x}}{\sqrt[2]{D(x)}}$$
With the bias, the BN layer computes:

$$\frac{x_i+b-(\bar{x}+b)}{\sqrt[2]{D(x)}}$$

The bias $b$ cancels in the numerator, so it has no effect on the output and only adds useless parameters.
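The cancellation can also be checked numerically: a conv bias shifts every output of a channel by the same constant, and BN's mean subtraction removes exactly that constant. A small sketch with a toy input (the shapes and seed are arbitrary, chosen just for the demo):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(4, 3, 8, 8)  # toy input batch

# Two convs with identical weights; only the bias differs
conv_nobias = nn.Conv2d(3, 6, 3, bias=False)
conv_bias = nn.Conv2d(3, 6, 3, bias=True)
conv_bias.weight.data.copy_(conv_nobias.weight.data)

# affine=False gives pure normalization: (x - mean) / sqrt(var + eps)
bn = nn.BatchNorm2d(6, affine=False)
bn.train()

out_nobias = bn(conv_nobias(x))
out_bias = bn(conv_bias(x))

# The per-channel bias is removed by the mean subtraction,
# so the two outputs match up to float error
print(torch.allclose(out_nobias, out_bias, atol=1e-5))  # True
```

This is why the cfg parser in the repo creates `nn.Conv2d(..., bias=False)` whenever the block has `batch_normalize=1`.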
- The rest is fairly straightforward, so this time I'll just record the part related to NMS. The details of the NMS algorithm itself I have written about before.
```python
def write_results(prediction, confidence, num_classes, nms_conf=0.4):
    # Zero out all predictions whose objectness score is below the
    # confidence threshold (only scores > confidence survive)
    conf_mask = (prediction[:, :, 4] > confidence).float().unsqueeze(2)
    prediction = prediction * conf_mask

    # Convert (center_x, center_y, w, h) to corner coordinates
    # (top-left x1, y1 and bottom-right x2, y2)
    box_corner = prediction.new(prediction.shape)
    box_corner[:, :, 0] = prediction[:, :, 0] - prediction[:, :, 2] / 2
    box_corner[:, :, 1] = prediction[:, :, 1] - prediction[:, :, 3] / 2
    box_corner[:, :, 2] = prediction[:, :, 0] + prediction[:, :, 2] / 2
    box_corner[:, :, 3] = prediction[:, :, 1] + prediction[:, :, 3] / 2
    prediction[:, :, :4] = box_corner[:, :, :4]

    batch_size = prediction.size(0)
    write = False

    # Process the images in the batch one by one
    for ind in range(batch_size):
        image_pred = prediction[ind]  # image Tensor
        # torch.max returns two values: the maxima and their indices
        max_conf, max_conf_score = torch.max(image_pred[:, 5:5 + num_classes], 1)
        max_conf = max_conf.float().unsqueeze(1)
        max_conf_score = max_conf_score.float().unsqueeze(1)
        seq = (image_pred[:, :5], max_conf, max_conf_score)
        image_pred = torch.cat(seq, 1)

        # Drop the rows that were zeroed out above
        non_zero_ind = torch.nonzero(image_pred[:, 4])
        try:
            image_pred_ = image_pred[non_zero_ind.squeeze(), :].view(-1, 7)
        except:
            continue
        if image_pred_.shape[0] == 0:
            continue

        # Get the various classes detected in the image,
        # then run NMS separately for each class
        img_classes = unique(image_pred_[:, -1])  # -1 index holds the class index

        for cls in img_classes:
            # Get the detections of this particular class
            cls_mask = image_pred_ * (image_pred_[:, -1] == cls).float().unsqueeze(1)
            class_mask_ind = torch.nonzero(cls_mask[:, -2]).squeeze()
            image_pred_class = image_pred_[class_mask_ind].view(-1, 7)

            # Sort the detections so that the entry with the maximum objectness
            # confidence is at the top, then loop starting from it
            conf_sort_index = torch.sort(image_pred_class[:, 4], descending=True)[1]
            image_pred_class = image_pred_class[conf_sort_index]
            idx = image_pred_class.size(0)  # number of detections

            for i in range(idx):
                # Get the IoUs of all boxes that come after the one
                # we are looking at in the loop
                try:
                    ious = bbox_iou(image_pred_class[i].unsqueeze(0),
                                    image_pred_class[i + 1:])
                except ValueError:
                    break
                except IndexError:
                    break

                # Zero out all detections with IoU > threshold, then remove them
                iou_mask = (ious < nms_conf).float().unsqueeze(1)
                image_pred_class[i + 1:] *= iou_mask

                # Remove the non-zero entries
                non_zero_ind = torch.nonzero(image_pred_class[:, 4]).squeeze()
                image_pred_class = image_pred_class[non_zero_ind].view(-1, 7)

            # Repeat the batch index for every surviving detection of class cls
            batch_ind = image_pred_class.new(image_pred_class.size(0), 1).fill_(ind)
            seq = batch_ind, image_pred_class

            # Finally, concatenate all outputs together
            if not write:
                output = torch.cat(seq, 1)
                write = True
            else:
                out = torch.cat(seq, 1)
                output = torch.cat((output, out))

    # output only exists if at least one detection survived
    try:
        return output
    except:
        return 0
```
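The function above depends on the helpers `bbox_iou` and `unique`, which are defined elsewhere in the tutorial repo. To see the suppression loop in isolation, here is a self-contained sketch of greedy single-class NMS with a minimal IoU helper; these are my own simplified versions for illustration, not the tutorial's exact code:

```python
import torch

def bbox_iou(box1, box2):
    # IoU between one box (1, 4) and N boxes (N, 4),
    # in corner format (x1, y1, x2, y2)
    x1 = torch.max(box1[:, 0], box2[:, 0])
    y1 = torch.max(box1[:, 1], box2[:, 1])
    x2 = torch.min(box1[:, 2], box2[:, 2])
    y2 = torch.min(box1[:, 3], box2[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area1 = (box1[:, 2] - box1[:, 0]) * (box1[:, 3] - box1[:, 1])
    area2 = (box2[:, 2] - box2[:, 0]) * (box2[:, 3] - box2[:, 1])
    return inter / (area1 + area2 - inter)

def nms_single_class(boxes, scores, nms_conf=0.4):
    # Greedy NMS: keep the highest-scoring box, drop boxes whose
    # IoU with it exceeds nms_conf, repeat on the remainder
    order = torch.sort(scores, descending=True)[1]
    boxes, scores = boxes[order], scores[order]
    keep = []
    while boxes.size(0) > 0:
        keep.append((boxes[0], scores[0]))
        if boxes.size(0) == 1:
            break
        mask = bbox_iou(boxes[0:1], boxes[1:]) < nms_conf
        boxes, scores = boxes[1:][mask], scores[1:][mask]
    return keep

# Two heavily overlapping boxes plus one far away:
# the lower-scoring overlap is suppressed, so two boxes survive
boxes = torch.tensor([[0., 0., 10., 10.],
                      [1., 1., 11., 11.],
                      [50., 50., 60., 60.]])
scores = torch.tensor([0.9, 0.8, 0.7])
kept = nms_single_class(boxes, scores)
print(len(kept))  # 2
```

The `write_results` version does the same thing in place by multiplying suppressed rows to zero and filtering with `torch.nonzero`, rather than slicing with a boolean mask.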