TensorFlow 2.0 CenterNet: Principles and Code Walkthrough (Part 2) - Data Generation

Article link: https://blog.csdn.net/weixin_42206075/article/details/114529949


This post looks at how the ground-truth targets are encoded in CenterNet.
First, train.py contains a line that calls into the data-generation file:

gen = Generator(Batch_size, lines[:num_train], lines[num_train:], input_shape, num_classes)

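Its parameters are:

Batch_size: the batch size, which you can set yourself. For simplicity it is set to 2 here.
lines[:num_train]: the training data, ['C:\Users\user\Desktop\centernet-tf2/VOCdevkit/VOC2007/JPEGImages/000044.jpg 1,1,370,330,8 99,101,312,213,7', 'C:\Users\user\Desktop\centernet-tf2/VOCdevkit/VOC2007/JPEGImages/000039.jpg 156,89,344,279,19']. Each entry is an image path followed by one x1,y1,x2,y2,class_id group per ground-truth box.
lines[num_train:]: the test (validation) data. None is set here for now.
input_shape: the input image size; here the shape is [512, 512, 3].
num_classes: the total number of classes; this is VOC, so it is 20.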
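To make the annotation format concrete, here is a minimal sketch of how one such line could be split into a path and a box array. The helper name parse_annotation_line is hypothetical (it is not taken from the repository), and the sketch assumes whitespace-separated fields and an image path without spaces.

```python
import numpy as np

def parse_annotation_line(line):
    """Split one annotation line into an image path and an (N, 5) box array.

    Hypothetical helper: assumes fields are separated by whitespace and
    that the image path itself contains no spaces.
    """
    parts = line.split()
    image_path = parts[0]
    boxes = np.array(
        [[int(v) for v in box.split(",")] for box in parts[1:]],
        dtype=np.int32,
    ).reshape(-1, 5)  # columns: x1, y1, x2, y2, class_id
    return image_path, boxes

line = "VOCdevkit/VOC2007/JPEGImages/000044.jpg 1,1,370,330,8 99,101,312,213,7"
image_path, boxes = parse_annotation_line(line)
print(image_path)
print(boxes.shape)  # two boxes, five values each
```

The reshape(-1, 5) keeps the result two-dimensional even when a line carries no boxes at all.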

        while True:
            if train:
                # shuffle the training lines
                shuffle(self.train_lines)
                lines = self.train_lines
            else:
                shuffle(self.val_lines)
                lines = self.val_lines

The input lines are shuffled first.

batch_images = np.zeros((self.batch_size, self.input_size[0], self.input_size[1], self.input_size[2]), dtype=np.float32)
batch_hms = np.zeros((self.batch_size, self.output_size[0], self.output_size[1], self.num_classes), dtype=np.float32)
batch_whs = np.zeros((self.batch_size, self.max_objects, 2), dtype=np.float32)
batch_regs = np.zeros((self.batch_size, self.max_objects, 2), dtype=np.float32)
batch_reg_masks = np.zeros((self.batch_size, self.max_objects), dtype=np.float32)
batch_indices = np.zeros((self.batch_size, self.max_objects), dtype=np.float32)

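Next, the arrays that will hold the ground-truth targets are allocated. Their meanings are:
# hm_true: ground-truth heatmap (2, 128, 128, 20)
# wh_true: ground-truth width and height (2, 100, 2)
# reg_true: ground-truth center offset (2, 100, 2)
# reg_mask: mask marking which slots hold a real object (2, 100)
# indices: flattened feature-map position of each ground-truth center (2, 100)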

for annotation_line in lines:  
    img,y = self.get_random_data(annotation_line,self.input_size[0:2],random=train)

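Data augmentation is then applied. Let's look at what the returned img and y contain.
img: shape (512, 512, 3). Note that images are processed one at a time.
y: the labels, with shape (labels_num, 5). Here it is (2, 5), meaning this image has two ground-truth boxes; each row holds one box's xyxy coordinates plus its class id.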

if len(y)!=0:
   boxes = np.array(y[:,:4],dtype=np.float32)
   boxes[:,0] = boxes[:,0]/self.input_size[1]*self.output_size[1]
   boxes[:,1] = boxes[:,1]/self.input_size[0]*self.output_size[0]
   boxes[:,2] = boxes[:,2]/self.input_size[1]*self.output_size[1]
   boxes[:,3] = boxes[:,3]/self.input_size[0]*self.output_size[0]

Then the ground-truth xyxy coordinates are rescaled from the 512×512 input to the (128, 128) output feature map.

Next, each label is processed in turn:
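As a worked example of this rescaling, the sketch below applies the same arithmetic to two boxes. The function name scale_boxes and the box values are illustrative assumptions; both sizes are given as (height, width).

```python
import numpy as np

def scale_boxes(boxes, input_size, output_size):
    """Rescale xyxy boxes from input pixels to feature-map cells.

    input_size and output_size are (height, width) pairs.
    """
    scaled = np.array(boxes, dtype=np.float32)
    scaled[:, [0, 2]] *= output_size[1] / input_size[1]  # x coordinates use the width
    scaled[:, [1, 3]] *= output_size[0] / input_size[0]  # y coordinates use the height
    return scaled

# Illustrative boxes in 512x512 input coordinates.
boxes = np.array([[1, 1, 370, 330], [99, 101, 312, 213]], dtype=np.float32)
scaled = scale_boxes(boxes, (512, 512), (128, 128))
print(scaled)
```

With a 512 input and a 128 output every coordinate is multiplied by 0.25, so 370 becomes 92.5 and 330 becomes 82.5.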

for i in range(len(y)):

First the box is clipped, so that every label's xyxy stays within the bounds of the 128×128 feature map.
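The clipping code itself is not shown in this post. A minimal sketch of one way to do it with np.clip follows; the helper name clip_box is hypothetical, and the upper bound of size - 1 is an assumption chosen so that the truncated center is always a valid feature-map index.

```python
import numpy as np

def clip_box(bbox, output_size):
    """Clip one xyxy box to a feature map of size (height, width).

    Assumption: coordinates are limited to [0, size - 1].
    """
    bbox = np.array(bbox, dtype=np.float32)
    bbox[[0, 2]] = np.clip(bbox[[0, 2]], 0, output_size[1] - 1)  # x1, x2
    bbox[[1, 3]] = np.clip(bbox[[1, 3]], 0, output_size[0] - 1)  # y1, y2
    return bbox

clipped = clip_box([-3.0, 10.0, 140.0, 90.0], (128, 128))
print(clipped)
```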

h, w = bbox[3] - bbox[1], bbox[2] - bbox[0]
if h > 0 and w > 0:
   ct = np.array([(bbox[0] + bbox[2]) / 2, (bbox[1] + bbox[3]) / 2], dtype=np.float32)
   ct_int = ct.astype(np.int32)

This gives each label's height and width and its center coordinates, along with the integer (truncated) center.
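A short worked example may help here. It runs the same expressions on one illustrative box (the values are made up, already in feature-map coordinates) and also shows the sub-cell offset that truncation discards.

```python
import numpy as np

# Illustrative box in 128x128 feature-map coordinates: x1, y1, x2, y2.
bbox = np.array([0.25, 0.25, 92.5, 82.5], dtype=np.float32)

h, w = bbox[3] - bbox[1], bbox[2] - bbox[0]
ct = np.array([(bbox[0] + bbox[2]) / 2, (bbox[1] + bbox[3]) / 2], dtype=np.float32)
ct_int = ct.astype(np.int32)   # truncates toward zero
offset = ct - ct_int           # fractional part lost by truncation

print(h, w)      # box height and width
print(ct)        # (x, y) center
print(ct_int)    # integer cell containing the center
print(offset)
```

The center is stored as (x, y), so ct_int[0] is a column index and ct_int[1] is a row index.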

Heatmap encoding

# Get the minimum Gaussian radius
radius = gaussian_radius((math.ceil(h),   # height of the label box
                          math.ceil(w)))  # width of the label box

def gaussian_radius(det_size, min_overlap=0.7):
    height, width = det_size

    a1 = 1
    b1 = (height + width)
    c1 = width * height * (1 - min_overlap) / (1 + min_overlap)
    sq1 = np.sqrt(b1 ** 2 - 4 * a1 * c1)
    r1 = (b1 + sq1) / 2

    a2 = 4
    b2 = 2 * (height + width)
    c2 = (1 - min_overlap) * width * height
    sq2 = np.sqrt(b2 ** 2 - 4 * a2 * c2)
    r2 = (b2 + sq2) / 2

    a3 = 4 * min_overlap
    b3 = -2 * min_overlap * (height + width)
    c3 = (min_overlap - 1) * width * height
    sq3 = np.sqrt(b3 ** 2 - 4 * a3 * c3)
    r3 = (b3 + sq3) / 2
    return min(r1, r2, r3)


Reproduced from: https://www.cnblogs.com/silence-cho/p/13955766.html

Then the Gaussian kernel is computed and drawn onto the heatmap:

batch_hms[b, :, :, cls_id] = draw_gaussian(batch_hms[b, :, :, cls_id],  # heatmap of this class
                                           ct_int,   # center of this label box
                                           radius)   # Gaussian radius


https://zhuanlan.zhihu.com/p/96856635?utm_source=wechat_session

The diameter derived from the Gaussian radius, together with sigma, is passed to the function that computes the kernel:

gaussian = gaussian2D((diameter, diameter), sigma=diameter / 6)

Here (diameter, diameter) = (69, 69) and sigma = 69 / 6 = 11.5.

def gaussian2D(shape, sigma=1):
    m, n = [(ss - 1.) / 2. for ss in shape]
    y, x = np.ogrid[-m:m + 1, -n:n + 1]

    h = np.exp(-(x * x + y * y) / (2 * sigma * sigma))
    h[h < np.finfo(h.dtype).eps * h.max()] = 0
    return h

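Note:
np.ogrid is similar to np.arange in that it builds arrays from ranges, but it returns an open grid: the first array runs vertically, so its second dimension always has size 1, and the second array runs horizontally, so its first dimension always has size 1.
Here the generated y and x have shapes (69, 1) and (1, 69), and their elements run from -34 to 34.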

The resulting gaussian has shape (69, 69).
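The shapes and values described above can be checked directly. The sketch below repeats gaussian2D so that it runs on its own and inspects the 69×69 case; the variable names outside the function are illustrative.

```python
import numpy as np

def gaussian2D(shape, sigma=1):
    m, n = [(ss - 1.) / 2. for ss in shape]
    y, x = np.ogrid[-m:m + 1, -n:n + 1]
    h = np.exp(-(x * x + y * y) / (2 * sigma * sigma))
    h[h < np.finfo(h.dtype).eps * h.max()] = 0
    return h

diameter = 69
m = (diameter - 1.) / 2.
y, x = np.ogrid[-m:m + 1, -m:m + 1]   # open grid: column vector and row vector
kernel = gaussian2D((diameter, diameter), sigma=diameter / 6)

print(y.shape, x.shape)   # broadcasting x * x + y * y gives the full grid
print(kernel.shape)
print(kernel[34, 34])     # peak value at the center
print(kernel[34, 0])      # value at the edge, one radius away from the center
```

Because sigma is diameter / 6, the edge of the kernel sits roughly three standard deviations from the center, so the values there are already close to zero.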

The generated Gaussian kernel is then written into the heatmap:

def draw_gaussian(heatmap, center, radius, k=1):
    diameter = 2 * radius + 1
    gaussian = gaussian2D((diameter, diameter), sigma=diameter / 6)

    x, y = int(center[0]), int(center[1])

    height, width = heatmap.shape[0:2]

    left, right = min(x, radius), min(width - x, radius + 1)
    top, bottom = min(y, radius), min(height - y, radius + 1)

    masked_heatmap = heatmap[y - top:y + bottom, x - left:x + right]
    masked_gaussian = gaussian[radius - top:radius + bottom, radius - left:radius + right]
    if min(masked_gaussian.shape) > 0 and min(masked_heatmap.shape) > 0:  # TODO debug
        np.maximum(masked_heatmap, masked_gaussian * k, out=masked_heatmap)
    return heatmap

The resulting heatmap is the heatmap for one class of this image.

Then batch_whs, batch_regs, batch_reg_masks and batch_indices are computed:
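To see how draw_gaussian behaves at the image border and where two objects of the same class overlap, here is a small self-contained sketch. The two functions are repeated from above; the centers and radii are made-up values.

```python
import numpy as np

def gaussian2D(shape, sigma=1):
    m, n = [(ss - 1.) / 2. for ss in shape]
    y, x = np.ogrid[-m:m + 1, -n:n + 1]
    h = np.exp(-(x * x + y * y) / (2 * sigma * sigma))
    h[h < np.finfo(h.dtype).eps * h.max()] = 0
    return h

def draw_gaussian(heatmap, center, radius, k=1):
    diameter = 2 * radius + 1
    gaussian = gaussian2D((diameter, diameter), sigma=diameter / 6)
    x, y = int(center[0]), int(center[1])
    height, width = heatmap.shape[0:2]
    # Shrink the window where the kernel would fall outside the heatmap.
    left, right = min(x, radius), min(width - x, radius + 1)
    top, bottom = min(y, radius), min(height - y, radius + 1)
    masked_heatmap = heatmap[y - top:y + bottom, x - left:x + right]
    masked_gaussian = gaussian[radius - top:radius + bottom, radius - left:radius + right]
    if min(masked_gaussian.shape) > 0 and min(masked_heatmap.shape) > 0:
        # Element-wise maximum, written in place through the view.
        np.maximum(masked_heatmap, masked_gaussian * k, out=masked_heatmap)
    return heatmap

heatmap = np.zeros((128, 128), dtype=np.float32)
draw_gaussian(heatmap, (2, 3), 5)      # kernel is cut off at the top-left border
draw_gaussian(heatmap, (60, 64), 10)   # two nearby objects of the same class
draw_gaussian(heatmap, (66, 64), 10)

print(heatmap[3, 2], heatmap[64, 60], heatmap[64, 66])  # each center is a peak
print(heatmap.max())
```

Using np.maximum rather than addition means overlapping kernels never push a value above 1, and each object's center keeps its peak.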

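# ground-truth width and height
batch_whs[b, i] = 1. * w, 1. * h
# offset between the float center and its integer cell
batch_regs[b, i] = ct - ct_int
# set the mask to 1 for this slot, so the unused zero-filled slots can be ignored
batch_reg_masks[b, i] = 1
# flattened index of row ct_int[1], column ct_int[0]
# (the feature map is square here, so output_size[0] also equals its width)
batch_indices[b, i] = ct_int[1] * self.output_size[0] + ct_int[0]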
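The flattened index and the offset together are enough to recover the original center. The round trip below is a minimal sketch with made-up values; it multiplies the row by the feature-map width, which is the same number as output_size[0] for the square 128×128 map used here.

```python
import numpy as np

output_size = (128, 128)  # (height, width) of the feature map
ct = np.array([46.375, 41.375], dtype=np.float32)  # illustrative (x, y) center
ct_int = ct.astype(np.int32)

# Encode: row-major position of the cell plus the sub-cell offset.
index = int(ct_int[1]) * output_size[1] + int(ct_int[0])
reg = ct - ct_int

# Decode: split the index back into row and column, then add the offset.
row, col = divmod(index, output_size[1])
recovered = np.array([col, row], dtype=np.float32) + reg

print(index)
print(row, col)
print(recovered)
```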
