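TensorFlow 2.0: CenterNet Network Principles and Code Walkthrough (Part 2) - Data Generation
This post looks at how the ground truth is encoded in the CenterNet network.
First, train.py contains a line that calls the data generator: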
gen = Generator(Batch_size, lines[:num_train], lines[num_train:], input_shape, num_classes)
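Let's look at its parameters:
Batch_size: the batch size, which you can set yourself. For simplicity it is set to 2 here.
lines[:num_train]: the training data, for example ['C:\Users\user\Desktop\centernet-tf2/VOCdevkit/VOC2007/JPEGImages/000044.jpg 1,1,370,330,8 99,101,312,213,7', 'C:\Users\user\Desktop\centernet-tf2/VOCdevkit/VOC2007/JPEGImages/000039.jpg 156,89,344,279,19']. Each entry is an image path followed by its boxes as x1,y1,x2,y2,class_id.
lines[num_train:]: the validation (test) data; none is set aside here for now.
input_shape: the input image size; the shape here is [512,512,3].
num_classes: the total number of classes; the dataset here is VOC, so it is 20.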
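Each entry of lines appears to pack an image path followed by space-separated boxes. A minimal sketch of parsing one such entry, assuming that layout and a path without spaces (parse_annotation_line is a hypothetical helper, not a function from the repository):

```python
import numpy as np

def parse_annotation_line(line):
    """Split 'path x1,y1,x2,y2,cls ...' into (path, boxes array of shape (n, 5))."""
    parts = line.split()
    path = parts[0]
    boxes = np.array([list(map(int, box.split(","))) for box in parts[1:]], dtype=np.int32)
    return path, boxes.reshape(-1, 5)  # reshape keeps (0, 5) for images without boxes

# Shortened, hypothetical path; the boxes are the ones from the sample above
path, boxes = parse_annotation_line("JPEGImages/000044.jpg 1,1,370,330,8 99,101,312,213,7")
print(path)
print(boxes.shape)
```

The first four columns are the xyxy corners and the last column is the class id, which is why the generator later slices y[:, :4] for the boxes.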
while True:
    if train:
        # Shuffle
        shuffle(self.train_lines)
        lines = self.train_lines
    else:
        shuffle(self.val_lines)
        lines = self.val_lines
Each pass starts by randomly shuffling the input lines.
batch_images = np.zeros((self.batch_size, self.input_size[0], self.input_size[1], self.input_size[2]), dtype=np.float32)
batch_hms = np.zeros((self.batch_size, self.output_size[0], self.output_size[1], self.num_classes), dtype=np.float32)
batch_whs = np.zeros((self.batch_size, self.max_objects, 2), dtype=np.float32)
batch_regs = np.zeros((self.batch_size, self.max_objects, 2), dtype=np.float32)
batch_reg_masks = np.zeros((self.batch_size, self.max_objects), dtype=np.float32)
batch_indices = np.zeros((self.batch_size, self.max_objects), dtype=np.float32)
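Next, the arrays that will hold the ground truth are allocated. Their meanings are:
# hm_true: ground-truth heatmap (2, 128, 128, 20)
# wh_true: ground-truth width and height (2, 100, 2)
# reg_true: ground-truth center offset (2, 100, 2)
# reg_mask: mask marking which ground-truth slots are in use (2, 100)
# indices: flattened heatmap position of each ground-truth center (2, 100)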
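The shapes listed above follow from the settings used in this walkthrough. A minimal sketch, assuming an output stride of 4 (inferred from 512 / 128) and max_objects = 100 (inferred from the listed shapes); neither value is stated explicitly in this post:

```python
import numpy as np

batch_size, num_classes = 2, 20
input_size = (512, 512, 3)
stride = 4          # assumption: 512 / 128
max_objects = 100   # assumption: read off the (2, 100, ...) shapes

output_size = (input_size[0] // stride, input_size[1] // stride)

batch_hms = np.zeros((batch_size, output_size[0], output_size[1], num_classes), dtype=np.float32)
batch_whs = np.zeros((batch_size, max_objects, 2), dtype=np.float32)
batch_reg_masks = np.zeros((batch_size, max_objects), dtype=np.float32)

print(batch_hms.shape, batch_whs.shape, batch_reg_masks.shape)
```

Only the heatmap is dense; the other targets are stored per object, padded with zeros up to max_objects.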
for annotation_line in lines:
    img, y = self.get_random_data(annotation_line, self.input_size[0:2], random=train)
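Data augmentation is applied next. Let's see what the returned img and y look like:
img: shape (512,512,3). Note that images are processed one at a time.
y: the labels, with shape (labels_num, 5). Here it is (2,5), meaning this image has two ground-truth boxes; each row holds one box's xyxy coordinates plus its class id.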
if len(y) != 0:
    boxes = np.array(y[:, :4], dtype=np.float32)
    boxes[:, 0] = boxes[:, 0] / self.input_size[1] * self.output_size[1]
    boxes[:, 1] = boxes[:, 1] / self.input_size[0] * self.output_size[0]
    boxes[:, 2] = boxes[:, 2] / self.input_size[1] * self.output_size[1]
    boxes[:, 3] = boxes[:, 3] / self.input_size[0] * self.output_size[0]
Next, the ground-truth xyxy coordinates are rescaled from the (512,512) input to the (128,128) output feature map.
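As a worked example of this rescaling, the sketch below applies it to the two sample boxes from the annotation line, treating them as if they were already in 512x512 input coordinates (an assumption, since the augmentation step may have changed them):

```python
import numpy as np

input_size = (512, 512)    # (height, width) of the network input
output_size = (128, 128)   # (height, width) of the heatmap

boxes = np.array([[1, 1, 370, 330], [99, 101, 312, 213]], dtype=np.float32)
boxes[:, [0, 2]] = boxes[:, [0, 2]] / input_size[1] * output_size[1]  # x coordinates use the width
boxes[:, [1, 3]] = boxes[:, [1, 3]] / input_size[0] * output_size[0]  # y coordinates use the height
print(boxes)
```

With these sizes every coordinate is simply divided by 4, so the first box becomes (0.25, 0.25, 92.5, 82.5).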
Then each label is processed in turn:
for i in range(len(y)):
Clipping comes first: it keeps each label's xyxy within the (0,128) range of the output feature map.
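The clipping code itself is not shown in this post. One way to do it, as a sketch that assumes np.clip to the valid index range [0, size - 1] (the exact bounds used by the repository are not confirmed here, and the sample bbox is made up):

```python
import numpy as np

output_size = (128, 128)  # (height, width)
bbox = np.array([-3.0, 10.0, 130.5, 127.9], dtype=np.float32)  # hypothetical x1, y1, x2, y2

bbox[[0, 2]] = np.clip(bbox[[0, 2]], 0, output_size[1] - 1)  # x1, x2 limited by the width
bbox[[1, 3]] = np.clip(bbox[[1, 3]], 0, output_size[0] - 1)  # y1, y2 limited by the height
print(bbox)
```

Clipping matters because the integer center is later used to index the heatmap directly.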
h, w = bbox[3] - bbox[1], bbox[2] - bbox[0]
if h > 0 and w > 0:
    ct = np.array([(bbox[0] + bbox[2]) / 2, (bbox[1] + bbox[3]) / 2], dtype=np.float32)
    ct_int = ct.astype(np.int32)
This gives each label's height and width, plus its center point, both as a float (ct) and truncated to an integer (ct_int).
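To make the numbers concrete, here is the same computation on a single box that is assumed to be already scaled to the 128x128 map:

```python
import numpy as np

bbox = np.array([0.25, 0.25, 92.5, 82.5], dtype=np.float32)  # x1, y1, x2, y2
h, w = bbox[3] - bbox[1], bbox[2] - bbox[0]
ct = np.array([(bbox[0] + bbox[2]) / 2, (bbox[1] + bbox[3]) / 2], dtype=np.float32)
ct_int = ct.astype(np.int32)  # astype truncates the fractional part
print(h, w, ct, ct_int)
```

The center is stored as (x, y), so ct[0] is the column and ct[1] is the row on the heatmap.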
Heatmap encoding
'''Get the minimum Gaussian kernel radius'''
radius = gaussian_radius((math.ceil(h),   # height of the label box
                          math.ceil(w)))  # width of the label box
def gaussian_radius(det_size, min_overlap=0.7):
    height, width = det_size

    a1 = 1
    b1 = (height + width)
    c1 = width * height * (1 - min_overlap) / (1 + min_overlap)
    sq1 = np.sqrt(b1 ** 2 - 4 * a1 * c1)
    r1 = (b1 + sq1) / 2

    a2 = 4
    b2 = 2 * (height + width)
    c2 = (1 - min_overlap) * width * height
    sq2 = np.sqrt(b2 ** 2 - 4 * a2 * c2)
    r2 = (b2 + sq2) / 2

    a3 = 4 * min_overlap
    b3 = -2 * min_overlap * (height + width)
    c3 = (min_overlap - 1) * width * height
    sq3 = np.sqrt(b3 ** 2 - 4 * a3 * c3)
    r3 = (b3 + sq3) / 2
    return min(r1, r2, r3)
Reposted from: https://www.cnblogs.com/silence-cho/p/13955766.html
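A short usage sketch for the radius computation. The function below is a compact restatement of the same arithmetic, written as a loop only to keep the example self-contained. Truncating the result with max(0, int(radius)) is an assumption: this post does not show that step, but draw_gaussian slices arrays with radius, so an integer is needed at some point.

```python
import math
import numpy as np

def gaussian_radius(det_size, min_overlap=0.7):
    """Same three quadratic cases as above, expressed as (a, b, c) triples."""
    height, width = det_size
    coeffs = [
        (1, height + width, width * height * (1 - min_overlap) / (1 + min_overlap)),
        (4, 2 * (height + width), (1 - min_overlap) * width * height),
        (4 * min_overlap, -2 * min_overlap * (height + width), (min_overlap - 1) * width * height),
    ]
    return min((b + np.sqrt(b ** 2 - 4 * a * c)) / 2 for a, b, c in coeffs)

h, w = 82.25, 92.25                      # box size on the 128x128 map (example values)
radius = gaussian_radius((math.ceil(h), math.ceil(w)))
radius = max(0, int(radius))             # assumption: truncate to a non-negative integer
diameter = 2 * radius + 1
print(radius, diameter)
```

For this box the third case yields the smallest value, so it decides the radius.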
Next, the Gaussian kernel is splatted onto the heatmap:
batch_hms[b, :, :, cls_id] = draw_gaussian(batch_hms[b, :, :, cls_id],  # heatmap of the given class
                                           ct_int,                      # center point of this label box
                                           radius)                      # Gaussian kernel radius
https://zhuanlan.zhihu.com/p/96856635?utm_source=wechat_session
The kernel size (the diameter derived from the radius) and sigma are then passed to the Gaussian kernel function:
gaussian = gaussian2D((diameter, diameter), sigma=diameter / 6)
Here (diameter, diameter) = (69, 69) and sigma = 69/6 = 11.5.
def gaussian2D(shape, sigma=1):
    m, n = [(ss - 1.) / 2. for ss in shape]
    y, x = np.ogrid[-m:m + 1, -n:n + 1]

    h = np.exp(-(x * x + y * y) / (2 * sigma * sigma))
    h[h < np.finfo(h.dtype).eps * h.max()] = 0
    return h
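Note:
What np.ogrid does: like np.arange, it generates ranges of values, but it returns them as an open mesh grid. The first array runs vertically, so its second dimension always has size 1; the second array runs horizontally, so its first dimension always has size 1.
Here y and x have shapes (69,1) and (1,69), and their elements run from -34 to 34.
The resulting gaussian has shape (69,69). [Figure: the gaussian kernel]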
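The shapes described above can be checked directly. This sketch reuses the values from the text (diameter 69, sigma 69/6) and relies on NumPy broadcasting to combine the column vector y and the row vector x into a full 2D kernel:

```python
import numpy as np

diameter = 69
sigma = diameter / 6
m = (diameter - 1.) / 2.   # 34.0, the half-width of the kernel

y, x = np.ogrid[-m:m + 1, -m:m + 1]
kernel = np.exp(-(x * x + y * y) / (2 * sigma * sigma))  # (69,1) and (1,69) broadcast to (69,69)
print(y.shape, x.shape, kernel.shape)
print(kernel[34, 34])      # value at the kernel center
```

The peak sits at the center of the kernel, where x = y = 0 and the exponent is zero.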
The generated Gaussian kernel is then written into the heatmap:
def draw_gaussian(heatmap, center, radius, k=1):
    diameter = 2 * radius + 1
    gaussian = gaussian2D((diameter, diameter), sigma=diameter / 6)

    x, y = int(center[0]), int(center[1])
    height, width = heatmap.shape[0:2]

    left, right = min(x, radius), min(width - x, radius + 1)
    top, bottom = min(y, radius), min(height - y, radius + 1)

    masked_heatmap = heatmap[y - top:y + bottom, x - left:x + right]
    masked_gaussian = gaussian[radius - top:radius + bottom, radius - left:radius + right]
    if min(masked_gaussian.shape) > 0 and min(masked_heatmap.shape) > 0:  # TODO debug
        np.maximum(masked_heatmap, masked_gaussian * k, out=masked_heatmap)
    return heatmap
The resulting heatmap is the heatmap of one class for this image.
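The left/right/top/bottom arithmetic is what keeps the kernel inside the heatmap when a center lies near the border. A small sketch on a toy 8x8 heatmap with radius 2 (both values chosen only for illustration) shows the two windows ending up with the same shape:

```python
import numpy as np

heatmap = np.zeros((8, 8), dtype=np.float32)
radius = 2
x, y = 1, 6   # center close to the left and bottom borders

diameter = 2 * radius + 1
sigma = diameter / 6
yy, xx = np.ogrid[-radius:radius + 1, -radius:radius + 1]
gaussian = np.exp(-(xx * xx + yy * yy) / (2 * sigma * sigma))

height, width = heatmap.shape
left, right = min(x, radius), min(width - x, radius + 1)
top, bottom = min(y, radius), min(height - y, radius + 1)

masked_heatmap = heatmap[y - top:y + bottom, x - left:x + right]   # a view into heatmap
masked_gaussian = gaussian[radius - top:radius + bottom, radius - left:radius + right]
np.maximum(masked_heatmap, masked_gaussian, out=masked_heatmap)    # writes through the view

print(masked_heatmap.shape, masked_gaussian.shape)
print(heatmap[y, x])
```

Because np.maximum is used instead of plain assignment, overlapping objects of the same class keep the larger value at each pixel.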
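Then batch_whs, batch_regs, batch_reg_masks and batch_indices are filled in:
# Ground-truth width and height
batch_whs[b, i] = 1. * w, 1. * h
# Center offset
batch_regs[b, i] = ct - ct_int
# Set the mask to 1 so that the unused zero-padded slots can be excluded
batch_reg_masks[b, i] = 1
# Flattened position: column ct_int[0] of row ct_int[1] (the map is square, so output_size[0] also equals the row width)
batch_indices[b, i] = ct_int[1] * self.output_size[0] + ct_int[0]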
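A worked example of the offset and the flattened index, using an example center of (46.375, 41.375) on the 128x128 map. The divmod step at the end is only there to show that the index can be mapped back to a row and column; it is not taken from the repository.

```python
import numpy as np

output_size = (128, 128)                            # (height, width)
ct = np.array([46.375, 41.375], dtype=np.float32)   # (x, y) center as a float
ct_int = ct.astype(np.int32)

reg = ct - ct_int                                   # sub-pixel offset lost by truncation
index = ct_int[1] * output_size[0] + ct_int[0]      # row * row_width + column

row, col = divmod(int(index), output_size[1])       # recover (row, column) from the flat index
print(reg, int(index), (row, col))
```

The offset is always in [0, 1), and together with the integer center it reconstructs the original float center.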