Original code: GitHub - YunYang1994/tensorflow-yolov3
1. First, the original image_preporcess function
import cv2
import numpy as np

def image_preporcess(image, target_size, gt_boxes=None):
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB).astype(np.float32)

    ih, iw = target_size
    h, w, _ = image.shape

    scale = min(iw/w, ih/h)
    nw, nh = int(scale * w), int(scale * h)
    image_resized = cv2.resize(image, (nw, nh))

    image_paded = np.full(shape=[ih, iw, 3], fill_value=128.0)
    dw, dh = (iw - nw) // 2, (ih-nh) // 2
    image_paded[dh:nh+dh, dw:nw+dw, :] = image_resized
    image_paded = image_paded / 255.

    if gt_boxes is None:
        return image_paded
    else:
        gt_boxes[:, [0, 2]] = gt_boxes[:, [0, 2]] * scale + dw
        gt_boxes[:, [1, 3]] = gt_boxes[:, [1, 3]] * scale + dh
        return image_paded, gt_boxes
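The gt_boxes branch applies the same scale and padding offset to the box coordinates: columns 0 and 2 are x values, columns 1 and 3 are y values. A minimal sketch of that mapping follows. The helper name `map_boxes`, the box values, and the class id in column 4 are made up for illustration and are not taken from the repository.

```python
import numpy as np

def map_boxes(gt_boxes, scale, dw, dh):
    """Hypothetical helper: apply the resize scale and padding offset to boxes."""
    # Work on a float copy so the caller's array is left untouched.
    boxes = np.asarray(gt_boxes, dtype=np.float32).copy()
    boxes[:, [0, 2]] = boxes[:, [0, 2]] * scale + dw  # xmin, xmax
    boxes[:, [1, 3]] = boxes[:, [1, 3]] * scale + dh  # ymin, ymax
    return boxes

# Made-up example: a 400 x 800 (height x width) image placed into 416 x 416.
h, w = 400, 800
ih, iw = 416, 416
scale = min(iw / w, ih / h)
nw, nh = int(scale * w), int(scale * h)
dw, dh = (iw - nw) // 2, (ih - nh) // 2

boxes = np.array([[100, 50, 300, 250, 0]])  # xmin, ymin, xmax, ymax, class id
mapped = map_boxes(boxes, scale, dw, dh)
print(mapped)
```

The float copy is a choice made in this sketch. The original function writes the result back into gt_boxes in place, so if gt_boxes has an integer dtype, the scaled coordinates are truncated to integers on assignment.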
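2. Testing the function
I found a landscape photo online and resized it to 400 (height) x 800 (width):
The following code processes that photo, with the target output size set to [416, 416]: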
if __name__ == "__main__":
    img = cv2.imread('./landscape.jpg')
    print('===> img.shape: ', img.shape)  # height=400, width=800
    target_size = [416, 416]
    processed_img = image_preporcess(img, target_size)
    print('===> processed_img.shape: ', processed_img.shape)

    import matplotlib.pyplot as plt
    plt.imshow(processed_img)
    plt.show()
Printed shapes of the original and processed images:
The processed image:
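For a 400 x 800 input and a [416, 416] target, the numbers behind this result can be worked out without loading any image. A minimal sketch, using a hypothetical helper `letterbox_geometry` that only repeats the arithmetic from image_preporcess:

```python
def letterbox_geometry(h, w, target_size):
    """Hypothetical helper mirroring the size arithmetic in image_preporcess."""
    ih, iw = target_size
    scale = min(iw / w, ih / h)
    nw, nh = int(scale * w), int(scale * h)
    dw, dh = (iw - nw) // 2, (ih - nh) // 2
    return scale, (nh, nw), (dh, dw)

scale, (nh, nw), (dh, dw) = letterbox_geometry(400, 800, [416, 416])
print(scale)    # 0.52
print(nh, nw)   # 208 416
print(dh, dw)   # 104 0
```

So the photo is shrunk to 208 (height) x 416 (width), and 104 rows of gray padding are added above and below it to fill the 416 x 416 output.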
3. Understanding the code
def image_preporcess(image, target_size, gt_boxes=None):
    # Images read with OpenCV are in BGR order, so convert them to RGB first
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB).astype(np.float32)

    ih, iw = target_size
    h, w, _ = image.shape

    # Suppose the target size is 416 x 416. Put simply, this step scales the
    # original image so that it just fits inside the 416 x 416 box: the longer
    # side becomes 416 and the shorter side follows from the original aspect ratio
    scale = min(iw/w, ih/h)
    nw, nh = int(scale * w), int(scale * h)
    image_resized = cv2.resize(image, (nw, nh))

    # Create a 416 x 416 canvas with every pixel set to a fixed fill value (128.0)
    image_paded = np.full(shape=[ih, iw, 3], fill_value=128.0)
    # To center the resized image on the 416 x 416 canvas, compute the offsets dw, dh
    dw, dh = (iw - nw) // 2, (ih-nh) // 2
    # Locate the matching region on the canvas and copy the resized image into it
    image_paded[dh:nh+dh, dw:nw+dw, :] = image_resized
    # Finally, normalize the pixel values to the range [0, 1]
    image_paded = image_paded / 255.
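The padding, centering, and normalization steps can be checked on their own with NumPy. The sketch below assumes the sizes from the 400 x 800 example and uses an all-white array as a stand-in for the output of cv2.resize, so it does not call OpenCV at all. The variable name `canvas` is chosen here for illustration.

```python
import numpy as np

ih, iw = 416, 416   # target canvas size
nh, nw = 208, 416   # resized image size from the 400 x 800 example
dh, dw = (ih - nh) // 2, (iw - nw) // 2

# Stand-in for the cv2.resize output: an all-white float image.
image_resized = np.full((nh, nw, 3), 255.0, dtype=np.float32)

# Gray canvas, paste the image in the center, then scale values to [0, 1].
canvas = np.full(shape=[ih, iw, 3], fill_value=128.0)
canvas[dh:nh + dh, dw:nw + dw, :] = image_resized
canvas = canvas / 255.

print(canvas.shape)        # (416, 416, 3)
print(canvas[0, 0, 0])     # a padding pixel, equal to 128 / 255
print(canvas[dh, 0, 0])    # first row of the pasted image, 1.0
```

Because the result is a float array with values in [0, 1], plt.imshow can display it directly as an RGB image.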