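While testing different document layout analysis models, I found that some of them save only COCO-format annotations and do not export any visualization of their results, which makes it hard to compare outputs side by side. To make that easier, I put together a script for visualizing COCO-format image data.
The script draws each bbox and its label on the corresponding image, and shows only the results whose score passes a threshold. The code is as follows: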
# Draw the boxes and labels from a COCO-format JSON file onto the images
import json
import os
from collections import defaultdict

import cv2

# Mapping from category id to label name
LABEL_MAP = {
    0: "_background_",
    1: "Text",
    2: "Title",
    3: "Figure",
    4: "Figure caption",
    5: "Table",
    6: "Table caption",
    7: "Header",
    8: "Footer",
    9: "Reference",
    10: "Equation",
}


def select(json_path, outpath, image_path, score_thresh=0.2):
    """Draw every box whose score is at least score_thresh and save the images."""
    with open(json_path) as json_file:
        infos = json.load(json_file)
    images = infos["images"]
    annos = infos["annotations"]

    # Group annotations by image id once instead of rescanning the list per image
    annos_by_image = defaultdict(list)
    for anno in annos:
        annos_by_image[anno["image_id"]].append(anno)

    os.makedirs(outpath, exist_ok=True)
    font = cv2.FONT_HERSHEY_SIMPLEX
    color = (255, 0, 0)  # OpenCV uses BGR order, so this is blue

    for i, image in enumerate(images):
        im_path = os.path.join(image_path, image["file_name"])
        img = cv2.imread(im_path)
        if img is None:
            # cv2.imread returns None rather than raising when a file cannot be read
            print(f"skipping unreadable image: {im_path}")
            continue
        for anno in annos_by_image.get(image["id"], []):
            # Annotations without a score (e.g. ground truth) are always drawn
            if anno.get("score", 1.0) < score_thresh:
                continue
            x, y, w, h = (int(v) for v in anno["bbox"])  # COCO bbox is [x, y, w, h]
            x2, y2 = x + w, y + h
            object_id = anno["category_id"]
            # Fall back to the raw id so an unknown category does not crash getTextSize
            label = LABEL_MAP.get(object_id, str(object_id))
            (text_w, text_h), _ = cv2.getTextSize(label, font, 1, 2)
            cv2.rectangle(img, (x, y), (x2, y2), color, thickness=2)
            # Outline around the label text, sitting just above the box
            cv2.rectangle(img, (x, y - text_h), (x + text_w, y), color, thickness=2)
            cv2.putText(img, label, (x, y - 5), font, 1, color, 2)
        cv2.imwrite(os.path.join(outpath, image["file_name"]), img)
        print(i)  # progress: index of the image just written


if __name__ == "__main__":
    train_json = r"/mnt/md0/unilm-master/layoutlmv3/examples/object_detection/path/to/data/PubLayNet/publaynet/test.json"
    train_path = r"/mnt/md0/zhangfanhao/unilm-master/layoutlmv3/examples/object_detection/path/to/data/PubLayNet/publaynet/test/"
    visual_output = r"/mnt/md0/unilm-master/layoutlmv3/examples/object_detection/path/to/data/result"
    select(train_json, visual_output, train_path)
With the code above, the image data can be visualized.
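For reference, the script reads only a handful of fields from the JSON file. Below is a minimal sketch of that layout, written as a Python dict. The file name, box coordinates and scores are made-up placeholders, and `missing_fields` is a hypothetical helper added here for illustration; it is not part of the script. `score` is the value the threshold filter looks at; detection outputs carry it, while ground-truth files usually do not.

```python
import json

# Minimal COCO-style structure with the fields the visualization script reads.
# All values are illustrative placeholders, not real data.
example = {
    "images": [
        {"id": 1, "file_name": "page_0001.jpg"},
    ],
    "annotations": [
        # bbox is [x, y, width, height] in pixels (COCO convention)
        {"image_id": 1, "category_id": 2, "bbox": [50.0, 40.0, 300.0, 32.0], "score": 0.97},
        {"image_id": 1, "category_id": 1, "bbox": [50.0, 90.0, 300.0, 400.0], "score": 0.12},
    ],
}


def missing_fields(infos):
    """Hypothetical helper: return the sorted list of required keys that are absent."""
    problems = []
    for key in ("images", "annotations"):
        if key not in infos:
            problems.append(key)
    for img in infos.get("images", []):
        for key in ("id", "file_name"):
            if key not in img:
                problems.append(f"images[].{key}")
    for anno in infos.get("annotations", []):
        for key in ("image_id", "category_id", "bbox"):
            if key not in anno:
                problems.append(f"annotations[].{key}")
    return sorted(set(problems))


if __name__ == "__main__":
    # An empty list means every field the script needs is present
    print(missing_fields(example))
    print(json.dumps(example, indent=2))
```

Running a check like this on a model's output before visualizing can catch files that contain only a bare list of detections without the `images` section.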
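Because the model outputs every candidate box, many boxes end up overlapping, which clutters the visualization. A simple filter on score reduces the overlap.
Further refinements are possible, however, such as using IoU or NMS to remove overlapping boxes.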
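As one possible way to do this, here is a minimal sketch of greedy non-maximum suppression (NMS) built on IoU, in pure Python. It assumes the annotations of a single image are passed in, each with a COCO-style `bbox` (`[x, y, w, h]`) and a `score`. The function names `iou_xywh` and `nms`, the class-agnostic behavior, and the 0.5 IoU threshold are illustrative choices, not something taken from the script above.

```python
def iou_xywh(a, b):
    """Intersection over union of two boxes in COCO [x, y, w, h] format."""
    ax1, ay1, aw, ah = a
    bx1, by1, bw, bh = b
    ax2, ay2 = ax1 + aw, ay1 + ah
    bx2, by2 = bx1 + bw, by1 + bh
    # Width and height of the overlap, clamped at zero for disjoint boxes
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def nms(annos, iou_thresh=0.5):
    """Greedy, class-agnostic NMS over the annotations of one image.

    Visits boxes from highest to lowest score and keeps a box only if its
    IoU with every box kept so far does not exceed iou_thresh.
    """
    ordered = sorted(annos, key=lambda a: a["score"], reverse=True)
    kept = []
    for cand in ordered:
        if all(iou_xywh(cand["bbox"], k["bbox"]) <= iou_thresh for k in kept):
            kept.append(cand)
    return kept


if __name__ == "__main__":
    demo = [
        {"bbox": [0, 0, 100, 100], "score": 0.9},
        {"bbox": [5, 5, 100, 100], "score": 0.8},    # heavily overlaps the first box
        {"bbox": [200, 200, 50, 50], "score": 0.3},  # disjoint from both
    ]
    for anno in nms(demo):
        print(anno)
```

In the visualization script, this would be applied to each image's annotations before drawing. Whether suppression should be class-agnostic, as here, or done per category depends on the model: for layout analysis, a caption box legitimately sitting inside a figure box would argue for per-category suppression.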