Samples
Overview
A pothole detection dataset containing 665 images, split into a training set (532) and a test set (133).
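Data Exploration
annotated-images -- all pothole images and their corresponding label files (XML); bounding boxes are given as [xmin, ymin, xmax, ymax]
splits.json -- lists which images are used for training and which for testing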
{
"train": ["img-110.xml", "img-578.xml", "img-455.xml", ...],
"test": ["img-565.xml", "img-498.xml", "img-143.xml", ...]
}
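As a quick sanity check, the split file can be loaded and its sizes compared with the counts given in the overview. The following is a minimal sketch: `load_splits` and `split_sizes` are hypothetical helper names introduced here for illustration, not part of the dataset.

```python
import json


def load_splits(path):
    """Read splits.json and return its contents as a dict of lists."""
    with open(path, encoding="utf-8") as f:
        # json.load parses the whole file, so it works whether the JSON
        # is stored on one line or spread over several.
        return json.load(f)


def split_sizes(splits):
    """Map each split name to the number of annotation files it lists."""
    return {name: len(files) for name, files in splits.items()}
```

If the file matches the overview, `split_sizes(load_splits("splits.json"))` should report 532 entries for "train" and 133 for "test".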
Data Initialization
You only need to change root_path to your local dataset directory to obtain the bounding-box labels for every image.
import json
import os
import xml.dom.minidom

# Change this to the directory that contains annotated-images/ and splits.json
root_path = "."

# Load the contents of splits.json
with open(os.path.join(root_path, "splits.json"), encoding="utf-8") as f:
    splits = json.load(f)

# Collect the bounding-box labels for every image
for split_name in splits:
    for label_file in splits[split_name]:
        image_dir = os.path.join(root_path, "annotated-images")
        img_path = os.path.join(image_dir, os.path.splitext(label_file)[0] + ".jpg")
        image_label_path = os.path.join(image_dir, label_file)
        dom_tree = xml.dom.minidom.parse(image_label_path)
        collection = dom_tree.documentElement
        objects = collection.getElementsByTagName("object")
        labels = []
        for obj in objects:
            category = obj.getElementsByTagName("name")[0].childNodes[0].data
            bndbox = obj.getElementsByTagName("bndbox")[0]
            # Coordinates in [xmin, ymin, xmax, ymax] order, followed by the category
            tmp = [
                float(bndbox.getElementsByTagName(tag)[0].childNodes[0].data)
                for tag in ("xmin", "ymin", "xmax", "ymax")
            ]
            tmp.append(category)
            labels.append(tmp)
        # labels now holds one [xmin, ymin, xmax, ymax, category] entry per object in img_path
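The same labels can also be read with `xml.etree.ElementTree` from the standard library, which is usually more compact than minidom. This is only a sketch, assuming each annotation file follows the layout used in the loop above (`object` elements that contain a `name` and a `bndbox` with `xmin`, `ymin`, `xmax`, `ymax`). `parse_annotation` and `box_to_xywh` are hypothetical helpers introduced for illustration.

```python
import xml.etree.ElementTree as ET


def parse_annotation(xml_path):
    """Return one [xmin, ymin, xmax, ymax, category] list per object in the file."""
    root = ET.parse(xml_path).getroot()
    labels = []
    for obj in root.iter("object"):
        # Assumes <name> and <bndbox> are direct children of <object>
        category = obj.findtext("name")
        bndbox = obj.find("bndbox")
        box = [float(bndbox.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")]
        labels.append(box + [category])
    return labels


def box_to_xywh(box):
    """Convert [xmin, ymin, xmax, ymax] to [x, y, width, height]."""
    xmin, ymin, xmax, ymax = box[:4]
    return [xmin, ymin, xmax - xmin, ymax - ymin]
```

`box_to_xywh` is included because some detection tools expect a top-left corner plus width and height rather than two corner points.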
License
Open Database License (ODbL) 1.0
Scan the QR code and follow the account, then reply "Pothole" to receive the dataset.