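1. Introduction
My object detection accuracy has plateaued recently. Reading other people's write-ups, I found that the anchor parameters can be tuned: the number and types of anchors generated by the RPN largely determine detection accuracy, and the closer the anchors are to the targets in size and shape, the higher the accuracy. So we need to collect statistics on the area and aspect ratio of the target regions in our own data.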
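To see why these two statistics are the ones worth collecting, it helps to recall how an anchor is commonly parameterised: by a scale (the square root of its area) and an aspect ratio. The sketch below assumes the ratio means height / width; some detection frameworks use width / height instead, so check the convention of yours. `anchor_size` is a hypothetical helper name used only for illustration.

```python
import math


def anchor_size(scale, ratio):
    """Return (width, height) of an anchor with area scale**2 and height/width == ratio.

    Assumption: ratio means height / width; some frameworks use the inverse.
    """
    width = scale / math.sqrt(ratio)
    height = scale * math.sqrt(ratio)
    return width, height


# A 128-pixel-scale anchor at three aspect ratios
for r in (0.5, 1.0, 2.0):
    w, h = anchor_size(128, r)
    print(f"ratio={r}: width={w:.1f}, height={h:.1f}, area={w * h:.0f}")
```

Whatever the ratio, the area stays at scale squared, which is why area and aspect ratio can be measured separately on the labelled boxes.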
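2. Materials and methods
The code works in four steps:
(1) Iterate over the xml files in the folder
(2) Locate the four coordinate values xmin, xmax, ymin, ymax
(3) Compute the area and aspect ratio (height / width) and collect them in lists
(4) Plot a histogram of each list
Emm, not much more to say, so here's the code.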
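# -*- coding: utf-8 -*-
"""
Created on Sun Jan 10 21:48:48 2021
@author: YaoYee
"""
import os
# xml.etree.cElementTree was removed in Python 3.9; ElementTree is the drop-in replacement
import xml.etree.ElementTree as et
import numpy as np
import matplotlib.pyplot as plt
from tqdm import tqdm

# Folder that holds the annotation xml files
path = "C:/Users/YaoYee/Desktop/Annotations"
files = os.listdir(path)
area_list = []
ratio_list = []


def file_extension(path):
    """Return the extension of a file name, including the leading dot."""
    return os.path.splitext(path)[1]

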
for xmlFile in tqdm(files, desc='Processing'):
    # Join with the folder so the check does not depend on the working directory
    if not os.path.isdir(os.path.join(path, xmlFile)):
        if file_extension(xmlFile) == '.xml':
            tree = et.parse(os.path.join(path, xmlFile))
            root = tree.getroot()
            # print("--Filename is", xmlFile)
            for Object in root.findall('object'):
                bndbox = Object.find('bndbox')
                xmin = int(bndbox.find('xmin').text)
                ymin = int(bndbox.find('ymin').text)
                xmax = int(bndbox.find('xmax').text)
                ymax = int(bndbox.find('ymax').text)
                width = xmax - xmin
                height = ymax - ymin
                # Skip degenerate boxes, which would cause a division by zero
                if width <= 0 or height <= 0:
                    continue
                area = height * width
                area_list.append(area)
                # print("Area is", area)
                ratio = height / width
                ratio_list.append(ratio)
                # print("Ratio is", round(ratio,2))

# Summary statistics and histogram of the box areas
square_array = np.array(area_list)
square_max = np.max(square_array)
square_min = np.min(square_array)
square_mean = np.mean(square_array)
square_var = np.var(square_array)
plt.figure(1)
plt.hist(square_array, 20)
plt.xlabel('Area in pixels')
plt.ylabel('Frequency of area')
plt.title('Area\n'
          + 'max=' + str(square_max) + ', min=' + str(square_min) + '\n'
          + 'mean=' + str(int(square_mean)) + ', var=' + str(int(square_var))
          )

# Summary statistics and histogram of the aspect ratios (height / width)
ratio_array = np.array(ratio_list)
ratio_max = np.max(ratio_array)
ratio_min = np.min(ratio_array)
ratio_mean = np.mean(ratio_array)
ratio_var = np.var(ratio_array)
plt.figure(2)
plt.hist(ratio_array, 20)
plt.xlabel('Ratio of height / width')
plt.ylabel('Frequency of ratio')
plt.title('Ratio\n'
          + 'max=' + str(round(ratio_max, 2)) + ', min=' + str(round(ratio_min, 2)) + '\n'
          + 'mean=' + str(round(ratio_mean, 2)) + ', var=' + str(round(ratio_var, 2))
          )
# Needed to display the figures when the script is run outside an interactive IDE
plt.show()
3. Results and discussion
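Let's run it and see the result~
The title of each generated figure includes the max, min, mean, and variance, and the script thoughtfully comes with a progress bar too~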
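One possible way to turn these histograms into anchor settings is to read off a few percentiles of the box scale (the square root of the area) and of the aspect ratio. This is only a sketch under assumptions: `suggest_anchors` is a hypothetical helper, the choice of the 10th, 50th, and 90th percentiles is arbitrary, and the sample numbers are made up for illustration rather than taken from a real dataset.

```python
import numpy as np


def suggest_anchors(area_list, ratio_list, percentiles=(10, 50, 90)):
    """Return candidate anchor scales and aspect ratios from box statistics."""
    # Scale is the side of a square with the same area as the box
    scales = np.sqrt(np.asarray(area_list, dtype=float))
    ratios = np.asarray(ratio_list, dtype=float)
    return (np.percentile(scales, percentiles),
            np.percentile(ratios, percentiles))


# Made-up values standing in for area_list and ratio_list
areas = [400, 900, 1600, 2500, 3600]
ratios = [0.5, 0.8, 1.0, 1.25, 2.0]
scales, aspect = suggest_anchors(areas, ratios)
print("candidate scales:", np.round(scales, 1))
print("candidate ratios:", np.round(aspect, 2))
```

With real data, the `area_list` and `ratio_list` built by the script would be passed in instead, and the resulting values treated as a starting point to tune rather than a final answer.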
4. Conclusion
Feels like I'm starting to get the hang of Python~