faster-rcnn code reading notes (1)


cititude


Article link: https://blog.csdn.net/qq_40553793/article/details/105039129


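Reading VOC data

How the data is read: ImageSets/Main/mode.txt lists the IDs of the samples that belong to the given mode (split). For each ID, the bbox annotation is read from Annotations/id.xml and the image from JPEGImages/id.jpg.
[id.strip() for id in open(file)] gives the list of IDs.
Reading the XML file: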

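import xml.etree.ElementTree as ET

# file_path points to Annotations/id.xml; x and y are tag names such as 'object' and 'name'
filex = ET.parse(file_path)
for obj in filex.findall(x):
    # normalize the tag text: lower case, no surrounding whitespace
    print(obj.find(y).text.lower().strip())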
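The lookup from mode to IDs to file paths can be sketched as follows. This is only a sketch: the helper names `read_ids` and `sample_paths`, the split name `trainval` and the two IDs are made up for illustration, and a throwaway directory stands in for the real dataset.

```python
import os
import tempfile

def read_ids(data_dir, mode):
    """Return the sample IDs listed in ImageSets/Main/<mode>.txt (hypothetical helper)."""
    list_file = os.path.join(data_dir, "ImageSets", "Main", mode + ".txt")
    with open(list_file) as f:
        # Skip blank lines so a trailing newline does not produce an empty ID.
        return [line.strip() for line in f if line.strip()]

def sample_paths(data_dir, id_):
    """Return (annotation path, image path) for one ID (hypothetical helper)."""
    return (os.path.join(data_dir, "Annotations", id_ + ".xml"),
            os.path.join(data_dir, "JPEGImages", id_ + ".jpg"))

# Throwaway directory with a made-up split file, so the sketch runs without VOC.
demo_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_dir, "ImageSets", "Main"))
with open(os.path.join(demo_dir, "ImageSets", "Main", "trainval.txt"), "w") as f:
    f.write("000005\n000007\n")

ids = read_ids(demo_dir, "trainval")
xml_path, jpg_path = sample_paths(demo_dir, ids[0])
```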

NumPy tricks

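np.stack(tuplex, axis)  # like unsqueezing every array at axis, then concatenating along it
arr.astype(np.int32)    # likewise np.float32 / np.uint8; astype is a method of the array
# Older PyTorch versions had no bool tensor type, so a bool array had to go through astype(np.uint8) first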
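A small check of the two notes above, with made-up arrays: `np.stack` gives the same result as adding a new axis to each input and concatenating, and a bool mask converts to `uint8` with `astype`.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
b = a + 10

# np.stack inserts a new axis at position 1 and joins the inputs along it.
stacked = np.stack((a, b), axis=1)  # shape (2, 2, 3)

# The same thing done by hand: expand_dims (the NumPy "unsqueeze"), then concatenate.
manual = np.concatenate((np.expand_dims(a, 1), np.expand_dims(b, 1)), axis=1)

# A bool mask turned into uint8 (True -> 1, False -> 0).
mask = a > 2
mask_u8 = mask.astype(np.uint8)
```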

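Fields that matter for a bbox:
– difficult: has to be taken into account when the flag is 1
– find each object first, then its name (mapped to an index through a manually built name list), then the 4-tuple of ymin and the other three coordinates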
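The fields above can be pulled out of an annotation roughly like this. The sketch rests on assumptions: `CLASS_NAMES` is a made-up two-class list, the XML string is a hand-written sample rather than a real VOC file, the (ymin, xmin, ymax, xmax) order is just one possible convention, and skipping difficult objects unless `use_difficult` is set is one possible policy, not necessarily the one used in the code being read.

```python
import xml.etree.ElementTree as ET

# Hypothetical class list; the index in this tuple becomes the label.
CLASS_NAMES = ("dog", "person")

# Hand-written sample annotation, not a real VOC file.
SAMPLE_XML = """
<annotation>
  <object>
    <name>Dog</name>
    <difficult>0</difficult>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object>
    <name>person</name>
    <difficult>1</difficult>
    <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>498</ymax></bndbox>
  </object>
</annotation>
""".strip()

def parse_objects(root, use_difficult=False):
    """Collect bboxes, labels and difficult flags from an annotation root."""
    bboxes, labels, difficult = [], [], []
    for obj in root.findall("object"):
        is_difficult = int(obj.find("difficult").text) == 1
        # Assumed policy: drop difficult objects unless explicitly requested.
        if is_difficult and not use_difficult:
            continue
        box = obj.find("bndbox")
        bboxes.append([int(box.find(tag).text)
                       for tag in ("ymin", "xmin", "ymax", "xmax")])
        name = obj.find("name").text.lower().strip()
        labels.append(CLASS_NAMES.index(name))
        difficult.append(is_difficult)
    return bboxes, labels, difficult

root = ET.fromstring(SAMPLE_XML)
bboxes, labels, difficult = parse_objects(root, use_difficult=True)
```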

Some utils

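nms
iou
loc2bbox and bbox2loc
A bbox and a transform (loc) are described differently: a bbox is given by its corner coordinates, a loc by offsets and scale factors relative to a source box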
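To make the difference between the two descriptions concrete, here is a minimal NumPy sketch of the conversion pair. It assumes boxes in (ymin, xmin, ymax, xmax) order and the usual Faster R-CNN parameterization (center offsets divided by the source size, log of the size ratio); the code being read may differ in details, and the sample boxes are made up.

```python
import numpy as np

def _center_form(bbox):
    """Return height, width, center y, center x of (ymin, xmin, ymax, xmax) boxes."""
    h = bbox[:, 2] - bbox[:, 0]
    w = bbox[:, 3] - bbox[:, 1]
    return h, w, bbox[:, 0] + 0.5 * h, bbox[:, 1] + 0.5 * w

def bbox2loc(src_bbox, dst_bbox):
    """Encode dst boxes relative to src boxes as (dy, dx, dh, dw)."""
    h, w, cy, cx = _center_form(src_bbox)
    dst_h, dst_w, dst_cy, dst_cx = _center_form(dst_bbox)
    dy = (dst_cy - cy) / h
    dx = (dst_cx - cx) / w
    dh = np.log(dst_h / h)
    dw = np.log(dst_w / w)
    return np.stack((dy, dx, dh, dw), axis=1)

def loc2bbox(src_bbox, loc):
    """Decode (dy, dx, dh, dw) back into boxes, the inverse of bbox2loc."""
    h, w, cy, cx = _center_form(src_bbox)
    new_cy = loc[:, 0] * h + cy
    new_cx = loc[:, 1] * w + cx
    new_h = np.exp(loc[:, 2]) * h
    new_w = np.exp(loc[:, 3]) * w
    return np.stack((new_cy - 0.5 * new_h, new_cx - 0.5 * new_w,
                     new_cy + 0.5 * new_h, new_cx + 0.5 * new_w), axis=1)

src = np.array([[0., 0., 16., 16.], [8., 8., 40., 24.]])
dst = np.array([[2., 4., 20., 28.], [8., 8., 40., 24.]])
loc = bbox2loc(src, dst)
restored = loc2bbox(src, loc)  # should reproduce dst
```

An identical source and target box encodes to an all-zero loc, which is why the second row of `loc` is zero here.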
Anchor generation

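Anchors are generated on the feature map and mapped back to the original image by scaling by a factor of 16. For each reference point (the top-left corner of each block), 9 anchors are produced, all centered on the center of that block. Note that when rescaling by the aspect ratio, the square root of the ratio has to be used for height and width, so that the anchor area stays the same.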
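The two steps can be sketched in NumPy as follows. The ratios (0.5, 1, 2) and scales (8, 16, 32) are the commonly used Faster R-CNN defaults and are an assumption here, as are the function names and the tiny 2x3 feature map; only the stride of 16 and the count of 9 anchors come from the notes.

```python
import numpy as np

def generate_anchor_base(base_size=16, ratios=(0.5, 1, 2), scales=(8, 16, 32)):
    """Return 9 anchors (ymin, xmin, ymax, xmax) centered on one 16x16 block."""
    cy = cx = base_size / 2.0
    anchors = []
    for r in ratios:
        for s in scales:
            # sqrt(r) on height and sqrt(1/r) on width keeps the area at (base_size*s)^2.
            h = base_size * s * np.sqrt(r)
            w = base_size * s * np.sqrt(1.0 / r)
            anchors.append([cy - h / 2, cx - w / 2, cy + h / 2, cx + w / 2])
    return np.array(anchors, dtype=np.float32)

def enumerate_anchors(anchor_base, feat_stride, height, width):
    """Shift the base anchors to every cell of a height x width feature map."""
    shift_y = np.arange(0, height * feat_stride, feat_stride)
    shift_x = np.arange(0, width * feat_stride, feat_stride)
    shift_x, shift_y = np.meshgrid(shift_x, shift_y)
    shifts = np.stack((shift_y.ravel(), shift_x.ravel(),
                       shift_y.ravel(), shift_x.ravel()), axis=1)  # (K, 4)
    # Broadcast (1, A, 4) + (K, 1, 4) -> (K, A, 4), then flatten to (K*A, 4).
    anchors = anchor_base[None, :, :] + shifts[:, None, :]
    return anchors.reshape(-1, 4).astype(np.float32)

base = generate_anchor_base()
anchors = enumerate_anchors(base, feat_stride=16, height=2, width=3)
```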

NumPy tricks

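# Note: unsqueeze, expand, argsort(descending=...), nonzero and numel are PyTorch tensor methods
# Expand (N, x) and (M, x) to (N, M, x); a has shape (N, x), b has shape (M, x)
a.unsqueeze(1).expand(N, M, x)
b.unsqueeze(0).expand(N, M, x)
# argsort returns the indices that sort the tensor t
idx = t.argsort(descending=False)
t = t[idx]
# Pick out the indices that satisfy a condition
idx = (t <= thresh).nonzero()
if idx.numel() == 0: pass  # nothing passed the threshold

# np.ravel() is the non-copying counterpart of flatten(): it returns a view when possible
# np.meshgrid: the first vector is repeated as rows; the second is turned into a column and then repeated

# Make clever use of unsqueeze together with the broadcast mechanism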
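The unsqueeze-plus-broadcast idea, together with argsort and mask indexing, is what the iou and nms utilities build on. Below is a NumPy sketch, where `[:, None]` plays the role of unsqueeze. It assumes boxes in (ymin, xmin, ymax, xmax) order and a plain greedy NMS; the boxes and scores are made up, and the code being read may implement this differently.

```python
import numpy as np

def pairwise_iou(a, b):
    """IoU between every box in a (N, 4) and every box in b (M, 4) -> (N, M)."""
    # Broadcast (N, 1, 2) against (1, M, 2) to get all pairs at once.
    tl = np.maximum(a[:, None, :2], b[None, :, :2])  # top-left of the overlap
    br = np.minimum(a[:, None, 2:], b[None, :, 2:])  # bottom-right of the overlap
    # Zero out pairs that do not overlap at all.
    inter = np.prod(br - tl, axis=2) * (tl < br).all(axis=2)
    area_a = np.prod(a[:, 2:] - a[:, :2], axis=1)
    area_b = np.prod(b[:, 2:] - b[:, :2], axis=1)
    return inter / (area_a[:, None] + area_b[None, :] - inter)

def nms(boxes, scores, thresh):
    """Greedy NMS: keep the best box, drop the ones overlapping it too much."""
    order = scores.argsort()[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        ious = pairwise_iou(boxes[i:i + 1], boxes[order[1:]])[0]
        # Boolean-mask indexing: only boxes with a small enough overlap survive.
        order = order[1:][ious <= thresh]
    return keep

boxes = np.array([[0., 0., 10., 10.], [1., 1., 11., 11.], [20., 20., 30., 30.]])
scores = np.array([0.9, 0.8, 0.7])
iou = pairwise_iou(boxes, boxes)
keep = nms(boxes, scores, thresh=0.5)
```

The first two boxes overlap with an IoU of 81/119, so at a threshold of 0.5 the lower-scoring one is suppressed.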
