YOLOv5 Code Walkthrough: Class Weights and Image Weights
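- Abstract: When the classes in the training images do not occur equally often, we can change the class weights and thereby change the image weights. The data is then resampled according to the image weights, which matters most on datasets with imbalanced classes.
When training YOLOv5 on your own dataset, the number of labels per class is almost never balanced. To reduce the impact of this imbalance during training, YOLOv5 provides class weights and image weights.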
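Class weights
Each image contains objects of several classes, so once the class weights change, the weight of the image changes with them. By default every image weight is 1.
The image weight is the sum of the class weights of all labeled objects in the image.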
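The sum described above can be sketched in plain Python. This is only an illustration, not the YOLOv5 source: the function name image_weights_from_labels is hypothetical, and each image is assumed to be represented simply as a list of class ids, one per labeled object.

```python
def image_weights_from_labels(labels, class_weights):
    """Return one weight per image.

    labels: list of images, each a list of class ids (one id per labeled object).
    class_weights: list with one float per class.
    """
    weights = []
    for class_ids in labels:
        # Add up the class weight of every labeled object in this image.
        weights.append(sum((class_weights[c] for c in class_ids), 0.0))
    return weights


# Toy data: three images and three classes (values chosen for illustration).
labels = [[0, 0, 1], [2], []]
class_weights = [0.25, 0.25, 0.5]
image_weights = image_weights_from_labels(labels, class_weights)
print(image_weights)  # [0.75, 0.5, 0.0]
```

Note that an image without any labels ends up with weight 0 under this rule, so a weighted sampler would never pick it.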
model.class_weights = labels_to_class_weights(dataset.labels, nc).to(device)
import numpy as np
import torch

def labels_to_class_weights(labels, nc=80):
    # Get class weights (inverse frequency) from training labels
    if labels[0] is None:  # no labels loaded
        return torch.Tensor()
    labels = np.concatenate(labels, 0)  # labels.shape = (866643, 5) for COCO
    classes = labels[:, 0].astype(int)  # labels = [class xywh]; np.int was removed in NumPy 1.24
    weights = np.bincount(classes, minlength=nc)  # occurrences per class (counts for class ids 0-25 in this example)
    # Prepend gridpoint count (for uCE training)
    # gpi = ((320 / 32 * np.array([1, 2, 4])) ** 2 * 3).sum()  # gridpoints per image
    # weights = np.hstack([gpi * len(labels) - weights.sum() * 9, weights * 9]) ** 0.5  # prepend gridpoints to start
    weights[weights == 0] = 1  # replace empty bins with 1: classes that never occur are counted once
    weights = 1 / weights  # inverse frequency: rarer classes get larger weights
    weights /= weights.sum()  # normalize so the weights sum to 1
    return torch.from_numpy(weights)
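To make the inverse-frequency idea concrete, here is a small dependency-free sketch that follows the same steps (count, replace zeros with 1, invert, normalize) on toy data. The name class_weights_from_labels and the list-of-class-ids input format are assumptions made for this example, not part of YOLOv5.

```python
from collections import Counter


def class_weights_from_labels(labels, nc):
    """Inverse-frequency class weights, normalized to sum to 1.

    labels: list of images, each a list of class ids.
    nc: number of classes.
    """
    counts = Counter(c for class_ids in labels for c in class_ids)
    # Classes that never occur are counted once to avoid division by zero.
    freq = [max(counts.get(c, 0), 1) for c in range(nc)]
    inverse = [1.0 / f for f in freq]
    total = sum(inverse)
    return [w / total for w in inverse]


# Class 0 appears 4 times, class 1 twice, class 2 never.
labels = [[0, 0, 1], [0, 1], [0]]
weights = class_weights_from_labels(labels, nc=3)
print([round(w, 3) for w in weights])  # [0.143, 0.286, 0.571]
```

The most frequent class gets the smallest weight. A class that never occurs is treated as if it occurred once, so it receives the largest weight of all.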
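Image weights
During training, when the --image-weights option (opt.image_weights) is True, a sampling weight is computed for every image: the larger an image's weight, the more likely that image is to be sampled. The subsequent pass over the images then follows the resampled indices stored in dataset.indices.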
parser.add_argument('--image-weights', action='store_true', default=True, help='use weighted image selection for training')  # enable image weights; with default=True the option is always on
if opt.image_weights:
    # Generate indices
    if rank in [-1, 0]:
        cw0 = model.class_weights.cpu().numpy()  # e.g. [0.64486, 0.12426, 0.23088]
        cw = cw0 * (1 - maps) ** 2  # class weights
        iw = labels_to_image_weights(dataset.labels, nc=nc, class_weights=cw)  # image weights
        # Example of the resulting indices:
        # [7, 49, 44, 14, 29, 26, 38, 46, 5, 1, 48, 25, 44, 0, 26, 42, 13, 54,
        #  52, 1, 1, 31, 54, 22, 12, 24, 1, 12, 25, 29, 13, 13, 12, 26, 17, 1, 48, 32, 37,
        #  10, 57, 50, 6, 19, 42, 41, 54, 24, 48, 39, 17, 34, 51, 49, 29, 34, 1, 14]
        dataset.indices = random.choices(range(dataset.n), weights=iw, k=dataset.n)  # rand weighted idx
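The resampling step can be tried in isolation with the standard library. The sketch below is illustrative only: it assumes that maps holds a per-class mAP value in [0, 1], as the variable name suggests, and the helper resample_indices, the toy labels, and all numbers are hypothetical.

```python
import random


def resample_indices(image_weights, seed=None):
    """Draw len(image_weights) indices with replacement, proportional to the weights."""
    rng = random.Random(seed)
    n = len(image_weights)
    return rng.choices(range(n), weights=image_weights, k=n)


class_weights = [0.25, 0.25, 0.5]  # hypothetical inverse-frequency weights
maps = [0.9, 0.5, 0.1]             # hypothetical per-class mAP values

# Classes the model already detects well (high mAP) are down-weighted.
cw = [w * (1 - m) ** 2 for w, m in zip(class_weights, maps)]

# Image weight = sum of the class weights of its labeled objects.
labels = [[0, 0, 1], [2], []]
iw = [sum((cw[c] for c in ids), 0.0) for ids in labels]

indices = resample_indices(iw, seed=0)
print(indices)
```

Because sampling is done with replacement, some images appear several times in indices while others may be missing. In this toy data the second image, which contains the rare and poorly detected class 2, has the largest weight and is drawn most often, while the unlabeled third image has weight 0.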