Object Detection: Reproducing the YOLOv2 Object Detection Project in PyTorch
- YOLOv2 improvements:
- 1. Use k-means clustering to obtain the widths and heights of the prior (anchor) boxes.
- 2. Introduce anchors. The output changes from BS×7×7×30 (2 boxes × 5 values + 20 classes per cell) to BS×13×13×5 boxes×(4+1+cls).
- 3. Higher input resolution: the classifier is fine-tuned at 448×448 instead of 224×224.
- 4. Use Darknet-19 as the backbone network.
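- 5. Feature fusion (passthrough): BS×26×26×512 → BS×13×13×256, then concatenated with BS×13×13×1024 → BS×13×13×1280. (A plain stride-2 space-to-depth on 512 channels would give 2048; the 256 here corresponds to first reducing to 64 channels with a 1×1 conv, as in the Darknet config.)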
- 6. Add batch normalization: remove the fc layers and dropout, and follow each conv with BN + (Leaky) ReLU, which improves mAP by about 2%.
- 7. Predict the box center as an offset relative to its grid cell, so the decoding function changes accordingly.
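- 8. Multi-scale training: the input size is drawn from [320, 352, 384, 416, 448, 480, 512, 544, 576, 608] (multiples of 32) and re-sampled every 10 batches, as in the YOLOv2 paper.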
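Item 1 in the list above can be illustrated with a small sketch. It is a minimal, framework-free version, assuming the boxes are given as (width, height) pairs and using the distance d = 1 − IoU from the YOLOv2 paper. The names `iou_wh` and `kmeans_anchors` and the toy box values are hypothetical and not taken from this project.

```python
import random


def iou_wh(box, centroid):
    """IoU of two boxes given only as (w, h), aligned at the same corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (w, h) pairs with distance 1 - IoU and return k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)
    for _ in range(iters):
        # Assign each box to the centroid with the highest IoU
        # (equivalently, the smallest 1 - IoU distance).
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, centroids[i]))
            clusters[best].append(box)
        # Recompute each centroid as the mean (w, h) of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                w = sum(b[0] for b in members) / len(members)
                h = sum(b[1] for b in members) / len(members)
                new_centroids.append((w, h))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster as is
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return sorted(centroids)


# Toy data: widths and heights in grid-cell units (hypothetical values).
boxes = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
         (4.0, 6.0), (4.2, 5.8), (3.8, 6.2)]
anchors = kmeans_anchors(boxes, k=2)
print(anchors)
```

On real data, `boxes` would hold the ground-truth widths and heights of the training set, and `k=5` would match the 5 boxes per cell mentioned in item 2.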
YOLOv2 loss function
Code:
import tensorflow as tf
from keras import backend as K

# Note: this loss is written against the Keras backend (K) and TensorFlow 1.x.
# yolo_head() is assumed to be defined elsewhere in the same module (not shown).


def yolo_loss(args,
              anchors,
              num_classes,
              rescore_confidence=False,
              print_loss=False):
    """YOLO localization loss function.

    Parameters
    ----------
    args : tuple
        (yolo_output, true_boxes, detectors_mask, matching_true_boxes),
        each described below.
    yolo_output : tensor
        Final convolutional layer features.
    true_boxes : tensor
        Ground truth boxes tensor with shape [batch, num_true_boxes, 5]
        containing box x_center, y_center, width, height, and class.
    detectors_mask : array
        0/1 mask for detector positions where there is a matching ground truth.
    matching_true_boxes : array
        Corresponding ground truth boxes for positive detector positions.
        Already adjusted for conv height and width.
    anchors : tensor
        Anchor boxes for model.
    num_classes : int
        Number of object classes.
    rescore_confidence : bool, default=False
        If true then set confidence target to IOU of best predicted box with
        the closest matching ground truth box.
    print_loss : bool, default=False
        If True then use a tf.Print() to print the loss components.

    Returns
    -------
    total_loss : tensor
        0.5 * (confidence + classification + coordinate loss), each summed
        over the minibatch.
    """
    (yolo_output, true_boxes, detectors_mask, matching_true_boxes) = args
    num_anchors = len(anchors)
    object_scale = 5
    no_object_scale = 1
    class_scale = 1
    coordinates_scale = 1
    pred_xy, pred_wh, pred_confidence, pred_class_prob = yolo_head(
        yolo_output, anchors, num_classes)

    # Unadjusted box predictions for loss.
    # TODO: Remove extra computation shared with yolo_head.
    yolo_output_shape = K.shape(yolo_output)
    feats = K.reshape(yolo_output, [
        -1, yolo_output_shape[1], yolo_output_shape[2], num_anchors,
        num_classes + 5
    ])
    pred_boxes = K.concatenate(
        (K.sigmoid(feats[..., 0:2]), feats[..., 2:4]), axis=-1)

    # TODO: Adjust predictions by image width/height for non-square images?
    # IOUs may be off due to different aspect ratio.

    # Expand pred x,y,w,h to allow comparison with ground truth.
    # batch, conv_height, conv_width, num_anchors, num_true_boxes, box_params
    pred_xy = K.expand_dims(pred_xy, 4)
    pred_wh = K.expand_dims(pred_wh, 4)

    pred_wh_half = pred_wh / 2.
    pred_mins = pred_xy - pred_wh_half
    pred_maxes = pred_xy + pred_wh_half

    true_boxes_shape = K.shape(true_boxes)

    # batch, conv_height, conv_width, num_anchors, num_true_boxes, box_params
    true_boxes = K.reshape(true_boxes, [
        true_boxes_shape[0], 1, 1, 1, true_boxes_shape[1], true_boxes_shape[2]
    ])
    true_xy = true_boxes[..., 0:2]
    true_wh = true_boxes[..., 2:4]

    # Find IOU of each predicted box with each ground truth box.
    true_wh_half = true_wh / 2.
    true_mins = true_xy - true_wh_half
    true_maxes = true_xy + true_wh_half

    intersect_mins = K.maximum(pred_mins, true_mins)
    intersect_maxes = K.minimum(pred_maxes, true_maxes)
    intersect_wh = K.maximum(intersect_maxes - intersect_mins, 0.)
    intersect_areas = intersect_wh[..., 0] * intersect_wh[..., 1]

    pred_areas = pred_wh[..., 0] * pred_wh[..., 1]
    true_areas = true_wh[..., 0] * true_wh[..., 1]

    union_areas = pred_areas + true_areas - intersect_areas
    iou_scores = intersect_areas / union_areas

    # Best IOUs for each location.
    best_ious = K.max(iou_scores, axis=4)  # Best IOU scores.
    best_ious = K.expand_dims(best_ious)

    # A detector has found an object if IOU > thresh for some true box.
    object_detections = K.cast(best_ious > 0.6, K.dtype(best_ious))

    # TODO: Darknet region training includes extra coordinate loss for early
    # training steps to encourage predictions to match anchor priors.

    # Determine confidence weights from object and no_object weights.
    # NOTE: YOLO does not use binary cross-entropy here.
    # Loss for positions where no object is present.
    no_object_weights = (no_object_scale * (1 - object_detections) *
                         (1 - detectors_mask))
    no_objects_loss = no_object_weights * K.square(-pred_confidence)

    if rescore_confidence:
        objects_loss = (object_scale * detectors_mask *
                        K.square(best_ious - pred_confidence))
    else:
        objects_loss = (object_scale * detectors_mask *
                        K.square(1 - pred_confidence))
    confidence_loss = objects_loss + no_objects_loss

    # Classification loss for matching detections.
    # NOTE: YOLO does not use categorical cross-entropy loss here.
    matching_classes = K.cast(matching_true_boxes[..., 4], 'int32')
    matching_classes = K.one_hot(matching_classes, num_classes)
    classification_loss = (class_scale * detectors_mask *
                           K.square(matching_classes - pred_class_prob))

    # Coordinate loss for matching detection boxes.
    matching_boxes = matching_true_boxes[..., 0:4]
    coordinates_loss = (coordinates_scale * detectors_mask *
                        K.square(matching_boxes - pred_boxes))

    confidence_loss_sum = K.sum(confidence_loss)
    classification_loss_sum = K.sum(classification_loss)
    coordinates_loss_sum = K.sum(coordinates_loss)
    total_loss = 0.5 * (
        confidence_loss_sum + classification_loss_sum + coordinates_loss_sum)
    if print_loss:
        # tf.Print is a TensorFlow 1.x API.
        total_loss = tf.Print(
            total_loss, [
                total_loss, confidence_loss_sum, classification_loss_sum,
                coordinates_loss_sum
            ],
            message='yolo_loss, conf_loss, class_loss, box_coord_loss:')

    return total_loss
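The IoU step in the loss above is easier to follow on plain numbers. The following is a minimal, framework-free sketch of the same min/max logic for boxes in (x_center, y_center, width, height) format, followed by the 0.6 threshold used for `object_detections`. The helper names `iou_xywh` and `best_ious` and the sample boxes are hypothetical; in the loss itself the same computation is broadcast over all grid cells, anchors, and ground-truth boxes at once.

```python
def iou_xywh(box_a, box_b):
    """IoU of two boxes given as (x_center, y_center, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Corner coordinates, like pred_mins / pred_maxes in the loss.
    a_min = (ax - aw / 2.0, ay - ah / 2.0)
    a_max = (ax + aw / 2.0, ay + ah / 2.0)
    b_min = (bx - bw / 2.0, by - bh / 2.0)
    b_max = (bx + bw / 2.0, by + bh / 2.0)
    # Intersection rectangle, clamped at 0 when the boxes do not overlap.
    inter_w = max(min(a_max[0], b_max[0]) - max(a_min[0], b_min[0]), 0.0)
    inter_h = max(min(a_max[1], b_max[1]) - max(a_min[1], b_min[1]), 0.0)
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union


def best_ious(pred_boxes, true_boxes):
    """For each predicted box, the highest IoU over all ground-truth boxes."""
    return [max(iou_xywh(p, t) for t in true_boxes) for p in pred_boxes]


# Hypothetical boxes in grid-cell units.
preds = [(2.0, 2.0, 2.0, 2.0), (10.0, 10.0, 2.0, 2.0)]
truths = [(2.0, 2.0, 2.0, 3.0), (6.0, 6.0, 1.0, 1.0)]

scores = best_ious(preds, truths)
# Same threshold as object_detections in the loss above.
detected = [s > 0.6 for s in scores]
print(scores, detected)
```

A prediction whose best IoU exceeds 0.6 is excluded from the no-object confidence penalty even if it is not the detector assigned to that ground truth, which is what the `(1 - object_detections)` factor expresses.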
YOLOv2 output data
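To make the output layout concrete, here is a minimal sketch of decoding the raw values of one anchor in one grid cell, following the YOLOv2 formulas b_x = σ(t_x) + c_x and b_w = p_w·e^{t_w}. It assumes the per-anchor vector is ordered [tx, ty, tw, th, to, class scores...], that anchors are expressed in grid-cell units, and that class scores go through a softmax. The function name `decode_cell` and the sample values are hypothetical and not taken from this project.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_cell(raw, col, row, anchor_wh, grid_size):
    """Decode one anchor's raw output [tx, ty, tw, th, to, class scores...].

    Returns the box as (x, y, w, h) normalized to [0, 1], the objectness
    confidence, and the class probabilities.
    """
    tx, ty, tw, th, to = raw[:5]
    # Center: the sigmoid keeps the offset inside the cell at (col, row).
    bx = (sigmoid(tx) + col) / grid_size
    by = (sigmoid(ty) + row) / grid_size
    # Size: the anchor (in grid-cell units) scaled by exp of the raw value.
    bw = anchor_wh[0] * math.exp(tw) / grid_size
    bh = anchor_wh[1] * math.exp(th) / grid_size
    confidence = sigmoid(to)
    # Numerically stable softmax over the class scores.
    scores = raw[5:]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    class_probs = [e / total for e in exps]
    return (bx, by, bw, bh), confidence, class_probs


# One anchor in the center cell of a 13x13 grid, two classes (toy values).
raw = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0]
box, conf, probs = decode_cell(raw, col=6, row=6,
                               anchor_wh=(2.0, 3.0), grid_size=13)
print(box, conf, probs)
```

With 5 anchors and `cls` classes, each grid cell carries 5×(4+1+cls) such values, which matches the BS×13×13×5 boxes×(4+1+cls) shape listed among the improvements.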