Caffe Smooth_L1_Loss_Layer Q&A



rbg's answer: As sigma -> inf, the loss approaches L1 (abs) loss. Setting sigma = 3 makes the transition point from quadratic to linear happen at |x| = 1 / 3**2 (closer to the origin). The reason for doing this is that the RPN bbox regression targets are not normalized by their stdev (unlike in Fast R-CNN), because the statistics of the targets change constantly throughout learning. In a future update I may simply replace smooth L1 with (hard) L1, which I believe will likely work as well and be simpler (no sigma, etc.).
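
To make the role of sigma concrete, here is a minimal pure-Python sketch of the per-element loss. It assumes the piecewise form 0.5 * (sigma * x)**2 for |x| < 1 / sigma**2 and |x| - 0.5 / sigma**2 otherwise, which matches the transition point mentioned in the answer. The function name `smooth_l1` is only an illustration, not the layer's actual API.

```python
def smooth_l1(x, sigma=1.0):
    """Per-element smooth L1 loss with a sigma parameter (assumed form).

    Quadratic near the origin, linear elsewhere; the switch happens
    at |x| = 1 / sigma**2.
    """
    sigma2 = sigma * sigma
    ax = abs(x)
    if ax < 1.0 / sigma2:
        # Quadratic branch: 0.5 * (sigma * x)^2
        return 0.5 * sigma2 * x * x
    # Linear branch, shifted so both branches meet at the transition point
    return ax - 0.5 / sigma2


if __name__ == "__main__":
    for sigma in (1.0, 3.0, 1000.0):
        print(sigma, smooth_l1(0.05, sigma), smooth_l1(0.5, sigma))
```

With sigma = 3 the quadratic region shrinks to |x| < 1/9, and with a very large sigma the value is nearly |x| everywhere, which is the L1 limit described above.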

Q: Why does Smooth_L1_loss also backpropagate to the target box (bottom[1])?

rbg's answer: Smooth L1 loss can be used in cases where you do want to bprop to both inputs (e.g., in a “siamese” network). In the case of Fast R-CNN, we don’t need derivatives for the bbox regression labels, but the layer is more general than its use in Fast R-CNN.
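
A small sketch of why both inputs can receive gradients: the loss depends only on the difference d = bottom[0] - bottom[1], so the derivative with respect to the second input is the negative of the derivative with respect to the first. The code below assumes the same piecewise form with sigma as in the answer about the transition point, and it ignores loss weights and normalization. The names `smooth_l1_grad` and `backward_both` are hypothetical helpers, not Caffe functions.

```python
def smooth_l1_grad(d, sigma=1.0):
    """Derivative of the (assumed) smooth L1 loss with respect to d."""
    sigma2 = sigma * sigma
    if abs(d) < 1.0 / sigma2:
        return sigma2 * d          # slope of the quadratic branch
    return 1.0 if d > 0 else -1.0  # slope of the linear branch is sign(d)


def backward_both(pred, target, sigma=1.0):
    """Return gradients for both inputs of loss(pred - target)."""
    g = [smooth_l1_grad(p - t, sigma) for p, t in zip(pred, target)]
    grad_pred = g                    # d(loss)/d(pred)   = +f'(d)
    grad_target = [-v for v in g]    # d(loss)/d(target) = -f'(d)
    return grad_pred, grad_target


if __name__ == "__main__":
    print(backward_both([0.5, 0.0], [0.0, 0.05], sigma=3.0))
```

In a Fast R-CNN style setup the second gradient would simply go unused, since the regression targets are fixed labels; in a siamese-style network both branches could consume their gradient.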