Theory
In Caffe, SoftmaxWithLoss is actually the combination of two layers:
softmaxWithLoss = Multinomial Logistic Loss Layer + Softmax Layer
Its core formula is:

$$L = -\log \hat{p}_k = -\left( z_k - m - \log \sum_j e^{z_j - m} \right), \qquad \hat{p}_k = \frac{e^{z_k - m}}{\sum_j e^{z_j - m}}$$

where $\hat{y}$ is the label value, $k$ is the neuron corresponding to the input image's label, $z_j$ are the layer's inputs, and $m = \max_j z_j$ is the maximum of the outputs, subtracted purely for numerical stability.
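To make the forward computation concrete, here is a minimal standalone C++ sketch of the formula above. This is an illustration for a single sample, not Caffe's actual implementation (which operates on blobs over a whole batch and spatial locations):

#include <algorithm>
#include <cmath>
#include <vector>

// Numerically stable softmax loss for one sample:
// L = -log( exp(z_k - m) / sum_j exp(z_j - m) ), with m = max_j z_j.
double SoftmaxLoss(const std::vector<double>& z, int k) {
  double m = *std::max_element(z.begin(), z.end());  // stabilizer m
  double sum = 0.0;
  for (double zj : z) sum += std::exp(zj - m);
  return -(z[k] - m - std::log(sum));
}

Subtracting m leaves the result mathematically unchanged but keeps every exponent non-positive, so std::exp never overflows.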
During back-propagation, differentiating the loss with respect to the input $z_j$ gives:

$$\frac{\partial L}{\partial z_j} = \hat{p}_j - \mathbf{1}\{j = k\}$$

i.e., the softmax probability minus the one-hot encoding of the label.
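A matching sketch of the backward pass, under the same assumptions (and the same includes) as the snippet above:

// Gradient of the loss w.r.t. the inputs z:
// dL/dz_j = p_j - 1{j == k}, where p_j is the softmax probability.
std::vector<double> SoftmaxLossGrad(const std::vector<double>& z, int k) {
  double m = *std::max_element(z.begin(), z.end());
  double sum = 0.0;
  for (double zj : z) sum += std::exp(zj - m);
  std::vector<double> grad(z.size());
  for (int j = 0; j < static_cast<int>(z.size()); ++j) {
    grad[j] = std::exp(z[j] - m) / sum - (j == k ? 1.0 : 0.0);
  }
  return grad;
}

This is why fusing the two layers is numerically preferable to chaining SoftmaxLayer and MultinomialLogisticLossLayer: the combined gradient is simply "probability minus one-hot", with no division by a possibly tiny probability.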
Usage in Caffe
First, the layer is used in a prototxt as follows:
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8"
  bottom: "label"
  top: "loss"
}
The parameters of the SoftmaxWithLoss layer are defined in caffe.proto as follows:
// Message that stores parameters shared by loss layers
message LossParameter {
  // If specified, ignore instances with the given label.
  // (i.e., samples carrying this label are skipped)
  optional int32 ignore_label = 1;
  // How to normalize the loss for loss layers that aggregate across batches,
  // spatial dimensions, or other dimensions. Currently only implemented in
  // SoftmaxWithLoss and SigmoidCrossEntropyLoss layers.
  enum NormalizationMode {
    // Divide by the number of examples in the batch times spatial dimensions.
    // Outputs that receive the ignore label will NOT be ignored in computing
    // the normalization factor.
    // (i.e., the loss of one forward pass is divided by the total label count)
    FULL = 0;
    // Divide by the total number of output locations that do not take the
    // ignore_label. If ignore_label is not set, this behaves like FULL.
    // (i.e., the loss of one forward pass is divided by the number of valid labels)
    VALID = 1;
    // Divide by the batch size.
    BATCH_SIZE = 2;
    // Do not normalize the loss.
    NONE = 3;
  }
  // For historical reasons, the default normalization for
  // SigmoidCrossEntropyLoss is BATCH_SIZE and *not* VALID.
  optional NormalizationMode normalization = 3 [default = VALID];
  // Deprecated. Ignored if normalization is specified. If normalization
  // is not specified, then setting this to false will be equivalent to
  // normalization = BATCH_SIZE to be consistent with previous behavior.
  // (i.e., normalize == false => normalization = BATCH_SIZE;
  //       normalize == true  => normalization = VALID)
  optional bool normalize = 2;
}
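These parameters are set through the layer's loss_param field. For example, extending the prototxt above (ignore_label: 255 is just a common choice in segmentation tasks, not a required value):

layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8"
  bottom: "label"
  top: "loss"
  loss_param {
    ignore_label: 255
    normalization: VALID
  }
}

With VALID normalization, pixels labeled 255 contribute neither to the loss nor to the normalization factor.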
Next, let's look at the SoftmaxWithLoss header file:
#ifndef CAFFE_SOFTMAX_WITH_LOSS_LAYER_HPP_
#define CAFFE_SOFTMAX_WITH_LOSS_LAYER_HPP_
#include <vector>
#include "caffe/blob.hpp"
#include "caffe/layer.hpp"
#include "caffe/proto/caffe.pb.h"
#include "caffe/layers/loss_layer.hpp"
#include "caffe/layers/softmax_layer.hpp"
namespace caffe {
/**
* @brief Computes the multinomial logistic loss for a one-of-many
* classification task, passing real-valued predictions through a
* softmax to get a probability distribution over classes.
*
* This layer should be preferred over separate
* SoftmaxLayer + MultinomialLogisticLossLayer
* as its gradient computation is more numerically stable.
* At test time, this layer can be replaced simply by a SoftmaxLayer.
*
* @param bottom input Blob vector (length 2)
* -# @f$ (N \times C \times H \times W) @f$
* the predictions @f$ x @f$, a Blob with values in
* @f$ [-\infty, +\infty] @f$ indicating the predicted score for each of
* the @f$ K = CHW @f$ classes. This layer maps these scores to a
* probability distribution over classes using the softmax function
* @f$ \hat{p}_{nk} = \exp(x_{nk}) /
* \left[\sum_{k'} \exp(x_{nk'})\right] @f$ (see SoftmaxLayer).
* -# @f$ (N \times 1 \times 1 \times 1) @f$
* the labels @f$ l @f$, an integer-valued Blob with values
* @f$ l_n \in [0, 1, 2, ..., K - 1] @f$
*