Study notes: a brief look at the source of multinomial_logistic_loss_layer.cpp

smallplum123

Article link: https://blog.csdn.net/smallplum123/article/details/72330796

  MultinomialLogisticLossLayer 

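  Log loss function: L = -log(P(Y|X)) 

The loss used with softmax is this log loss evaluated at the probability that softmax assigns to the true class y, where x_1, ..., x_K are the raw class scores:

$$L = -\log p_y, \qquad p_y = \frac{e^{x_y}}{\sum_{j=1}^{K} e^{x_j}}$$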

  In a classification problem, when the previous layer already outputs a probability for each class, MultinomialLogisticLossLayer can be used to compute the loss. 
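As a small illustration of what "a probability for each class" could look like in practice, the sketch below turns one row of raw scores into probabilities with softmax. The function name `softmax_row` is hypothetical and is not Caffe's SoftmaxLayer; subtracting the maximum score is a common stability trick assumed here, not something taken from the source file.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Caffe): converts one row of raw class
// scores into probabilities that sum to 1.
std::vector<double> softmax_row(const std::vector<double>& scores) {
  assert(!scores.empty());
  // Subtract the maximum score first so that exp() cannot overflow.
  const double max_score = *std::max_element(scores.begin(), scores.end());
  std::vector<double> probs(scores.size());
  double sum = 0.0;
  for (std::size_t k = 0; k < scores.size(); ++k) {
    probs[k] = std::exp(scores[k] - max_score);
    sum += probs[k];
  }
  for (double& p : probs) p /= sum;  // normalize so the row sums to 1
  return probs;
}
```

A row produced this way is the kind of input the layer expects as bottom[0].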

  1. forward() 

template <typename Dtype>
void MultinomialLogisticLossLayer<Dtype>::Forward_cpu(
...
  Dtype loss = 0;
  for (int i = 0; i < num; ++i) {
    int label = static_cast<int>(bottom_label[i]);
    Dtype prob = std::max(
        bottom_data[i * dim + label], Dtype(kLOG_THRESHOLD));  // kLOG_THRESHOLD = 1e-20
    loss -= log(prob);  // accumulate over samples
  }
  top[0]->mutable_cpu_data()[0] = loss / num;  // then take the mean
}
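Because the excerpt above elides part of the function, here is a minimal standalone sketch of the same forward computation that can be compiled on its own. It assumes a row-major num x dim probability matrix and integer labels; the name `multinomial_logistic_loss` and the use of `std::vector` instead of Caffe blobs are illustrative choices, not Caffe's API.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical standalone helper, not part of Caffe.
// probs:  row-major num x dim matrix of class probabilities.
// labels: one class index per sample.
double multinomial_logistic_loss(const std::vector<double>& probs,
                                 const std::vector<int>& labels, int dim) {
  const double kLogThreshold = 1e-20;  // same clamp value as in the excerpt
  const int num = static_cast<int>(labels.size());
  assert(num > 0 && dim > 0);
  assert(probs.size() == static_cast<std::size_t>(num) * dim);
  double loss = 0.0;
  for (int i = 0; i < num; ++i) {
    const int label = labels[i];
    assert(label >= 0 && label < dim);
    // Clamp so that log() never sees a zero probability.
    const double prob = std::max(probs[i * dim + label], kLogThreshold);
    loss -= std::log(prob);  // accumulate -log(p) of the true class
  }
  return loss / num;  // mean over the batch
}
```

The clamp bounds the per-sample loss at -log(1e-20), roughly 46, rather than letting it become infinite when the true class has probability 0.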

  2. backward() 

template <typename Dtype>
void MultinomialLogisticLossLayer<Dtype>::Backward_cpu(
    const vector<Blob<Dtype>*>& top, const vector<bool>& propagate_down,
    const vector<Blob<Dtype>*>& bottom) {
...
    caffe_set(bottom[0]->count(), Dtype(0), bottom_diff);  // every bottom_diff entry stays 0 except the one at each sample's label
    const Dtype scale = - top[0]->cpu_diff()[0] / num;  // here top_diff = loss weight = 1, so scale = -1/N
    for (int i = 0; i < num; ++i) {
      int label = static_cast<int>(bottom_label[i]);
      Dtype prob = std::max(
          bottom_data[i * dim + label], Dtype(kLOG_THRESHOLD));
      bottom_diff[i * dim + label] = scale / prob;
    }
  }
}
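The backward pass can be sketched the same way, as a self-contained function. It assumes the same row-major layout as the excerpt; the name `multinomial_logistic_backward` and the `top_diff` parameter (standing in for `top[0]->cpu_diff()[0]`, defaulting to a loss weight of 1) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical standalone helper, not part of Caffe.
// Returns d(loss)/d(probs) as a row-major num x dim vector.
std::vector<double> multinomial_logistic_backward(
    const std::vector<double>& probs, const std::vector<int>& labels,
    int dim, double top_diff = 1.0) {
  const double kLogThreshold = 1e-20;  // same clamp value as in the excerpt
  const int num = static_cast<int>(labels.size());
  assert(num > 0 && dim > 0);
  assert(probs.size() == static_cast<std::size_t>(num) * dim);
  // Start from all zeros: only the true-class entries receive a gradient.
  std::vector<double> grad(probs.size(), 0.0);
  const double scale = -top_diff / num;
  for (int i = 0; i < num; ++i) {
    const int label = labels[i];
    assert(label >= 0 && label < dim);
    const double prob = std::max(probs[i * dim + label], kLogThreshold);
    grad[i * dim + label] = scale / prob;  // d(-log p / N)/dp = -1/(N p)
  }
  return grad;
}
```

For two samples with true-class probabilities 0.5 and 0.75, the only nonzero entries are -1/(2 * 0.5) = -1 and -1/(2 * 0.75), about -0.667.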

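  Let the input be a and the output be z. Read off from the code above (with loss weight 1), the forward and backward formulas are, respectively: 

  $$z = -\frac{1}{N}\sum_{n=1}^{N}\log a_{n,\,l_n}$$

  $$\frac{\partial z}{\partial a_{n,k}} = \begin{cases} -\dfrac{1}{N\,a_{n,\,l_n}}, & k = l_n \\ 0, & \text{otherwise} \end{cases}$$

  where there are N samples and K classes, l_n is the label of sample n, and a_{n,k} is the input probability of class k for sample n. In the code, a_{n,l_n} is additionally clamped from below at kLOG_THRESHOLD. 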
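One way to convince yourself that the backward formula matches the forward one is a finite-difference check. The sketch below is only an illustration under stated assumptions: the helper names (`check_forward`, `check_backward`, `max_gradient_gap`) are hypothetical, the clamp is omitted because every true-class probability is assumed to be far above the threshold, and each input entry is perturbed independently, the same way the layer treats its inputs.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Forward formula: z = -(1/N) * sum_n log(a[n][l_n]).
double check_forward(const std::vector<double>& a,
                     const std::vector<int>& labels, int K) {
  const int N = static_cast<int>(labels.size());
  double z = 0.0;
  for (int n = 0; n < N; ++n) z -= std::log(a[n * K + labels[n]]);
  return z / N;
}

// Backward formula: dz/da[n][k] = -1 / (N * a[n][l_n]) if k == l_n, else 0.
double check_backward(const std::vector<double>& a,
                      const std::vector<int>& labels, int K, int n, int k) {
  const int N = static_cast<int>(labels.size());
  return k == labels[n] ? -1.0 / (N * a[n * K + k]) : 0.0;
}

// Largest absolute gap between the analytic gradient and a central
// finite difference, taken over every entry of a.
double max_gradient_gap(std::vector<double> a, const std::vector<int>& labels,
                        int K, double eps = 1e-6) {
  const int N = static_cast<int>(labels.size());
  assert(a.size() == static_cast<std::size_t>(N) * K);
  double worst = 0.0;
  for (int n = 0; n < N; ++n) {
    for (int k = 0; k < K; ++k) {
      const std::size_t idx = static_cast<std::size_t>(n) * K + k;
      const double saved = a[idx];
      a[idx] = saved + eps;
      const double z_plus = check_forward(a, labels, K);
      a[idx] = saved - eps;
      const double z_minus = check_forward(a, labels, K);
      a[idx] = saved;  // restore before moving to the next entry
      const double numeric = (z_plus - z_minus) / (2.0 * eps);
      const double analytic = check_backward(a, labels, K, n, k);
      worst = std::max(worst, std::fabs(numeric - analytic));
    }
  }
  return worst;
}
```

With N = 2 samples and K = 3 classes, for example a = {0.2, 0.3, 0.5, 0.6, 0.3, 0.1} and labels {2, 0}, the returned gap should be tiny compared with the gradient entries themselves.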
