公式
参数:两个输入bottom[0]、bottom[1],一个输出top[0]。
bottom[0]: N*C*1*1 预测值;
bottom[1]: N*1*1*1 真实值;
p: 范数的阶,可选 L1、L2 范数;
δ{l_n=k}: 示性函数,如果第 n 个样本的真实标签为 k,则取 +1,否则取 −1;
t_nk: bottom[0] 中第 n 个样本、第 k 类的预测值。

损失公式:
E = (1/N) · Σ_{n=1}^{N} Σ_{k=1}^{K} [ max(0, 1 − δ{l_n=k} · t_nk) ]^p
代码
(1)Forward
template <typename Dtype>
void HingeLossLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  // Hinge loss forward pass.
  // bottom[0]: N x C x 1 x 1 predicted scores t_nk.
  // bottom[1]: N x 1 x 1 x 1 ground-truth labels (integer class indices).
  // top[0]:    scalar loss E = (1/N) sum_n sum_k [max(0, 1 - delta*t_nk)]^p.
  const Dtype* bottom_data = bottom[0]->cpu_data();
  Dtype* bottom_diff = bottom[0]->mutable_cpu_diff();
  const Dtype* label = bottom[1]->cpu_data();
  const int num = bottom[0]->num();        // N: batch size
  const int count = bottom[0]->count();    // N * C: total score entries
  const int dim = count / num;             // C: number of classes

  // Stage the raw scores in the diff buffer; the hinge terms are computed
  // in place below and reused by Backward_cpu.
  caffe_copy(count, bottom_data, bottom_diff);

  // Flip the sign of the true-class score of each sample, i.e. fold in
  // delta_{l_n=k} (+1 for the true class, -1 otherwise).
  for (int n = 0; n < num; ++n) {
    const int gt = static_cast<int>(label[n]);
    bottom_diff[n * dim + gt] = -bottom_diff[n * dim + gt];
  }

  // hinge_nk = max(0, 1 - delta_{l_n=k} * t_nk), stored in bottom_diff.
  for (int n = 0; n < num; ++n) {
    for (int k = 0; k < dim; ++k) {
      Dtype* cell = bottom_diff + n * dim + k;
      *cell = std::max(Dtype(0), 1 + *cell);
    }
  }

  // Reduce the hinge terms according to the configured norm.
  Dtype* loss = top[0]->mutable_cpu_data();
  switch (this->layer_param_.hinge_loss_param().norm()) {
  case HingeLossParameter_Norm_L1:
    // L1: mean of |hinge| over the batch (hinge is already non-negative).
    loss[0] = caffe_cpu_asum(count, bottom_diff) / num;
    break;
  case HingeLossParameter_Norm_L2:
    // L2: mean of hinge^2 over the batch (dot product with itself).
    loss[0] = caffe_cpu_dot(count, bottom_diff, bottom_diff) / num;
    break;
  default:
    LOG(FATAL) << "Unknown Norm";
  }
}
(2)Backward
bottom[1] 是 label 的 groundtruth,不需要进行反向传播运算,只需要对 bottom[0] 进行反向传播运算。记 hinge = max(0, 1 − δ{l_n=k} · t_nk),反向传播即求损失 E 对 t_nk 的偏导:
L1 范数:∂E/∂t_nk = (1/N) · sign(hinge) · ∂hinge/∂t_nk
L2 范数:∂E/∂t_nk = (2/N) · hinge · ∂hinge/∂t_nk
其中:
∂hinge/∂t_nk = 0 (当 hinge = 0 时);∂hinge/∂t_nk = −δ{l_n=k} (当 hinge > 0 时)
template <typename Dtype>
void HingeLossLayer<Dtype>::Backward_cpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom) {
  // Hinge loss backward pass: compute dE/dt_nk into bottom[0]'s diff.
  // Gradients w.r.t. the label input are undefined, so reject them outright.
  if (propagate_down[1]) {
    LOG(FATAL) << this->type()
               << " Layer cannot backpropagate to label inputs.";
  }
  if (propagate_down[0]) {
    // bottom_diff still holds hinge_nk = max(0, 1 - delta*t_nk) as written
    // by Forward_cpu; the gradient is derived from it in place.
    Dtype* bottom_diff = bottom[0]->mutable_cpu_diff();
    const Dtype* label = bottom[1]->cpu_data();
    const int num = bottom[0]->num();
    const int count = bottom[0]->count();
    const int dim = count / num;

    // Fold in -delta_{l_n=k}: d(hinge)/d(t_nk) flips the sign for the
    // true class, so negate those entries again.
    for (int n = 0; n < num; ++n) {
      const int gt = static_cast<int>(label[n]);
      bottom_diff[n * dim + gt] = -bottom_diff[n * dim + gt];
    }

    // Chain in the loss weight dE_total/dE supplied by the layer above.
    const Dtype loss_weight = top[0]->cpu_diff()[0];
    switch (this->layer_param_.hinge_loss_param().norm()) {
    case HingeLossParameter_Norm_L1:
      // L1: gradient is sign(hinge) scaled by loss_weight / N.
      caffe_cpu_sign(count, bottom_diff, bottom_diff);
      caffe_scal(count, loss_weight / num, bottom_diff);
      break;
    case HingeLossParameter_Norm_L2:
      // L2: gradient is 2 * hinge scaled by loss_weight / N.
      caffe_scal(count, loss_weight * 2 / num, bottom_diff);
      break;
    default:
      LOG(FATAL) << "Unknown Norm";
    }
  }
}