Summary of the Caffe Deep Learning Framework - CSDN Blog

Article link: https://blog.csdn.net/CSUzhangpeike/article/details/136060642

Caffe

Core Modules

SyncedMemory
Blob
Layer
Net
Solver
IO
Multi-GPU

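SyncedMemory encapsulates data exchange between CPU memory and GPU memory.
Blob wraps the data being operated on, as well as the model parameters and their updates.
Layer represents a layer of the model.
Net represents the network model and holds an array of Blobs and an array of Layers.
Solver contains a Net. New update methods can be derived from it by inheritance.
IO handles data input and initialization.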
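The lazy synchronization that SyncedMemory provides can be sketched as a small state machine. This is a toy model, not Caffe's implementation: ToySyncedMemory is a hypothetical class, and the "GPU" buffer is just a second host vector so that the sketch runs without CUDA. Only the four head states mirror the names used in Caffe.

```cpp
#include <cstddef>
#include <vector>

// Toy model of lazy CPU/GPU synchronization (hypothetical class).
// The "GPU" side is a second host vector, an assumption made so this runs anywhere.
class ToySyncedMemory {
 public:
  enum Head { UNINITIALIZED, HEAD_AT_CPU, HEAD_AT_GPU, SYNCED };

  explicit ToySyncedMemory(std::size_t n) : cpu_(n, 0.f), gpu_(n, 0.f) {}

  // Read-only access: copy only if the other side holds newer data.
  const std::vector<float>& cpu_data() { to_cpu(); return cpu_; }
  const std::vector<float>& gpu_data() { to_gpu(); return gpu_; }

  // Mutable access: the caller may write, so this side becomes the head.
  std::vector<float>& mutable_cpu_data() { to_cpu(); head_ = HEAD_AT_CPU; return cpu_; }
  std::vector<float>& mutable_gpu_data() { to_gpu(); head_ = HEAD_AT_GPU; return gpu_; }

  Head head() const { return head_; }
  int copies() const { return copies_; }  // number of simulated transfers

 private:
  void to_cpu() {
    if (head_ == UNINITIALIZED) { head_ = HEAD_AT_CPU; }
    else if (head_ == HEAD_AT_GPU) { cpu_ = gpu_; ++copies_; head_ = SYNCED; }
  }
  void to_gpu() {
    if (head_ == UNINITIALIZED) { head_ = HEAD_AT_GPU; }
    else if (head_ == HEAD_AT_CPU) { gpu_ = cpu_; ++copies_; head_ = SYNCED; }
  }

  std::vector<float> cpu_, gpu_;
  Head head_ = UNINITIALIZED;
  int copies_ = 0;
};
```

The point of the design is that a transfer happens only when one side is read after the other side was written; repeated reads on an already synced buffer cost nothing.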

Caffe does not use a symbolic-computation model.

Layer

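Layer objects are created with the factory method pattern.
Initialization is done by the SetUp function, whose main internal calls are LayerSetUp, Reshape, and SetLossWeights.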
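A minimal sketch of the factory idea: layer types register a creator function under a string name, and objects are built by looking that name up. Caffe's own mechanism is LayerRegistry together with the REGISTER_LAYER_CLASS macro; the names below (ToyLayer, ToyReLU, ToyLayerRegistry) are hypothetical and only illustrate the pattern.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical base class standing in for a layer interface.
struct ToyLayer {
  virtual ~ToyLayer() = default;
  virtual std::string type() const = 0;
};

struct ToyReLU : ToyLayer {
  std::string type() const override { return "ReLU"; }
};

// Maps a type name to a function that creates a layer of that type.
class ToyLayerRegistry {
 public:
  using Creator = std::function<std::unique_ptr<ToyLayer>()>;

  static void Add(const std::string& type, Creator creator) {
    Table()[type] = std::move(creator);
  }

  static std::unique_ptr<ToyLayer> Create(const std::string& type) {
    auto it = Table().find(type);
    if (it == Table().end()) {
      throw std::invalid_argument("unknown layer type: " + type);
    }
    return it->second();
  }

 private:
  // Function-local static avoids static initialization order problems.
  static std::map<std::string, Creator>& Table() {
    static std::map<std::string, Creator> table;
    return table;
  }
};
```

With this in place, a model description only needs to carry the type string; the code that builds the network never names a concrete layer class.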
Forward computation:

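// Excerpt from Layer<Dtype>::Forward (CPU branch).
Dtype loss = 0;
Forward_cpu(bottom, top);
for (int top_id = 0; top_id < top.size(); ++top_id) {
  // Skip top blobs that do not contribute to the loss.
  if (!this->loss(top_id)) { continue; }
  const int count = top[top_id]->count();
  const Dtype* data = top[top_id]->cpu_data();
  // The loss weights are stored in the top blob's diff.
  const Dtype* loss_weights = top[top_id]->cpu_diff();
  loss += caffe_cpu_dot(count, data, loss_weights);
}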
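The loop above reduces each loss-producing top blob to a dot product between its values and its loss weights. A standalone sketch of that arithmetic, with plain vectors standing in for the blob's data and diff (WeightedLoss is a hypothetical helper, not a Caffe function):

```cpp
#include <cstddef>
#include <vector>

// Dot product of a top blob's values with its loss weights.
// Assumes both vectors have the same length.
float WeightedLoss(const std::vector<float>& data,
                   const std::vector<float>& loss_weights) {
  float loss = 0.f;
  for (std::size_t i = 0; i < data.size(); ++i) {
    loss += data[i] * loss_weights[i];
  }
  return loss;
}
```

For example, values {0.5, 2.0} with weights {1.0, 0.25} give 0.5 * 1.0 + 2.0 * 0.25 = 1.0.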

How Net Assembles the Model and Data

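As an example, consider the Siamese model, a classic model in metric learning.
Metric learning looks for a way to measure the distance between the things we care about.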

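The C++ code creates the Layers, creates the bottom and top blobs, sets the learning rates and weights, and optimizes parameter memory, for example by sharing parameters.
For a single layer, bottom is the input and top is the output.
For the network as a whole, the bottom is the training data and the top is the value of the objective (loss) function.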
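The wiring step can be sketched as follows: each layer names its bottom and top blobs, and the builder resolves those names to blob indices, creating a new blob the first time a top name appears. This is a simplified illustration under assumed names (ToyLayerSpec, ToyNet, BuildNet are hypothetical); it ignores shapes, parameters, and parameter sharing.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical layer description: a name plus input and output blob names.
struct ToyLayerSpec {
  std::string name;
  std::vector<std::string> bottoms;
  std::vector<std::string> tops;
};

struct ToyNet {
  std::vector<std::string> blob_names;       // one entry per blob
  std::vector<std::vector<int>> bottom_ids;  // per layer: input blob ids
  std::vector<std::vector<int>> top_ids;     // per layer: output blob ids
};

ToyNet BuildNet(const std::vector<ToyLayerSpec>& layers) {
  ToyNet net;
  std::map<std::string, int> name_to_id;
  for (const ToyLayerSpec& layer : layers) {
    std::vector<int> bottoms, tops;
    // A bottom must have been produced by an earlier layer.
    for (const std::string& b : layer.bottoms) {
      auto it = name_to_id.find(b);
      if (it == name_to_id.end()) {
        throw std::invalid_argument("unknown bottom blob: " + b);
      }
      bottoms.push_back(it->second);
    }
    // A new top name creates a blob; an existing name is reused (in-place).
    for (const std::string& t : layer.tops) {
      auto it = name_to_id.find(t);
      if (it == name_to_id.end()) {
        const int id = static_cast<int>(net.blob_names.size());
        net.blob_names.push_back(t);
        it = name_to_id.emplace(t, id).first;
      }
      tops.push_back(it->second);
    }
    net.bottom_ids.push_back(bottoms);
    net.top_ids.push_back(tops);
  }
  return net;
}
```

For a Siamese-style network, a data layer would produce two inputs and a similarity label, two feature layers would each consume one input, and a loss layer would consume both features and the label.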

Solver Computation

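Within Solver<Dtype>::Step, the ApplyUpdate function performs the parameter update in five steps:

GetLearningRate(); // get the current learning rate
ClipGradients(); // clip the parameter gradients as a whole
Normalize(); // batch-level gradient normalization
Regularize(); // compute the gradient of the regularization term
ComputeUpdateValue(); // compute the update value from the gradient (scaled by the learning rate, with momentum in SGD)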
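The five steps can be sketched for a single parameter vector as below. This is a simplified illustration, not Caffe's code: it assumes a fixed learning-rate policy, L2 regularization, and SGD with momentum, and the types ToySolverParam and ToyParam and the function ToyApplyUpdate are hypothetical. The final subtraction corresponds to applying the computed update to the weights.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical solver settings (names chosen for this sketch).
struct ToySolverParam {
  float base_lr = 0.1f;
  float momentum = 0.9f;
  float weight_decay = 0.f;
  float clip_gradients = -1.f;  // negative disables clipping
  int iter_size = 1;            // number of accumulated mini-batches
};

// One learnable parameter: weights, gradient, and momentum history.
struct ToyParam {
  std::vector<float> data, diff, history;
};

void ToyApplyUpdate(const ToySolverParam& sp, ToyParam& p) {
  // 1. GetLearningRate: a fixed policy is assumed here.
  const float rate = sp.base_lr;

  // 2. ClipGradients: rescale if the gradient L2 norm exceeds the threshold.
  if (sp.clip_gradients >= 0.f) {
    float sumsq = 0.f;
    for (float g : p.diff) { sumsq += g * g; }
    const float norm = std::sqrt(sumsq);
    if (norm > sp.clip_gradients) {
      const float scale = sp.clip_gradients / norm;
      for (float& g : p.diff) { g *= scale; }
    }
  }

  // 3. Normalize: average gradients accumulated over iter_size batches.
  if (sp.iter_size > 1) {
    for (float& g : p.diff) { g /= static_cast<float>(sp.iter_size); }
  }

  for (std::size_t i = 0; i < p.data.size(); ++i) {
    // 4. Regularize: L2 weight decay adds weight_decay * w to the gradient.
    p.diff[i] += sp.weight_decay * p.data[i];
    // 5. ComputeUpdateValue: SGD with momentum.
    p.history[i] = sp.momentum * p.history[i] + rate * p.diff[i];
    p.diff[i] = p.history[i];
    // Apply the update to the weights.
    p.data[i] -= p.diff[i];
  }
}
```

As a worked case: with weight 1.0, gradient 4.0, clipping threshold 2.0, iter_size 2, learning rate 0.5, and no momentum or decay, the gradient is clipped to 2.0, averaged to 1.0, scaled to an update of 0.5, and the weight becomes 0.5.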

DataLayer

Handles data input.
Data is represented by the protobuf structure Datum, which is aimed mainly at image tasks.

Data Transformer

Handles data preprocessing.
The Transformer's parameters are defined in protobuf; it mainly preprocesses matrix data.
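Typical preprocessing of this kind subtracts a mean, multiplies by a scale, and optionally mirrors the image horizontally. A minimal sketch under those assumptions (ToyTransform is a hypothetical function; cropping and mean files are left out):

```cpp
#include <vector>

// Applies out = (in - mean[c]) * scale to a C x H x W byte image,
// optionally flipping each row left to right.
std::vector<float> ToyTransform(const std::vector<unsigned char>& chw,
                                int channels, int height, int width,
                                const std::vector<float>& mean_values,
                                float scale, bool mirror) {
  std::vector<float> out(chw.size());
  for (int c = 0; c < channels; ++c) {
    for (int h = 0; h < height; ++h) {
      for (int w = 0; w < width; ++w) {
        const int src = (c * height + h) * width + w;
        // When mirroring, column w is written to column (width - 1 - w).
        const int dst_w = mirror ? (width - 1 - w) : w;
        const int dst = (c * height + h) * width + dst_w;
        out[dst] = (static_cast<float>(chw[src]) - mean_values[c]) * scale;
      }
    }
  }
  return out;
}
```

For a one-channel 1 x 2 image {10, 20} with mean 10, scale 0.5, and mirroring on, the result is {5, 0}.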

C++

The Transform class can be extended: add the parameter definitions to caffe.proto, then add the new functionality to the transformation logic.

Python

The Python API is built on scikit-image, which loads images in RGB channel order. Caffe uses OpenCV, which loads images in BGR order, so a ChannelSwap is required.
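The swap itself is independent of the language. In pycaffe it is configured on the Transformer via set_channel_swap; the C++ sketch below only shows the underlying operation on an interleaved 3-channel buffer (SwapRedBlue is a hypothetical helper, and the interleaved layout is an assumption of this sketch).

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Converts interleaved RGB pixels to BGR (or back) in place
// by swapping the first and third byte of every pixel.
void SwapRedBlue(std::vector<unsigned char>& pixels) {
  for (std::size_t i = 0; i + 2 < pixels.size(); i += 3) {
    std::swap(pixels[i], pixels[i + 2]);
  }
}
```

Two pixels {1, 2, 3, 4, 5, 6} become {3, 2, 1, 6, 5, 4}; applying the function twice restores the original order.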

Extensions

Reference: A Discriminative Feature Learning Approach for Deep Face Recognition
The idea is to add control over intermediate results.
Center Loss
$\frac{1}{2N}\sum_{i=1}^{N}\sum_{k=1}^{K} I(x_i \in \mathrm{Class}_k)\,\|x_i - c_k\|_2^2$
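The formula can be evaluated directly: because the indicator selects exactly one class per sample, the double sum reduces to the squared distance between each feature and the center of its own class. A minimal sketch (CenterLoss is a hypothetical function; features and centers are plain nested vectors):

```cpp
#include <cstddef>
#include <vector>

// Center loss: 1/(2N) * sum_i ||x_i - c_{y_i}||^2,
// where labels[i] is the class index y_i of feature x_i.
float CenterLoss(const std::vector<std::vector<float>>& features,
                 const std::vector<int>& labels,
                 const std::vector<std::vector<float>>& centers) {
  if (features.empty()) { return 0.f; }
  float sum = 0.f;
  for (std::size_t i = 0; i < features.size(); ++i) {
    const std::vector<float>& c = centers[labels[i]];
    for (std::size_t d = 0; d < features[i].size(); ++d) {
      const float diff = features[i][d] - c[d];
      sum += diff * diff;
    }
  }
  return sum / (2.f * static_cast<float>(features.size()));
}
```

With features (1, 0) and (0, 2), labels 0 and 1, and both centers at the origin, the loss is (1 + 4) / (2 * 2) = 1.25.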

The parameters are defined in the proto file.
CenterLossLayer is defined and implemented in C++.

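// Scale each center's accumulated gradient by -alpha * lossWeight / (update count).
for (int i = 0; i < clusterNum; ++i)
{
    Dtype scale = -alpha * lossWeight / Dtype(center_update_count_[i]);
    caffe_scal(channels, scale, center_info_.mutable_cpu_diff() + i * channels);
}
// Blob::Update() applies data = data - diff to the centers.
center_info_.Update();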