Caffe Source Code Deep Dive 2: blob.hpp + blob.cpp

jiongnima

Article link: https://blog.csdn.net/jiongnima/article/details/55223822

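In Caffe Source Code Deep Dive 1 we saw that caffe.cpp uses the user-defined solver.prototxt file to train a network. The training entry point is the train() function, and train() calls Solve() to solve for the network parameters. Logically, the next file to analyze would be solver.cpp, but things are not that simple. Open solver.cpp and you will find that it relies on Net; look into Net and you will find that it depends on Layer, and Layer in turn depends on Blob. So we should start with Blob to understand how Caffe lays out its data at the lowest level, then move up to Layer, and then to Net and Solver. Climbing layer by layer is the way to understand Caffe completely.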

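Here is a link first: https://www.zhihu.com/question/27982282. I read the answers there and agree that reading the Caffe source in the order "blob -> layer -> net -> solver -> putting it together -> other features" is an efficient way to learn Caffe, and I recommend that readers follow the same order.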

Now let's start with blob.hpp and blob.cpp!

As usual, the annotated code comes first.

First, blob.hpp:

#ifndef CAFFE_BLOB_HPP_
#define CAFFE_BLOB_HPP_

#include <algorithm>
#include <string>
#include <vector>

#include "caffe/common.hpp"
#include "caffe/proto/caffe.pb.h"
#include "caffe/syncedmem.hpp"

const int kMaxBlobAxes = 32;

namespace caffe {

/**
 * @brief A wrapper around SyncedMemory holders serving as the basic
 *        computational unit through which Layer%s, Net%s, and Solver%s
 *        interact.
 *
 * TODO(dox): more thorough description.
 */
template <typename Dtype>
class Blob {
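 /* The Blob class starts with a default constructor that initializes four basic members (data_, diff_, count_, capacity_), all declared in the protected section. */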
 public:
  Blob()
       : data_(), diff_(), count_(0), capacity_(0) {}

  /// @brief Deprecated; use <code>Blob(const vector<int>& shape)</code>.
  explicit Blob(const int num, const int channels, const int height,
      const int width);  // construct a Blob from four integers (num, channels, height, width)
  explicit Blob(const vector<int>& shape);  // construct a Blob from a shape vector

  /// @brief Deprecated; use <code>Reshape(const vector<int>& shape)</code>.
  void Reshape(const int num, const int channels, const int height,
      const int width);  // reshape the Blob from four integers (num, channels, height, width)
  /**
   * @brief Change the dimensions of the blob, allocating new memory if
   *        necessary.
   *
   * This function can be called both to create an initial allocation
   * of memory, and to adjust the dimensions of a top blob during Layer::Reshape
   * or Layer::Forward. When changing the size of blob, memory will only be
   * reallocated if sufficient memory does not already exist, and excess memory
   * will never be freed.
   *
   * Note that reshaping an input blob and immediately calling Net::Backward is
   * an error; either Net::Forward or Net::Reshape need to be called to
   * propagate the new input shape to higher layers.
   */
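  void Reshape(const vector<int>& shape);  // reshape the Blob from a shape vector
  void Reshape(const BlobShape& shape);  // reshape the Blob from a BlobShape message
  void ReshapeLike(const Blob& other);  // reshape this Blob to the shape of other (no data is copied)
  inline string shape_string() const {  // inline function that formats the Blob's shape as a string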
    ostringstream stream;
    for (int i = 0; i < shape_.size(); ++i) {
      stream << shape_[i] << " ";
    }
    stream << "(" << count_ << ")";
    return stream.str();
  }
  inline const vector<int>& shape() const { return shape_; }  // returns the shape vector
  /**
   * @brief Returns the dimension of the index-th axis (or the negative index-th
   *        axis from the end, if index is negative).
   *
   * @param index the axis index, which may be negative as it will be
   *        "canonicalized" using CanonicalAxisIndex.
   *        Dies on out of range index.
   */
  /* Returns the size of the axis at the given index. Negative indices are supported: for example, if the shape
  is (N,C,H,W), then shape(0) returns N, shape(-1) returns W, and shape(-2) returns H. */
  inline int shape(int index) const {
    return shape_[CanonicalAxisIndex(index)];
  }
  inline int num_axes() const { return shape_.size(); }  // number of axes (dimensions) of the Blob
  inline int count() const { return count_; }  // total number of elements; N*C*H*W for a 4-D shape

  /**
   * @brief Compute the volume of a slice; i.e., the product of dimensions
   *        among a range of axes.
   *
   * @param start_axis The first axis to include in the slice.
   *
   * @param end_axis The first axis to exclude from the slice.
   */
   /* Slice count: the product of the shape entries from start_axis (inclusive) up to end_axis (exclusive); e.g. count(0,2) returns N*C. */
  inline int count(int start_axis, int end_axis) const {
    CHECK_LE(start_axis, end_axis);
    CHECK_GE(start_axis, 0);
    CHECK_GE(end_axis, 0);
    CHECK_LE(start_axis, num_axes());
    CHECK_LE(end_axis, num_axes());
    int count = 1;
    for (int i = start_axis; i < end_axis; ++i) {
      count *= shape(i);
    }
    return count;
  }
  /**
   * @brief Compute the volume of a slice spanning from a particular first
   *        axis to the final axis.
   *
   * @param start_axis The first axis to include in the slice.
   */
   /* Slice count: the product of the shape entries from start_axis (inclusive) through the last axis. */
  inline int count(int start_axis) const {
    return count(start_axis, num_axes());
  }

  /**
   * @brief Returns the 'canonical' version of a (usually) user-specified axis,
   *        allowing for negative indexing (e.g., -1 for the last axis).
   *
   * @param axis_index the axis index.
   *        If 0 <= index < num_axes(), return index.
   *        If -num_axes <= index <= -1, return (num_axes() - (-index)),
   *        e.g., the last axis index (num_axes() - 1) if index == -1,
   *        the second to last if index == -2, etc.
   *        Dies on out of range index.
   */
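   /* Inline function that converts a possibly negative axis index into its canonical non-negative form, after checking that the index is in range. */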
  inline int CanonicalAxisIndex(int axis_index) const {
    CHECK_GE(axis_index, -num_axes())
        << "axis " << axis_index << " out of range for " << num_axes()
        << "-D Blob with shape " << shape_string();
    CHECK_LT(axis_index, num_axes())
        << "axis " << axis_index << " out of range for " << num_axes()
        << "-D Blob with shape " << shape_string();
    if (axis_index < 0) {
      return axis_index + num_axes();  // map a negative index to its non-negative equivalent
    }
    return axis_index;
  }

  /* The next four accessors are deprecated; they return the four legacy dimensions of the Blob. Use shape(0), shape(1), shape(2), and shape(3) instead. */
  /// @brief Deprecated legacy shape accessor num: use shape(0) instead.
  inline int num() const { return LegacyShape(0); }
  /// @brief Deprecated legacy shape accessor channels: use shape(1) instead.
  inline int channels() const { return LegacyShape(1); }
  /// @brief Deprecated legacy shape accessor height: use shape(2) instead.
  inline int height() const { return LegacyShape(2); }
  /// @brief Deprecated legacy shape accessor width: use shape(3) instead.
  inline int width() const { return LegacyShape(3); }
  inline int LegacyShape(int index) const {  // inline function that also returns the size of the given axis, treating missing axes as 1
    CHECK_LE(num_axes(), 4)
        << "Cannot use legacy accessors on Blobs with > 4 axes.";
    CHECK_LT(index, 4);
    CHECK_GE(index, -4);
    if (index >= num_axes() || index < -num_axes()) {
      // Axis is out of range, but still in [0, 3] (or [-4, -1] for reverse
      // indexing) -- this special case simulates the one-padding used to fill
      // extraneous axes of legacy blobs.
      return 1;
    }
    return shape(index);
  }
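  /* The next two inline functions return the offset of an element within the Blob; for a Blob of shape (N,C,H,W), element (n,c,h,w) sits at offset ((n*C+c)*H+h)*W+w. */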
  inline int offset(const int n, const int c = 0, const int h = 0,
      const int w = 0) const {
    CHECK_GE(n, 0);
    CHECK_LE(n, num());
    CHECK_GE(channels(), 0);
    CHECK_LE(c, channels());
    CHECK_GE(height(), 0);
    CHECK_LE(h, height());
    CHECK_GE(width(), 0);
    CHECK_LE(w, width());
    return ((n * channels() + c) * height() + h) * width() + w;
  }
  /* The offset can also be computed from a vector of indices. */
  inline int offset(const vector<int>& indices) const {
    CHECK_LE(indices.size(), num_axes());
    int offset = 0;
    for (int i = 0; i < num_axes(); ++i) {
      offset *= shape(i);
      if (indices.size() > i) {
        CHECK_GE(indices[i], 0);
        CHECK_LT(indices[i], shape(i));
        offset += indices[i];
      }
    }
    return offset;
  }
  /**
   * @brief Copy from a source Blob.
   *
   * @param source the Blob to copy from
   * @param copy_diff if false, copy the data; if true, copy the diff
   * @param reshape if false, require this Blob to be pre-shaped to the shape
   *        of other (and die otherwise); if true, Reshape this Blob to other's
   *        shape if necessary
   */
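   /* CopyFrom copies from the source Blob; copy_diff selects whether the data or the diff is copied, and reshape controls whether this Blob may be reshaped to match the source. See the .cpp file for details. */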
  void CopyFrom(const Blob<Dtype>& source, bool copy_diff = false,
      bool reshape = false);
  // returns the data value stored on the CPU at (n, c, h, w); data is what the forward pass uses
  inline Dtype data_at(const int n, const int c, const int h,
      const int w) const {
    return cpu_data()[offset(n, c, h, w)];
  }
  // returns the diff value stored on the CPU at (n, c, h, w); diff is what the backward pass uses
  inline Dtype diff_at(const int n, const int c, const int h,
      const int w) const {
    return cpu_diff()[offset(n, c, h, w)];
  }
  // returns the data value stored on the CPU at the given index vector
  inline Dtype data_at(const vector<int>& index) const {
    return cpu_data()[offset(index)];
  }
  // returns the diff value stored on the CPU at the given index vector
  inline Dtype diff_at(const vector<int>& index) const {
    return cpu_diff()[offset(index)];
  }
  // returns the SyncedMemory holding the Blob's data (both the CPU and GPU copies)
  inline const shared_ptr<SyncedMemory>& data() const {
    CHECK(data_);
    return data_;
  }
  // returns the SyncedMemory holding the Blob's diff (both the CPU and GPU copies)
  inline const shared_ptr<SyncedMemory>& diff() const {
    CHECK(diff_);
    return diff_;
  }

  
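  const Dtype* cpu_data() const;  // read-only pointer to the data stored on the CPU
  void set_cpu_data(Dtype* data);  // point the CPU data at a caller-supplied buffer
  const int* gpu_shape() const;  // read-only pointer to the shape stored on the GPU
  const Dtype* gpu_data() const;  // read-only pointer to the data stored on the GPU
  const Dtype* cpu_diff() const;  // read-only pointer to the gradient stored on the CPU
  const Dtype* gpu_diff() const;  // read-only pointer to the gradient stored on the GPU
  Dtype* mutable_cpu_data();  // writable pointer to the data on the CPU; call before modifying the data
  Dtype* mutable_gpu_data();  // writable pointer to the data on the GPU; call before modifying the data
  Dtype* mutable_cpu_diff();  // writable pointer to the gradient on the CPU; call before modifying the gradient
  Dtype* mutable_gpu_diff();  // writable pointer to the gradient on the GPU; call before modifying the gradient
  void Update();  // update the stored data using the diff
  void FromProto(const BlobProto& proto, bool reshape = true);  // read a BlobProto into this Blob
  void ToProto(BlobProto* proto, bool write_diff = false) const;  // write this Blob out to a BlobProto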

  /// @brief Compute the sum of absolute values (L1 norm) of the data.
  Dtype asum_data() const;  // L1 norm of the data
  /// @brief Compute the sum of absolute values (L1 norm) of the diff.
  Dtype asum_diff() const;  // L1 norm of the gradient
  /// @brief Compute the sum of squares (L2 norm squared) of the data.
  Dtype sumsq_data() const;  // squared L2 norm (sum of squares) of the data
  /// @brief Compute the sum of squares (L2 norm squared) of the diff.
  Dtype sumsq_diff() const;  // squared L2 norm (sum of squares) of the gradient

  /// @brief Scale the blob data by a constant factor.
  void scale_data(Dtype scale_factor);  // multiply the data by a constant factor
  /// @brief Scale the blob diff by a constant factor.
  void scale_diff(Dtype scale_factor);  // multiply the gradient by a constant factor

  /**
   * @brief Set the data_ shared_ptr to point to the SyncedMemory holding the
   *        data_ of Blob other -- useful in Layer%s which simply perform a copy
   *        in their Forward pass.
   *
   * This deallocates the SyncedMemory holding this Blob's data_, as
   * shared_ptr calls its destructor when reset with the "=" operator.
   */
  void ShareData(const Blob& other);  // share data with another Blob
  /**
   * @brief Set the diff_ shared_ptr to point to the SyncedMemory holding the
   *        diff_ of Blob other -- useful in Layer%s which simply perform a copy
   *        in their Forward pass.
   *
   * This deallocates the SyncedMemory holding this Blob's diff_, as
   * shared_ptr calls its destructor when reset with the "=" operator.
   */
  void ShareDiff(const Blob& other);  // share the gradient with another Blob

  bool ShapeEquals(const BlobProto& other);  // checks whether this Blob's shape matches the shape recorded in a BlobProto

 protected:
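  shared_ptr<SyncedMemory> data_;  // the Blob's data
  shared_ptr<SyncedMemory> diff_;  // the Blob's gradient
  shared_ptr<SyncedMemory> shape_data_;  // copy of the shape kept in SyncedMemory; gpu_shape() reads it on the GPU
  vector<int> shape_;  // the Blob's shape, e.g. (N,C,H,W)
  int count_;  // total number of elements, e.g. N*C*H*W
  int capacity_;  // number of elements currently allocated; it can exceed count_ because the shape may change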

  DISABLE_COPY_AND_ASSIGN(Blob);
};  // class Blob

}  // namespace caffe

#endif  // CAFFE_BLOB_HPP_

Next, blob.cpp:

#include <climits>
#include <vector>

#include "caffe/blob.hpp"
#include "caffe/common.hpp"
#include "caffe/syncedmem.hpp"
#include "caffe/util/math_functions.hpp"

namespace caffe {

template <typename Dtype>  // Reshape overload that sets the Blob's shape from four integers
void Blob<Dtype>::Reshape(const int num, const int channels, const int height,
    const int width) {
  vector<int> shape(4);
  shape[0] = num;
  shape[1] = channels;
  shape[2] = height;
  shape[3] = width;
  Reshape(shape);  // delegates to the vector<int> overload defined right below
}

template <typename Dtype>
void Blob<Dtype>::Reshape(const vector<int>& shape) {
  CHECK_LE(shape.size(), kMaxBlobAxes);
  count_ = 1;
  shape_.resize(shape.size());
  if (!shape_data_ || shape_data_->size() < shape.size() * sizeof(int)) {
    shape_data_.reset(new SyncedMemory(shape.size() * sizeof(int)));
  }
  int* shape_data = static_cast<int*>(shape_data_->mutable_cpu_data());
  for (int i = 0; i < shape.size(); ++i) {
    CHECK_GE(shape[i], 0);
    if (count_ != 0) {
      CHECK_LE(shape[i], INT_MAX / count_) << "blob size exceeds INT_MAX";
    }
    count_ *= shape[i];
    shape_[i] = shape[i];  // the new shape is written to both shape_ and shape_data_
    shape_data[i] = shape[i];
  }
  if (count_ > capacity_) {
    capacity_ = count_;  // storage only grows: memory is reallocated when the new count exceeds the capacity, and a smaller count keeps the existing memory
    data_.reset(new SyncedMemory(capacity_ * sizeof(Dtype)));
    diff_.reset(new SyncedMemory(capacity_ * sizeof(Dtype)));
  }
}

template <typename Dtype>
void Blob<Dtype>::Reshape(const BlobShape& shape) {
  CHECK_LE(shape.dim_size(), kMaxBlobAxes);
  vector<int> shape_vec(shape.dim_size());  // convert the BlobShape into a vector<int>, then delegate to the vector overload
  for (int i = 0; i < shape.dim_size(); ++i) {
    shape_vec[i] = shape.dim(i);
  }
  Reshape(shape_vec);
}

template <typename Dtype>
void Blob<Dtype>::ReshapeLike(const Blob<Dtype>& other) {
  Reshape(other.shape());  // reshape this Blob to the shape of another Blob
}

template <typename Dtype>
Blob<Dtype>::Blob(const int num, const int channels, const int height,
    const int width)
  // capacity_ must be initialized before calling Reshape
  : capacity_(0) {
  Reshape(num, channels, height, width);  // capacity_ is initialized first, then the shape is set from the four integers
}

template <typename Dtype>
Blob<Dtype>::Blob(const vector<int>& shape)
  // capacity_ must be initialized before calling Reshape
  : capacity_(0) {
  Reshape(shape);  // capacity_ is initialized first, then the shape is set from the vector<int>
}

template <typename Dtype>
const int* Blob<Dtype>::gpu_shape() const {
  CHECK(shape_data_);
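  /* The gpu_data() called here is not Blob::gpu_data() from this file but the gpu_data() method of the SyncedMemory class, which is
explained in the SyncedMemory walkthrough. Because the return type is const, this gives read-only access to the shape stored on the GPU. */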
  return (const int*)shape_data_->gpu_data();
}

template <typename Dtype>
const Dtype* Blob<Dtype>::cpu_data() const {
  CHECK(data_);
  return (const Dtype*)data_->cpu_data();  // SyncedMemory::cpu_data(): read-only pointer to the data on the CPU
}

template <typename Dtype>
void Blob<Dtype>::set_cpu_data(Dtype* data) {
  CHECK(data);
  data_->set_cpu_data(data);  // SyncedMemory::set_cpu_data(): makes the CPU data pointer refer to data, so CPU reads start from that buffer
}

template <typename Dtype>  // read-only pointer to the data stored on the GPU
const Dtype* Blob<Dtype>::gpu_data() const {
  CHECK(data_);
  return (const Dtype*)data_->gpu_data();
}

template <typename Dtype>  // read-only pointer to the gradient stored on the CPU
const Dtype* Blob<Dtype>::cpu_diff() const {
  CHECK(diff_);
  return (const Dtype*)diff_->cpu_data();
}

template <typename Dtype>  // read-only pointer to the gradient stored on the GPU
const Dtype* Blob<Dtype>::gpu_diff() const {
  CHECK(diff_);
  return (const Dtype*)diff_->gpu_data();
}

template <typename Dtype>  // writable pointer to the data on the CPU, normally obtained before modifying it; uses SyncedMemory::mutable_cpu_data()
Dtype* Blob<Dtype>::mutable_cpu_data() {
  CHECK(data_);
  return static_cast<Dtype*>(data_->mutable_cpu_data());
}

template <typename Dtype>  // writable pointer to the data on the GPU, normally obtained before modifying it; uses SyncedMemory::mutable_gpu_data()
Dtype* Blob<Dtype>::mutable_gpu_data() {
  CHECK(data_);
  return static_cast<Dtype*>(data_->mutable_gpu_data());
}

template <typename Dtype>  // writable pointer to the gradient on the CPU, normally obtained before modifying it; uses SyncedMemory::mutable_cpu_data(), called on diff_
Dtype* Blob<Dtype>::mutable_cpu_diff() {
  CHECK(diff_);
  return static_cast<Dtype*>(diff_->mutable_cpu_data());
}

template <typename Dtype>  // writable pointer to the gradient on the GPU, normally obtained before modifying it; uses SyncedMemory::mutable_gpu_data(), called on diff_
Dtype* Blob<Dtype>::mutable_gpu_diff() {
  CHECK(diff_);
  return static_cast<Dtype*>(diff_->mutable_gpu_data());
}

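template <typename Dtype>  // share data with another Blob
void Blob<Dtype>::ShareData(const Blob& other) {
  CHECK_EQ(count_, other.count());
  data_ = other.data();  // assign the shared_ptr, so both Blobs refer to the same memory
}

template <typename Dtype>  // share the gradient with another Blob
void Blob<Dtype>::ShareDiff(const Blob& other) {
  CHECK_EQ(count_, other.count());
  diff_ = other.diff();  // assign the shared_ptr, so both Blobs refer to the same memory
}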

// The "update" method is used for parameter blobs in a Net, which are stored
// as Blob<float> or Blob<double> -- hence we do not define it for
// Blob<int> or Blob<unsigned int>.
template <> void Blob<unsigned int>::Update() { NOT_IMPLEMENTED; }
template <> void Blob<int>::Update() { NOT_IMPLEMENTED; }

template <typename Dtype>
void Blob<Dtype>::Update() {
  // We will perform update based on where the data is located.
  switch (data_->head()) {
  case SyncedMemory::HEAD_AT_CPU:
    // perform computation on CPU
    caffe_axpy<Dtype>(count_, Dtype(-1),  // caffe_axpy lives in math_functions.cpp and computes Y = alpha * X + Y
        static_cast<const Dtype*>(diff_->cpu_data()),  // X: the gradient values
        static_cast<Dtype*>(data_->mutable_cpu_data()));  // Y: the data, so with alpha = -1 this is data = data - diff
    break;
  case SyncedMemory::HEAD_AT_GPU:
  case SyncedMemory::SYNCED:
#ifndef CPU_ONLY
    // perform computation on GPU
    caffe_gpu_axpy<Dtype>(count_, Dtype(-1),
        static_cast<const Dtype*>(diff_->gpu_data()),
        static_cast<Dtype*>(data_->mutable_gpu_data()));
#else
    NO_GPU;
#endif
    break;
  default:
    LOG(FATAL) << "Syncedmem not initialized.";
  }
}

template <> unsigned int Blob<unsigned int>::asum_data() const {
  NOT_IMPLEMENTED;
  return 0;
}

template <> int Blob<int>::asum_data() const {
  NOT_IMPLEMENTED;
  return 0;
}

template <typename Dtype>
Dtype Blob<Dtype>::asum_data() const {
  if (!data_) { return 0; }
  switch (data_->head()) {
  case SyncedMemory::HEAD_AT_CPU:
    return caffe_cpu_asum(count_, cpu_data());  // caffe_cpu_asum (in math_functions.cpp) computes the L1 norm of the data
  case SyncedMemory::HEAD_AT_GPU:
  case SyncedMemory::SYNCED:
#ifndef CPU_ONLY
  {
    Dtype asum;
    caffe_gpu_asum(count_, gpu_data(), &asum);
    return asum;
  }
#else
    NO_GPU;
#endif
  case SyncedMemory::UNINITIALIZED:
    return 0;
  default:
    LOG(FATAL) << "Unknown SyncedMemory head state: " << data_->head();
  }
  return 0;
}

template <> unsigned int Blob<unsigned int>::asum_diff() const {
  NOT_IMPLEMENTED;
  return 0;
}

template <> int Blob<int>::asum_diff() const {
  NOT_IMPLEMENTED;
  return 0;
}

template <typename Dtype>
Dtype Blob<Dtype>::asum_diff() const {
  if (!diff_) { return 0; }
  switch (diff_->head()) {
  case SyncedMemory::HEAD_AT_CPU:
    return caffe_cpu_asum(count_, cpu_diff());  // same as for the data, applied to the gradient
  case SyncedMemory::HEAD_AT_GPU:
  case SyncedMemory::SYNCED:
#ifndef CPU_ONLY
  {
    Dtype asum;
    caffe_gpu_asum(count_, gpu_diff(), &asum);
    return asum;
  }
#else
    NO_GPU;
#endif
  case SyncedMemory::UNINITIALIZED:
    return 0;
  default:
    LOG(FATAL) << "Unknown SyncedMemory head state: " << diff_->head();
  }
  return 0;
}

template <> unsigned int Blob<unsigned int>::sumsq_data() const {
  NOT_IMPLEMENTED;
  return 0;
}

template <> int Blob<int>::sumsq_data() const {
  NOT_IMPLEMENTED;
  return 0;
}

template <typename Dtype>
Dtype Blob<Dtype>::sumsq_data() const {
  Dtype sumsq;
  const Dtype* data;
  if (!data_) { return 0; }
  switch (data_->head()) {
  case SyncedMemory::HEAD_AT_CPU:
    data = cpu_data();
    sumsq = caffe_cpu_dot(count_, data, data);  // caffe_cpu_dot (in math_functions.cpp) takes the dot product of data with itself, i.e. the squared L2 norm
    break;
  case SyncedMemory::HEAD_AT_GPU:
  case SyncedMemory::SYNCED:
#ifndef CPU_ONLY
    data = gpu_data();
    caffe_gpu_dot(count_, data, data, &sumsq);
#else
    NO_GPU;
#endif
    break;
  case SyncedMemory::UNINITIALIZED:
    return 0;
  default:
    LOG(FATAL) << "Unknown SyncedMemory head state: " << data_->head();
  }
  return sumsq;
}

template <> unsigned int Blob<unsigned int>::sumsq_diff() const {
  NOT_IMPLEMENTED;
  return 0;
}

template <> int Blob<int>::sumsq_diff() const {
  NOT_IMPLEMENTED;
  return 0;
}

template <typename Dtype>
Dtype Blob<Dtype>::sumsq_diff() const {
  Dtype sumsq;
  const Dtype* diff;
  if (!diff_) { return 0; }
  switch (diff_->head()) {
  case SyncedMemory::HEAD_AT_CPU:
    diff = cpu_diff();
    sumsq = caffe_cpu_dot(count_, diff, diff);  // same as for the data, applied to the gradient
    break;
  case SyncedMemory::HEAD_AT_GPU:
  case SyncedMemory::SYNCED:
#ifndef CPU_ONLY
    diff = gpu_diff();
    caffe_gpu_dot(count_, diff, diff, &sumsq);
    break;
#else
    NO_GPU;
#endif
  case SyncedMemory::UNINITIALIZED:
    return 0;
  default:
    LOG(FATAL) << "Unknown SyncedMemory head state: " << data_->head();
  }
  return sumsq;
}

template <> void Blob<unsigned int>::scale_data(unsigned int scale_factor) {
  NOT_IMPLEMENTED;
}

template <> void Blob<int>::scale_data(int scale_factor) {
  NOT_IMPLEMENTED;
}

template <typename Dtype>
void Blob<Dtype>::scale_data(Dtype scale_factor) {
  Dtype* data;
  if (!data_) { return; }
  switch (data_->head()) {
  case SyncedMemory::HEAD_AT_CPU:
    data = mutable_cpu_data();
    caffe_scal(count_, scale_factor, data);  // caffe_scal scales the data in place
    return;
  case SyncedMemory::HEAD_AT_GPU:
  case SyncedMemory::SYNCED:
#ifndef CPU_ONLY
    data = mutable_gpu_data();
    caffe_gpu_scal(count_, scale_factor, data);
    return;
#else
    NO_GPU;
#endif
  case SyncedMemory::UNINITIALIZED:
    return;
  default:
    LOG(FATAL) << "Unknown SyncedMemory head state: " << data_->head();
  }
}

template <> void Blob<unsigned int>::scale_diff(unsigned int scale_factor) {
  NOT_IMPLEMENTED;
}

template <> void Blob<int>::scale_diff(int scale_factor) {
  NOT_IMPLEMENTED;
}

template <typename Dtype>
void Blob<Dtype>::scale_diff(Dtype scale_factor) {
  Dtype* diff;
  if (!diff_) { return; }
  switch (diff_->head()) {
  case SyncedMemory::HEAD_AT_CPU:
    diff = mutable_cpu_diff();
    caffe_scal(count_, scale_factor, diff);  // caffe_scal scales the gradient in place
    return;
  case SyncedMemory::HEAD_AT_GPU:
  case SyncedMemory::SYNCED:
#ifndef CPU_ONLY
    diff = mutable_gpu_diff();
    caffe_gpu_scal(count_, scale_factor, diff);
    return;
#else
    NO_GPU;
#endif
  case SyncedMemory::UNINITIALIZED:
    return;
  default:
    LOG(FATAL) << "Unknown SyncedMemory head state: " << diff_->head();
  }
}

template <typename Dtype>
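bool Blob<Dtype>::ShapeEquals(const BlobProto& other) {  // checks whether this Blob's shape matches the shape in a BlobProto; a legacy proto is compared through num, channels, height, and width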
  if (other.has_num() || other.has_channels() ||
      other.has_height() || other.has_width()) {
    // Using deprecated 4D Blob dimensions --
    // shape is (num, channels, height, width).
    // Note: we do not use the normal Blob::num(), Blob::channels(), etc.
    // methods as these index from the beginning of the blob shape, where legacy
    // parameter blobs were indexed from the end of the blob shape (e.g., bias
    // Blob shape (1 x 1 x 1 x N), IP layer weight Blob shape (1 x 1 x M x N)).
    return shape_.size() <= 4 &&
           LegacyShape(-4) == other.num() &&
           LegacyShape(-3) == other.channels() &&
           LegacyShape(-2) == other.height() &&
           LegacyShape(-1) == other.width();
  }
  vector<int> other_shape(other.shape().dim_size());
  for (int i = 0; i < other.shape().dim_size(); ++i) {
    other_shape[i] = other.shape().dim(i);
  }
  return shape_ == other_shape;
}

/* This function copies from the source blob. The bool reshape controls whether this blob may be reshaped to the source's shape,
and copy_diff selects what is copied: the diff if true, the data if false. */
template <typename Dtype>
void Blob<Dtype>::CopyFrom(const Blob& source, bool copy_diff, bool reshape) {
  if (source.count() != count_ || source.shape() != shape_) {
    if (reshape) {
      ReshapeLike(source);
    } else {
      LOG(FATAL) << "Trying to copy blobs of different sizes.";
    }
  }
  switch (Caffe::mode()) {
  case Caffe::GPU:
    if (copy_diff) {
      caffe_copy(count_, source.gpu_diff(),
          static_cast<Dtype*>(diff_->mutable_gpu_data()));
    } else {
      caffe_copy(count_, source.gpu_data(),
          static_cast<Dtype*>(data_->mutable_gpu_data()));
    }
    break;
  case Caffe::CPU:
    if (copy_diff) {
      caffe_copy(count_, source.cpu_diff(),
          static_cast<Dtype*>(diff_->mutable_cpu_data()));
    } else {
      caffe_copy(count_, source.cpu_data(),
          static_cast<Dtype*>(data_->mutable_cpu_data()));
    }
    break;
  default:
    LOG(FATAL) << "Unknown caffe mode.";
  }
}

template <typename Dtype>
void Blob<Dtype>::FromProto(const BlobProto& proto, bool reshape) {
  if (reshape) {  // FromProto copies the shape, data, and diff from the proto into the Blob
    vector<int> shape;
    if (proto.has_num() || proto.has_channels() ||
        proto.has_height() || proto.has_width()) {
      // Using deprecated 4D Blob dimensions --
      // shape is (num, channels, height, width).
      shape.resize(4);
      shape[0] = proto.num();
      shape[1] = proto.channels();
      shape[2] = proto.height();
      shape[3] = proto.width();
    } else {
      shape.resize(proto.shape().dim_size());
      for (int i = 0; i < proto.shape().dim_size(); ++i) {
        shape[i] = proto.shape().dim(i);
      }
    }
    Reshape(shape);
  } else {
    CHECK(ShapeEquals(proto)) << "shape mismatch (reshape not set)";
  }
  // copy data
  Dtype* data_vec = mutable_cpu_data();
  if (proto.double_data_size() > 0) {
    CHECK_EQ(count_, proto.double_data_size());
    for (int i = 0; i < count_; ++i) {
      data_vec[i] = proto.double_data(i);
    }
  } else {
    CHECK_EQ(count_, proto.data_size());
    for (int i = 0; i < count_; ++i) {
      data_vec[i] = proto.data(i);
    }
  }
  if (proto.double_diff_size() > 0) {
    CHECK_EQ(count_, proto.double_diff_size());
    Dtype* diff_vec = mutable_cpu_diff();
    for (int i = 0; i < count_; ++i) {
      diff_vec[i] = proto.double_diff(i);
    }
  } else if (proto.diff_size() > 0) {
    CHECK_EQ(count_, proto.diff_size());
    Dtype* diff_vec = mutable_cpu_diff();
    for (int i = 0; i < count_; ++i) {
      diff_vec[i] = proto.diff(i);
    }
  }
}

template <>
void Blob<double>::ToProto(BlobProto* proto, bool write_diff) const {
  proto->clear_shape();
  for (int i = 0; i < shape_.size(); ++i) {
    proto->mutable_shape()->add_dim(shape_[i]);
  }
  proto->clear_double_data();
  proto->clear_double_diff();
  const double* data_vec = cpu_data();
  for (int i = 0; i < count_; ++i) {
    proto->add_double_data(data_vec[i]);
  }
  if (write_diff) {
    const double* diff_vec = cpu_diff();
    for (int i = 0; i < count_; ++i) {
      proto->add_double_diff(diff_vec[i]);
    }
  }
}

template <>  // ToProto writes the Blob's contents into the proto
void Blob<float>::ToProto(BlobProto* proto, bool write_diff) const {
  proto->clear_shape();
  for (int i = 0; i < shape_.size(); ++i) {
    proto->mutable_shape()->add_dim(shape_[i]);
  }
  proto->clear_data();
  proto->clear_diff();
  const float* data_vec = cpu_data();  // calls Blob::cpu_data() defined earlier in this file
  for (int i = 0; i < count_; ++i) {
    proto->add_data(data_vec[i]);
  }
  if (write_diff) {
    const float* diff_vec = cpu_diff();  // calls Blob::cpu_diff() defined earlier in this file
    for (int i = 0; i < count_; ++i) {
      proto->add_diff(diff_vec[i]);
    }
  }
}

INSTANTIATE_CLASS(Blob);
template class Blob<int>;
template class Blob<unsigned int>;

}  // namespace caffe

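The full code and its annotations are shown above. The Blob code is clearly structured, and the purpose of most functions is obvious at a glance. It also wraps the low-level details well and gives the network a solid way to move and store data.
Let's first summarize what a Blob contains, or in other words, what a Blob actually is.
As the unit of data storage in a network, a Blob holds five kinds of information. The first is the "data", held through the data_ pointer, which takes part in forward propagation. The second is the "gradient", held through the diff_ pointer, which is central to backpropagation. The third is shape_, which typically has four entries, N (number), C (channels), H (height), and W (width); these describe the Blob's dimensions. The fourth is count_, the number of elements in the Blob, which equals N*C*H*W. The last is capacity_, the number of elements the Blob has allocated memory for; Reshape raises it to count_ whenever count_ exceeds it.
Next, let's look at what the Blob class can do.
First come the functions for initialization and reshaping: the constructors and Reshape. Their main job is to set or change the Blob's shape. Reshape shows up again in Layer: because each layer's data has different dimensions, Reshape gets heavy use as data flows through the network. We will not go into that here.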

Blob()
       : data_(), diff_(), count_(0), capacity_(0) {}
 explicit Blob(const int num, const int channels, const int height,
      const int width);
  explicit Blob(const vector<int>& shape);
  void Reshape(const int num, const int channels, const int height,
      const int width);
  void Reshape(const vector<int>& shape);
  void Reshape(const BlobShape& shape);
  void ReshapeLike(const Blob& other);
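
To make the relationship between shape_, count_, and capacity_ concrete, here is a minimal standalone sketch of the bookkeeping that Reshape performs in the listing above. MiniBlob and its allocations counter are hypothetical names of my own, plain std::vector stands in for SyncedMemory, and the INT_MAX overflow check is left out, so treat it as an illustration rather than Caffe's actual class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the shape/count/capacity bookkeeping in Blob.
struct MiniBlob {
  std::vector<int> shape;
  int count = 0;        // number of elements implied by the current shape
  int capacity = 0;     // number of elements the buffers can hold
  int allocations = 0;  // how many times the buffers were (re)allocated
  std::vector<float> data;
  std::vector<float> diff;

  void Reshape(const std::vector<int>& new_shape) {
    count = 1;
    for (int dim : new_shape) {
      assert(dim >= 0);
      count *= dim;
    }
    shape = new_shape;
    // Memory is reallocated only when the new count does not fit.
    if (count > capacity) {
      capacity = count;
      data.assign(static_cast<std::size_t>(capacity), 0.0f);
      diff.assign(static_cast<std::size_t>(capacity), 0.0f);
      ++allocations;
    }
  }
};
```

Reshaping a MiniBlob from (2,3,4,5) down to (1,3,4,5) changes count from 120 to 60 but leaves capacity at 120 with no new allocation; only a larger shape such as (4,3,4,5) triggers a second allocation.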

Next, there are several count functions, which compute the number of elements in the Blob, either in total or over a range of axes:

inline int count() const { return count_; }
inline int count(int start_axis, int end_axis) const
inline int count(int start_axis) const
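
The slice counts and the negative-index handling from blob.hpp can be tried out in isolation. The sketch below reimplements that logic on a plain std::vector<int>; SliceCount and CanonicalAxis are names I made up for the example, and assert replaces Caffe's CHECK macros.

```cpp
#include <cassert>
#include <vector>

// Maps a possibly negative axis index into [0, num_axes), in the spirit of
// Blob::CanonicalAxisIndex.
inline int CanonicalAxis(const std::vector<int>& shape, int axis) {
  const int num_axes = static_cast<int>(shape.size());
  assert(axis >= -num_axes && axis < num_axes);
  return axis < 0 ? axis + num_axes : axis;
}

// Product of shape[start_axis] ... shape[end_axis - 1], in the spirit of
// Blob::count(start_axis, end_axis). An empty range yields 1.
inline int SliceCount(const std::vector<int>& shape, int start_axis,
                      int end_axis) {
  const int num_axes = static_cast<int>(shape.size());
  assert(0 <= start_axis && start_axis <= end_axis && end_axis <= num_axes);
  int count = 1;
  for (int i = start_axis; i < end_axis; ++i) {
    count *= shape[i];
  }
  return count;
}

// Product from start_axis through the last axis, like Blob::count(start_axis).
inline int SliceCount(const std::vector<int>& shape, int start_axis) {
  return SliceCount(shape, start_axis, static_cast<int>(shape.size()));
}
```

For a shape of (2,3,4,5), SliceCount(shape, 0, 2) gives N*C = 6, SliceCount(shape, 1) gives C*H*W = 60, and CanonicalAxis(shape, -1) gives 3, the index of W.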

Then come the functions used to access data during the forward and backward passes:

inline int offset(const vector<int>& indices) const
inline Dtype data_at(const int n, const int c, const int h,
      const int w) const
inline Dtype diff_at(const int n, const int c, const int h,
      const int w) const
inline Dtype data_at(const vector<int>& index) const
inline Dtype diff_at(const vector<int>& index) const
inline const shared_ptr<SyncedMemory>& data() const
inline const shared_ptr<SyncedMemory>& diff() const
const Dtype* cpu_data() const;
void set_cpu_data(Dtype* data);
const int* gpu_shape() const;
const Dtype* gpu_data() const;
const Dtype* cpu_diff() const;
const Dtype* gpu_diff() const;
Dtype* mutable_cpu_data();
Dtype* mutable_gpu_data();
Dtype* mutable_cpu_diff();
Dtype* mutable_gpu_diff();

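The offset function returns the position of an element within the Blob, and data_at and diff_at use that position to return the data value and the gradient value at the given location. data() and diff() return the Blob's data_ and diff_ pointers directly, and set_cpu_data points the CPU data at a caller-supplied buffer. gpu_shape() returns a pointer to the Blob's shape stored on the GPU. That leaves eight functions, four without mutable and four with it. The four without mutable are read-only accessors: they return const pointers to the data/diff on the CPU/GPU. The four with mutable return writable data/diff pointers and are normally called before modifying the data/gradient on the CPU/GPU. Note that all eight rely on methods of the SyncedMemory class, so they are not fully clear from Blob alone; that is left for the next post.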
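
The row-major offset formula is easy to check by hand. The following sketch implements both forms of offset on a plain shape vector; Offset4D and OffsetND are hypothetical names, and the bounds are checked strictly with assert, which is a simplification of the CHECK calls in blob.hpp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Offset of element (n, c, h, w) in a row-major (N, C, H, W) array.
inline int Offset4D(const std::vector<int>& shape, int n, int c = 0,
                    int h = 0, int w = 0) {
  assert(shape.size() == 4);
  assert(n >= 0 && n < shape[0]);
  assert(c >= 0 && c < shape[1]);
  assert(h >= 0 && h < shape[2]);
  assert(w >= 0 && w < shape[3]);
  return ((n * shape[1] + c) * shape[2] + h) * shape[3] + w;
}

// Same idea for any number of axes; indices that are not supplied count as 0.
inline int OffsetND(const std::vector<int>& shape,
                    const std::vector<int>& indices) {
  assert(indices.size() <= shape.size());
  int offset = 0;
  for (std::size_t i = 0; i < shape.size(); ++i) {
    offset *= shape[i];
    if (i < indices.size()) {
      assert(indices[i] >= 0 && indices[i] < shape[i]);
      offset += indices[i];
    }
  }
  return offset;
}
```

With shape (2,3,4,5), element (1,2,3,4) is the last one, so its offset is ((1*3+2)*4+3)*5+4 = 119, and OffsetND with only {1} supplied gives 60, the start of the second item in the batch.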
Here is an example from the official documentation, which also shows how the SyncedMemory class keeps the CPU and GPU copies in sync:

// Assume the data is initialized on the CPU and we have a blob
	const Dtype* foo;
	Dtype* bar;
	foo = blob.gpu_data(); // data copied from CPU to GPU
	foo = blob.cpu_data(); // no data copied, both sides are up to date
	bar = blob.mutable_gpu_data(); // no data copied
	// ... some operations ...
	bar = blob.mutable_gpu_data(); // still on the GPU, no data copied
	foo = blob.cpu_data(); // the GPU side modified the values, so data is copied from GPU to CPU
	foo = blob.gpu_data(); // no data copied, both sides are up to date
	bar = blob.mutable_cpu_data(); // still no data copied
	bar = blob.mutable_gpu_data(); // data copied from CPU to GPU
	bar = blob.mutable_cpu_data(); // data copied from GPU to CPU
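
The copy pattern in that example follows from a small state machine over the head states that also appear in blob.cpp (UNINITIALIZED, HEAD_AT_CPU, HEAD_AT_GPU, SYNCED). Below is a toy model of it, written under my own assumptions: ToySyncedMemory is a hypothetical class, the "GPU" buffer is just a second host vector so the code runs without CUDA, and the copy counters exist only for observation. It is meant to mirror the behavior the official example describes, not to reproduce SyncedMemory's real implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Head states named after the SyncedMemory enum values used in blob.cpp.
enum class Head { UNINITIALIZED, HEAD_AT_CPU, HEAD_AT_GPU, SYNCED };

// Toy model of lazy CPU/GPU synchronization (the "GPU" is a host vector).
class ToySyncedMemory {
 public:
  explicit ToySyncedMemory(std::size_t n) : cpu_(n), gpu_(n) {}

  const float* cpu_data() { ToCpu(); return cpu_.data(); }
  const float* gpu_data() { ToGpu(); return gpu_.data(); }
  float* mutable_cpu_data() {
    ToCpu();
    head_ = Head::HEAD_AT_CPU;  // the CPU copy may now be the only fresh one
    return cpu_.data();
  }
  float* mutable_gpu_data() {
    ToGpu();
    head_ = Head::HEAD_AT_GPU;  // the GPU copy may now be the only fresh one
    return gpu_.data();
  }
  Head head() const { return head_; }

  int cpu_to_gpu_copies = 0;  // counters so the transfers can be observed
  int gpu_to_cpu_copies = 0;

 private:
  void ToCpu() {
    if (head_ == Head::UNINITIALIZED) {
      head_ = Head::HEAD_AT_CPU;
    } else if (head_ == Head::HEAD_AT_GPU) {
      cpu_ = gpu_;  // stands in for a device-to-host copy
      ++gpu_to_cpu_copies;
      head_ = Head::SYNCED;
    }
  }
  void ToGpu() {
    if (head_ == Head::UNINITIALIZED) {
      head_ = Head::HEAD_AT_GPU;
    } else if (head_ == Head::HEAD_AT_CPU) {
      gpu_ = cpu_;  // stands in for a host-to-device copy
      ++cpu_to_gpu_copies;
      head_ = Head::SYNCED;
    }
  }

  std::vector<float> cpu_;
  std::vector<float> gpu_;
  Head head_ = Head::UNINITIALIZED;
};
```

Replaying the official sequence on this model produces two CPU-to-GPU copies and two GPU-to-CPU copies, at exactly the calls the comments point out. The design point is that a read accessor only synchronizes, while a mutable accessor also marks the other side as stale.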

Then there is the Update() function, which updates the data:

void Update();

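This function reads the gradient and uses it to update the data; since it calls axpy with a coefficient of -1, the update is data = data - diff.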
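
As a quick illustration of what Update() computes on the CPU path, here is a sketch of the axpy operation (y = alpha * x + y) written as a plain loop. Axpy and UpdateSketch are hypothetical helper names; the real code calls caffe_axpy, which I am not reproducing here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// y <- alpha * x + y, element by element (a plain-loop stand-in for axpy).
inline void Axpy(float alpha, const std::vector<float>& x,
                 std::vector<float>& y) {
  assert(x.size() == y.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    y[i] += alpha * x[i];
  }
}

// The update step with alpha = -1: data <- data - diff.
inline void UpdateSketch(std::vector<float>& data,
                         const std::vector<float>& diff) {
  Axpy(-1.0f, diff, data);
}
```

Note that no learning rate appears in this step. In the listing above Update() passes Dtype(-1) directly, so any scaling of the gradient has to be applied to diff before Update() is called.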

There are also functions for exchanging data between a Blob and a proto:

void FromProto(const BlobProto& proto, bool reshape = true);
void ToProto(BlobProto* proto, bool write_diff = false) const;

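Finally, there are some utility functions: CopyFrom copies data from another Blob; ShapeEquals compares the Blob's shape with the shape recorded in a BlobProto; ShareData and ShareDiff share data and gradients between Blobs; asum_data, asum_diff, sumsq_data, and sumsq_diff compute the L1 norm and the squared L2 norm of the data and the gradient; and scale_data and scale_diff multiply the data and the gradient by a constant factor.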
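
The norm and scaling helpers are thin wrappers around three simple reductions. The sketch below shows them as plain loops over a std::vector<float>; Asum, Sumsq, and Scale are hypothetical names standing in for the caffe_cpu_asum, caffe_cpu_dot, and caffe_scal calls seen in blob.cpp.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sum of absolute values: the L1 norm, as returned by asum_data/asum_diff.
inline float Asum(const std::vector<float>& v) {
  float sum = 0.0f;
  for (float x : v) sum += std::fabs(x);
  return sum;
}

// Dot product of v with itself: the squared L2 norm, as returned by
// sumsq_data/sumsq_diff (no square root is taken).
inline float Sumsq(const std::vector<float>& v) {
  float sum = 0.0f;
  for (float x : v) sum += x * x;
  return sum;
}

// In-place multiplication by a constant, as done by scale_data/scale_diff.
inline void Scale(float factor, std::vector<float>& v) {
  for (float& x : v) x *= factor;
}
```

For the vector (3, -4), Asum gives 7 and Sumsq gives 25; after Scale(0.5, v) the vector is (1.5, -2) and Sumsq gives 6.25.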
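In short, Blob is the most important class in Caffe's low-level data handling. Once Blob is understood, working through the higher-level Layer and Net becomes much easier!
Finally, here are the blogs I consulted while studying Blob. Thanks to their authors for the guidance!
楼燚航's blog: http://www.cnblogs.com/louyihang-loves-baiyan/
junmuzi's blog: http://blog.csdn.net/junmuzi
You are welcome to read my upcoming posts on the Caffe source code. Readers' support and encouragement are my greatest motivation!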

written by jiong
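Aim for the highest and you reach the middle; aim for the middle and you reach the lowest; aim for the lowest and you get nothing.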
