Blob: the Data Container in Caffe

Latest recommended article published on 2023-07-17 15:03:40

lzcn

Column: Caffe Source Code Reading Notes

Article link: https://blog.csdn.net/u014223635/article/details/78994926

Introduction

Consider a typical deep learning setting. Suppose we have $N$ training samples
$(\boldsymbol{x}^{(i)},y^{(i)})$, $i = 1,\ldots,N$.
The objective function $J(\boldsymbol{\theta})$ measures the model's average loss over all samples:

$J(\boldsymbol{\theta}) = \frac{1}{N}\sum_{i=1}^{N} L((\boldsymbol{x}^{(i)},y^{(i)}),\boldsymbol{\theta})$

Because the training set is usually large, the objective is typically optimized with minibatch stochastic gradient descent. A minibatch is a subset of the training samples of size $m$:

$\mathbb{B}=\{\boldsymbol{x}^{(1)},\ldots,\boldsymbol{x}^{(m)}\}$

The gradient computed on the minibatch then serves as an estimate of the gradient of the objective:

$\boldsymbol{g} = \frac{1}{m}\sum_{i=1}^{m} \nabla_{\boldsymbol{\theta}}L((\boldsymbol{x}^{(i)},y^{(i)}),\boldsymbol{\theta})$

The parameters are then updated with step size $\alpha$:

$\boldsymbol{\theta} \leftarrow \boldsymbol{\theta} - \alpha\boldsymbol{g}$
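
To make the update rule concrete, here is a minimal sketch of one minibatch SGD step. The one-parameter linear model, the squared loss, and the names Sample and sgd_step are assumptions chosen for illustration; none of them come from Caffe.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical sample type: scalar input x and scalar target y.
struct Sample {
  double x;
  double y;
};

// One SGD step for the model y_hat = theta * x with the loss
// L = 0.5 * (theta * x - y)^2, so dL/dtheta = (theta * x - y) * x.
inline double sgd_step(double theta, const std::vector<Sample>& batch,
                       double alpha) {
  assert(!batch.empty());
  double g = 0.0;
  for (const Sample& s : batch) {
    g += (theta * s.x - s.y) * s.x;  // per-sample gradient
  }
  g /= static_cast<double>(batch.size());  // g = (1/m) * sum of gradients
  return theta - alpha * g;                // theta <- theta - alpha * g
}
```

In Caffe the same three quantities (parameters, gradient, update) live in Blob objects rather than in plain doubles.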

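For each piece of data $\boldsymbol{x}$, we may also need to store its gradient $d\boldsymbol{x}$ for backpropagation. In addition, the data has to be available in GPU memory so that computation can run in parallel. This holds whether the data is the intermediate output of a layer or the parameters of a layer.

Blob is Caffe's abstraction for this kind of data. Among its class members a Blob stores two kinds of data: data_, the values themselves (parameters or activations), and diff_, the gradients. In practice a Blob can hold a single data point or a whole batch.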

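Blob's implementation in Caffe

As an abstraction over data, Blob has to abstract the structure of the data, for example by treating it as a multi-dimensional tensor with two kinds of data per object. It also has to abstract the data transfer between GPU and CPU and wrap the corresponding interface functions.

Blob's data members

Blob stores its data through the class SyncedMemory, which provides a unified interface for synchronizing data between CPU and GPU: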

class SyncedMemory {
 public:
  // Read-only access to the data held on the CPU/GPU
  const void* cpu_data();
  const void* gpu_data();
  // Mutable access to the data held on the CPU/GPU
  void* mutable_cpu_data();
  void* mutable_gpu_data();
  void set_cpu_data(void* data);
  void set_gpu_data(void* data);
  size_t size() { return size_; }
};

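Because SyncedMemory handles the data transfer between GPU and CPU, it serves as the basic building block of a Blob object:

 protected:
  shared_ptr<SyncedMemory> data_;
  shared_ptr<SyncedMemory> diff_;
  shared_ptr<SyncedMemory> shape_data_;
  vector<int> shape_;
  // Number of elements currently in use
  int count_;
  // Number of elements the allocated memory can hold
  int capacity_;

The Blob class exposes interfaces to both the data and the gradients, and keeps track of the number of elements currently in use as well as the maximum number it can hold.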

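Class interface

Blob disables the copy constructor and the assignment operator, as do most classes in Caffe. Instead it provides its own copy function, along with the following basic interface: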

template <typename Dtype>
class Blob {
 public:
  Blob() : data_(), diff_(), count_(0), capacity_(0) {}
  explicit Blob(const vector<int>& shape);
  // Reshape
  void Reshape(const vector<int>& shape);
  void Reshape(const BlobShape& shape);
  void ReshapeLike(const Blob& other);
  // Shape information
  inline string shape_string() const;
  inline const vector<int>& shape() const;
  inline int shape(int index) const;
  inline int num_axes() const;
  inline int count() const;
  inline int count(int start_axis, int end_axis) const;
  inline int count(int start_axis) const;
  inline int CanonicalAxisIndex(int axis_index) const;
  inline int LegacyShape(int index) const;
  // Copying between objects
  void CopyFrom(const Blob<Dtype>& source, bool copy_diff = false,
      bool reshape = false);
  // Reading data
  inline int offset(const int n, const int c = 0,
    const int h = 0, const int w = 0) const;
  inline int offset(const vector<int>& indices) const;
  inline Dtype data_at(const int n, const int c, const int h,
      const int w) const;
  inline Dtype diff_at(const int n, const int c, const int h,
      const int w) const;
  inline Dtype data_at(const vector<int>& index) const;
  inline Dtype diff_at(const vector<int>& index) const;
  inline const shared_ptr<SyncedMemory>& data() const;
  inline const shared_ptr<SyncedMemory>& diff() const;
  const Dtype* cpu_data() const;
  void set_cpu_data(Dtype* data);
  const int* gpu_shape() const;
  const Dtype* gpu_data() const;
  const Dtype* cpu_diff() const;
  const Dtype* gpu_diff() const;
  Dtype* mutable_cpu_data();
  Dtype* mutable_gpu_data();
  Dtype* mutable_cpu_diff();
  Dtype* mutable_gpu_diff();
  // Parameter update
  void Update();
  // Reading from and writing to protobuf
  void FromProto(const BlobProto& proto, bool reshape = true);
  void ToProto(BlobProto* proto, bool write_diff = false) const;
  // Simple math operations
  Dtype asum_data() const;
  Dtype asum_diff() const;
  Dtype sumsq_data() const;
  Dtype sumsq_diff() const;
  void scale_data(Dtype scale_factor);
  void scale_diff(Dtype scale_factor);
  // Data sharing
  void ShareData(const Blob& other);
  void ShareDiff(const Blob& other);
  // Compare this Blob's shape with the shape stored in a BlobProto
  bool ShapeEquals(const BlobProto& other);
};

Constructors

Since Blob disables the copy constructor, a Blob is created either empty or from a given shape.

template <typename Dtype>
class Blob {
 public:
  Blob() : data_(), diff_(), count_(0), capacity_(0) {}
  explicit Blob(const vector<int>& shape);
};
// Implementation of the constructor
template <typename Dtype>
Blob<Dtype>::Blob(const vector<int>& shape) : capacity_(0) {
  Reshape(shape);
}

Apart from initializing the data members, constructing a Blob does nothing but call Reshape.

Reshape

Reshape changes the current Blob to the given shape.

template <typename Dtype>
void Blob<Dtype>::Reshape(const vector<int>& shape) {
  // The header defines const int kMaxBlobAxes = 32;
  CHECK_LE(shape.size(), kMaxBlobAxes);
  count_ = 1;
  shape_.resize(shape.size());
  // (Re)allocate shape_data_ if it is uninitialized or too small
  if (!shape_data_ || shape_data_->size() < shape.size() * sizeof(int)) {
    shape_data_.reset(new SyncedMemory(shape.size() * sizeof(int)));
  }
  // Update shape_data_; writing one side (here the CPU side) is enough
  int* shape_data = static_cast<int*>(shape_data_->mutable_cpu_data());
  for (int i = 0; i < shape.size(); ++i) {
    CHECK_GE(shape[i], 0);
    if (count_ != 0) {
      CHECK_LE(shape[i], INT_MAX / count_) << "blob size exceeds INT_MAX";
    }
    count_ *= shape[i];
    shape_[i] = shape[i];
    shape_data[i] = shape[i];
  }
  // If capacity_ is smaller than the required size, reallocate data_ and diff_;
  // reset releases the previous data_ and diff_ automatically.
  if (count_ > capacity_) {
    capacity_ = count_;
    data_.reset(new SyncedMemory(capacity_ * sizeof(Dtype)));
    diff_.reset(new SyncedMemory(capacity_ * sizeof(Dtype)));
  }
}

Note: shared_ptr<T> is the smart pointer template from the Boost library.
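
The interplay between count_ and capacity_ in Reshape can be tried out with a small stand-alone sketch. ToyReshapeBlob is a hypothetical class that keeps only the bookkeeping: it uses std::shared_ptr and a std::vector<float> in place of boost::shared_ptr and SyncedMemory, and assert in place of Caffe's CHECK macros.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <memory>
#include <vector>

// Toy blob (not Caffe code) that mirrors only the count_/capacity_ logic.
class ToyReshapeBlob {
 public:
  ToyReshapeBlob() : count_(0), capacity_(0) {}

  void Reshape(const std::vector<int>& shape) {
    count_ = 1;
    shape_ = shape;
    for (std::size_t i = 0; i < shape.size(); ++i) {
      assert(shape[i] >= 0);
      if (count_ != 0) {
        assert(shape[i] <= INT_MAX / count_);  // guard against int overflow
      }
      count_ *= shape[i];
    }
    // Reallocate only when the new element count exceeds the capacity.
    if (count_ > capacity_) {
      capacity_ = count_;
      data_.reset(
          new std::vector<float>(static_cast<std::size_t>(capacity_), 0.0f));
    }
  }

  int count() const { return count_; }
  int capacity() const { return capacity_; }
  // Address of the buffer, exposed only to observe reallocation.
  const float* raw() const { return data_ ? data_->data() : nullptr; }

 private:
  std::shared_ptr<std::vector<float> > data_;
  std::vector<int> shape_;
  int count_;
  int capacity_;
};
```

Reshaping to a smaller shape lowers count() but leaves capacity() and the buffer untouched; only growing past the capacity triggers a new allocation.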

Reshaping from a BlobShape

The class BlobShape is defined in caffe.proto:

package caffe;
message BlobShape {
  repeated int64 dim = 1 [packed = true];
}

BlobShape can therefore also be passed to Reshape:

template <typename Dtype>
void Blob<Dtype>::Reshape(const BlobShape& shape) {
  CHECK_LE(shape.dim_size(), kMaxBlobAxes);
  vector<int> shape_vec(shape.dim_size());
  for (int i = 0; i < shape.dim_size(); ++i) {
    shape_vec[i] = shape.dim(i);
  }
  Reshape(shape_vec);
}

Reshaping from another Blob

ReshapeLike reshapes the Blob using the shape_ of another Blob object.

template <typename Dtype>
void Blob<Dtype>::ReshapeLike(const Blob<Dtype>& other) {
  Reshape(other.shape());
}

Printing the shape

Blob provides a simple inline function that returns the shape as a string:

public:
  inline string shape_string() const {
    ostringstream stream;
    for (int i = 0; i < shape_.size(); ++i) {
      stream << shape_[i] << " ";
    }
    stream << "(" << count_ << ")";
    return stream.str();
  }




Accessing Blob members

Accessing the shape

public:
  // Return the shape_ member
  inline const vector<int>& shape() const { return shape_; }
  // Return shape_[index]; index may lie in [-shape_.size(), shape_.size()-1]
  inline int shape(int index) const { return shape_[CanonicalAxisIndex(index)]; }
  // Return the number of axes, i.e. shape_.size()
  inline int num_axes() const { return shape_.size(); }
  inline int CanonicalAxisIndex(int axis_index) const {
    CHECK_GE(axis_index, -num_axes())
        << "axis " << axis_index << " out of range for " << num_axes()
        << "-D Blob with shape " << shape_string();
    CHECK_LT(axis_index, num_axes())
        << "axis " << axis_index << " out of range for " << num_axes()
        << "-D Blob with shape " << shape_string();
    if (axis_index < 0) {
      return axis_index + num_axes();
    }
    return axis_index;
  }
  // Return the shape information stored on the GPU
  const int* gpu_shape() const;

// Out-of-class definition of gpu_shape
template <typename Dtype>
const int* Blob<Dtype>::gpu_shape() const {
  CHECK(shape_data_);
  return (const int*)shape_data_->gpu_data();
}




CanonicalAxisIndex(index) maps an index in [-shape_.size(), shape_.size()-1] back to the range [0, shape_.size()-1], so a negative index counts from the last axis.
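
As a quick check of that mapping, the arithmetic can be isolated in a free function. canonical_axis is a hypothetical helper written for this post, with assert standing in for the CHECK macros.

```cpp
#include <cassert>

// Map an axis index in [-num_axes, num_axes - 1] to [0, num_axes - 1],
// following the same arithmetic as CanonicalAxisIndex.
inline int canonical_axis(int axis_index, int num_axes) {
  assert(axis_index >= -num_axes);
  assert(axis_index < num_axes);
  return axis_index < 0 ? axis_index + num_axes : axis_index;
}
```

For a 4-axis Blob, index -1 refers to the last axis (3) and index -4 to the first axis (0).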

ShapeEquals checks whether two Blobs have the same shape, where the other blob is given as a BlobProto. It is typically used to compare against a blob stored in protobuf.

bool ShapeEquals(const BlobProto& other);

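Element counts

The volume of a Blob (its total number of elements) is recorded in the count_ member. Several member functions expose information derived from it:
the total count_ and the product of the dimensions over a given range of axes.

public:
  // Return the total number of elements in the Blob.
  inline int count() const { return count_; }
  // Return the product of the dimensions from start_axis (inclusive) to end_axis (exclusive)
  inline int count(int start_axis, int end_axis) const {
    // Range checks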
    CHECK_LE(start_axis, end_axis);
    CHECK_GE(start_axis, 0);
    CHECK_GE(end_axis, 0);
    CHECK_LE(start_axis, num_axes());
    CHECK_LE(end_axis, num_axes());
    int count = 1;
    for (int i = start_axis; i < end_axis; ++i) {
      count *= shape(i);
    }
    return count;
  }
  // Return the product of the dimensions from start_axis to the last axis
  inline int count(int start_axis) const {
    return count(start_axis, num_axes());
  }
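
A small stand-alone version of the ranged count makes the semantics easy to verify. count_range is a hypothetical free function operating on a plain std::vector<int> shape; it is a sketch of the same loop, not Caffe's member function.

```cpp
#include <cassert>
#include <vector>

// Product of shape[start_axis .. end_axis - 1]; an empty range yields 1.
inline int count_range(const std::vector<int>& shape, int start_axis,
                       int end_axis) {
  const int num_axes = static_cast<int>(shape.size());
  assert(0 <= start_axis);
  assert(start_axis <= end_axis);
  assert(end_axis <= num_axes);
  int count = 1;
  for (int i = start_axis; i < end_axis; ++i) {
    count *= shape[i];
  }
  return count;
}
```

For the shape (2, 3, 4, 5), the range [1, 3) gives 3 * 4 = 12, and the range [1, 4) gives the number of elements per sample, 60.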




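Accessing the data_ and diff_ data

Blob provides a uniform interface for accessing individual elements. In the common case the data is stored as a 4D tensor of size:

${\rm number\ }N\times{\rm channel\ }K\times {\rm height\ }H\times {\rm width\ }W$

The data is stored in row-major order, so the element at $(n,k,h,w)$ is located at offset:

$((n\times K +k)\times H +h )\times W +w$

For data laid out this way, the first thing needed is a function that computes the memory offset of a given coordinate.

Computing the offset
The offset is the basis for accessing the data at a particular position. The coordinates can be given either as separate $n,k,h,w$ values or as a vector: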

public:
  inline int offset(const int n, const int c = 0, const int h = 0,const int w = 0) const {
    // Range checks
    CHECK_GE(n, 0);
    CHECK_LE(n, num());
    CHECK_GE(channels(), 0);
    CHECK_LE(c, channels());
    CHECK_GE(height(), 0);
    CHECK_LE(h, height());
    CHECK_GE(width(), 0);
    CHECK_LE(w, width());
    return ((n * channels() + c) * height() + h) * width() + w;
  }
  inline int offset(const vector<int>& indices) const {
    // Range checks
    CHECK_LE(indices.size(), num_axes());
    int offset = 0;
    for (int i = 0; i < num_axes(); ++i) {
      offset *= shape(i);
      if (indices.size() > i) {
        // Range checks
        CHECK_GE(indices[i], 0);
        CHECK_LT(indices[i], shape(i));
        offset += indices[i];
      }
    }
    return offset;
  }
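
The row-major formula can be checked by hand with a stand-alone version of the vector overload. offset_of is a hypothetical free function that takes the shape explicitly; it follows the same loop as the code above, with assert replacing the CHECK macros.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major offset into a tensor of the given shape. Trailing indices
// that are not supplied are treated as 0.
inline int offset_of(const std::vector<int>& shape,
                     const std::vector<int>& indices) {
  assert(indices.size() <= shape.size());
  int offset = 0;
  for (std::size_t i = 0; i < shape.size(); ++i) {
    offset *= shape[i];
    if (i < indices.size()) {
      assert(indices[i] >= 0);
      assert(indices[i] < shape[i]);
      offset += indices[i];
    }
  }
  return offset;
}
```

With $N=2$, $K=3$, $H=4$, $W=5$, the coordinate $(1,2,3,4)$ maps to $((1\times 3+2)\times 4+3)\times 5+4=119$, the last of the 120 elements. Passing only $(1,2)$ gives 100, the start of that channel.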




Accessing data_ and diff_
data_ and diff_ are of type shared_ptr<SyncedMemory>, which manages the allocation and synchronization of data on the CPU and GPU. Data on either device is therefore accessed through the interface that SyncedMemory provides.

public:
  // Return data_
  inline const shared_ptr<SyncedMemory>& data() const {
    CHECK(data_);
    return data_;
  }
  // Return diff_
  inline const shared_ptr<SyncedMemory>& diff() const {
    CHECK(diff_);
    return diff_;
  }




Note: SyncedMemory is a Caffe class that keeps data synchronized between the CPU and the GPU. Before returning, data_->cpu_data() first synchronizes with the GPU copy and then returns the CPU data; see the separate summary of SyncedMemory for details.
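
The lazy synchronization described in the note can be sketched without a GPU. The class below is a toy model, not Caffe's SyncedMemory: two host vectors stand in for the CPU and GPU buffers, and the names ToySyncedMemory and copies() are made up for illustration. Only the head-state names follow the ones used in the Blob code in this post.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for SyncedMemory that simulates both devices on the host.
class ToySyncedMemory {
 public:
  enum Head { UNINITIALIZED, HEAD_AT_CPU, HEAD_AT_GPU, SYNCED };

  explicit ToySyncedMemory(std::size_t n)
      : cpu_(n, 0.0f), gpu_(n, 0.0f), head_(UNINITIALIZED), copies_(0) {}

  // Read-only access: copy from the other side only if it is newer.
  const float* cpu_data() { to_cpu(); return cpu_.data(); }
  const float* gpu_data() { to_gpu(); return gpu_.data(); }

  // Mutable access: sync first, then mark this side as the newest copy.
  float* mutable_cpu_data() { to_cpu(); head_ = HEAD_AT_CPU; return cpu_.data(); }
  float* mutable_gpu_data() { to_gpu(); head_ = HEAD_AT_GPU; return gpu_.data(); }

  Head head() const { return head_; }
  int copies() const { return copies_; }  // number of simulated transfers

 private:
  void to_cpu() {
    if (head_ == UNINITIALIZED) {
      head_ = HEAD_AT_CPU;
    } else if (head_ == HEAD_AT_GPU) {
      cpu_ = gpu_;  // simulated device-to-host copy
      ++copies_;
      head_ = SYNCED;
    }
  }
  void to_gpu() {
    if (head_ == UNINITIALIZED) {
      head_ = HEAD_AT_GPU;
    } else if (head_ == HEAD_AT_CPU) {
      gpu_ = cpu_;  // simulated host-to-device copy
      ++copies_;
      head_ = SYNCED;
    }
  }

  std::vector<float> cpu_;
  std::vector<float> gpu_;
  Head head_;
  int copies_;
};
```

The point of the design is that a transfer happens only when one side is read after the other side was written; repeated reads in the SYNCED state cost nothing.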

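Through these two functions Blob gains access to the data on both the CPU and the GPU. Blob exposes the data in two ways: as constant data and as mutable data. Constant data is accessed through:
- const Dtype* cpu_data() const;
- const Dtype* cpu_diff() const;
- const Dtype* gpu_data() const;
- const Dtype* gpu_diff() const;

Mutable data is accessed through:
- Dtype* mutable_cpu_data();
- Dtype* mutable_cpu_diff();
- Dtype* mutable_gpu_data();
- Dtype* mutable_gpu_diff();

The implementation of the constant accessors for data_ is given below; the accessors for the gradient are analogous: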

template <typename Dtype>
const Dtype* Blob<Dtype>::cpu_data() const {
  CHECK(data_);
  return (const Dtype*)data_->cpu_data();
}
template <typename Dtype>
const Dtype* Blob<Dtype>::gpu_data() const {
  CHECK(data_);
  return (const Dtype*)data_->gpu_data();
}




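Note: To make dynamic memory easier to use, C++11 provides smart pointers such as shared_ptr and unique_ptr. The former lets multiple pointers refer to the same object, while the latter owns its object exclusively. A smart pointer behaves like an ordinary pointer, except that it releases the object automatically, so no manual free or delete is needed.

Mutable access works the same way as constant access: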

Dtype* Blob<Dtype>::mutable_cpu_data() {
  CHECK(data_);
  return static_cast<Dtype*>(data_->mutable_cpu_data());
}
Dtype* Blob<Dtype>::mutable_gpu_data() {
  CHECK(data_);
  return static_cast<Dtype*>(data_->mutable_gpu_data());
}




Accessing data_ and diff_ at a given position
Together with offset, Blob provides access to the element of data_ and diff_ at a given position. Again, the coordinates can be given either as separate $n,k,h,w$ values or as a vector.

// Return data_[offset] at cpu
inline Dtype data_at(const vector<int>& index) const {
  return cpu_data()[offset(index)];
}
inline Dtype data_at(const int n, const int c, const int h, const int w) const {
  return cpu_data()[offset(n, c, h, w)];
}
// Return diff_[offset] at cpu
inline Dtype diff_at(const int n, const int c, const int h, const int w) const {
  return cpu_diff()[offset(n, c, h, w)];
}
inline Dtype diff_at(const vector<int>& index) const {
  return cpu_diff()[offset(index)];
}




Modifying data_

data_ holds both network parameters and training data, so a member function is needed to assign and update it.

public:
  void set_cpu_data(Dtype* data);




Once the data is set on the CPU side, the GPU side is brought up to date by the SyncedMemory that manages the data. In CUDA programming, data on the GPU is likewise copied over from the CPU.

// set data_ to CPU
template <typename Dtype>
void Blob<Dtype>::set_cpu_data(Dtype* data) {
  CHECK(data);
  data_->set_cpu_data(data);
}




Copy function

Blob provides a member function for copying from another Blob object:

public:
  void CopyFrom(const Blob<Dtype>& source, bool copy_diff = false, bool reshape = false);




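copy_diff selects which data is copied: false copies only data_, true copies only diff_.
If reshape is false, the destination Blob must already have the same shape as the source; otherwise the call aborts with a fatal error.
If reshape is true, the destination is reshaped to the source's shape when necessary.
The copy is a deep copy: the destination keeps its own memory and the values are copied into it, rather than simply pointing at the other Blob's data.
See the implementation below: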

template <typename Dtype>
void Blob<Dtype>::CopyFrom(const Blob& source, bool copy_diff, bool reshape) {
  // If the shapes differ, a reshape is required
  if (source.count() != count_ || source.shape() != shape_) {
    if (reshape) {
      ReshapeLike(source);
    } else {
      LOG(FATAL) << "Trying to copy blobs of different sizes.";
    }
  }
  switch (Caffe::mode()) {
  case Caffe::GPU:
    if (copy_diff) {
      caffe_copy(count_, source.gpu_diff(),
          static_cast<Dtype*>(diff_->mutable_gpu_data()));
    } else {
      caffe_copy(count_, source.gpu_data(),
          static_cast<Dtype*>(data_->mutable_gpu_data()));
    }
    break;
  case Caffe::CPU:
    if (copy_diff) {
      caffe_copy(count_, source.cpu_diff(),
          static_cast<Dtype*>(diff_->mutable_cpu_data()));
    } else {
      caffe_copy(count_, source.cpu_data(),
          static_cast<Dtype*>(data_->mutable_cpu_data()));
    }
    break;
  default:
    LOG(FATAL) << "Unknown caffe mode.";
  }
}




Sharing with another Blob

Besides copying data from another Blob with CopyFrom, Blob also supports sharing its data or its gradients with another Blob.

public:
  void ShareData(const Blob& other);
  void ShareDiff(const Blob& other);




Sharing diff_ or data_ relies on the properties of shared_ptr: both Blobs simply point at the same object.

public:
  void Blob<Dtype>::ShareData(const Blob& other) {
    CHECK_EQ(count_, other.count());
    data_ = other.data();
  }
  void Blob<Dtype>::ShareDiff(const Blob& other) {
    CHECK_EQ(count_, other.count());
    diff_ = other.diff();
  }
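
The difference between a deep copy and sharing is easy to observe on a reduced model. ToySharedBlob below is a hypothetical struct whose data_ is a std::shared_ptr to a std::vector<float>; it is a sketch of the ownership behaviour only, with std::shared_ptr standing in for boost::shared_ptr and the vector standing in for SyncedMemory.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy blob (not Caffe code) with a single shared buffer.
struct ToySharedBlob {
  std::shared_ptr<std::vector<float> > data_;

  explicit ToySharedBlob(std::size_t n)
      : data_(new std::vector<float>(n, 0.0f)) {}

  // Deep copy: values are copied into this blob's own buffer.
  void CopyFrom(const ToySharedBlob& source) {
    assert(data_->size() == source.data_->size());
    *data_ = *source.data_;
  }

  // Sharing: afterwards both blobs point at the same buffer.
  void ShareData(const ToySharedBlob& other) {
    assert(data_->size() == other.data_->size());
    data_ = other.data_;
  }
};
```

After CopyFrom, later writes to the source are not visible in the copy; after ShareData they are, because there is only one buffer and its reference count is 2.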




Updating the data

If a Blob holds network parameters that are being learned, Update can be called to update the parameters from the gradient:

$\boldsymbol{\theta} \leftarrow \boldsymbol{\theta} - \epsilon\boldsymbol{g}$

Since Blob has no data member related to the learning rate, $\epsilon\boldsymbol{g}$ has to be computed beforehand and stored in diff_.

public:
  void Update();
  void Blob<Dtype>::Update() {
    // We will perform update based on where the data is located.
    switch (data_->head()) {
    case SyncedMemory::HEAD_AT_CPU:
      // perform computation on CPU
      caffe_axpy<Dtype>(count_, Dtype(-1),
          static_cast<const Dtype*>(diff_->cpu_data()),
          static_cast<Dtype*>(data_->mutable_cpu_data()));
      break;
    case SyncedMemory::HEAD_AT_GPU:
    case SyncedMemory::SYNCED:
  #ifndef CPU_ONLY
      // perform computation on GPU
      // (the CPU copy is synced lazily on its next access)
      caffe_gpu_axpy<Dtype>(count_, Dtype(-1),
          static_cast<const Dtype*>(diff_->gpu_data()),
          static_cast<Dtype*>(data_->mutable_gpu_data()));
  #else
      NO_GPU;
  #endif
      break;
    default:
      LOG(FATAL) << "Syncedmem not initialized.";
    }
  }




Here caffe_axpy and caffe_gpu_axpy compute:

$\boldsymbol{y} = \alpha\boldsymbol{x} + \boldsymbol{y}$

In Update, $\alpha=-1$, $\boldsymbol{x}$ is diff_ and $\boldsymbol{y}$ is data_, so the call subtracts diff_ from data_ in place.
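
A plain C++ reference version shows how the axpy operation turns into the parameter update. axpy and update are hypothetical helpers written for this post; they operate on std::vector<float> rather than on Caffe's buffers and say nothing about how caffe_axpy is implemented internally.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Reference axpy: y <- alpha * x + y, element by element.
inline void axpy(float alpha, const std::vector<float>& x,
                 std::vector<float>* y) {
  assert(x.size() == y->size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    (*y)[i] += alpha * x[i];
  }
}

// The update in miniature: diff must already hold (learning rate * gradient),
// so the update itself is axpy with alpha = -1.
inline void update(std::vector<float>* data, const std::vector<float>& diff) {
  axpy(-1.0f, diff, data);
}
```

Keeping the learning rate out of the update is what lets the update stay a single axpy call: whoever fills diff decides how the gradient is scaled.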

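Simple math operations

Blob provides a few simple operations on its own data.

p-Norm

Blob has member functions that compute the $l_1$ norm and the squared $l_2$ norm of data_ and diff_: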

  // Compute the sum of absolute values (L1 norm) of the data.
  Dtype asum_data() const;
  // Compute the sum of absolute values (L1 norm) of the diff.
  Dtype asum_diff() const;
  // Compute the sum of squares (L2 norm squared) of the data.
  Dtype sumsq_data() const;
  // Compute the sum of squares (L2 norm squared) of the diff.
  Dtype sumsq_diff() const;




These norm functions are implemented only for the float and double versions of Blob.

Dtype Blob<Dtype>::asum_data() const {
  if (!data_) { return 0; }
  switch (data_->head()) {
  case SyncedMemory::HEAD_AT_CPU:
    return caffe_cpu_asum(count_, cpu_data());
  case SyncedMemory::HEAD_AT_GPU:
  case SyncedMemory::SYNCED:
#ifndef CPU_ONLY
  {
    Dtype asum;
    caffe_gpu_asum(count_, gpu_data(), &asum);
    return asum;
  }
#else
    NO_GPU;
#endif
  case SyncedMemory::UNINITIALIZED:
    return 0;
  default:
    LOG(FATAL) << "Unknown SyncedMemory head state: " << data_->head();
  }
  return 0;
}

Dtype Blob<Dtype>::sumsq_data() const {
  Dtype sumsq;
  const Dtype* data;
  if (!data_) { return 0; }
  switch (data_->head()) {
  case SyncedMemory::HEAD_AT_CPU:
    data = cpu_data();
    sumsq = caffe_cpu_dot(count_, data, data);
    break;
  case SyncedMemory::HEAD_AT_GPU:
  case SyncedMemory::SYNCED:
#ifndef CPU_ONLY
    data = gpu_data();
    caffe_gpu_dot(count_, data, data, &sumsq);
#else
    NO_GPU;
#endif
    break;
  case SyncedMemory::UNINITIALIZED:
    return 0;
  default:
    LOG(FATAL) << "Unknown SyncedMemory head state: " << data_->head();
  }
  return sumsq;
}




The L1 norm relies on caffe_cpu_asum and caffe_gpu_asum; the squared L2 norm relies on caffe_cpu_dot and caffe_gpu_dot.
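
The two quantities can be written out in a few lines of plain C++. asum, dot and sumsq below are hypothetical reference functions over std::vector<double>; they illustrate what is computed and are not the Caffe math routines themselves.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Sum of absolute values, i.e. the L1 norm.
inline double asum(const std::vector<double>& x) {
  double s = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    s += std::fabs(x[i]);
  }
  return s;
}

// Dot product of two vectors of equal length.
inline double dot(const std::vector<double>& x, const std::vector<double>& y) {
  assert(x.size() == y.size());
  double s = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    s += x[i] * y[i];
  }
  return s;
}

// Sum of squares as dot(x, x): the squared L2 norm, not the norm itself.
inline double sumsq(const std::vector<double>& x) { return dot(x, x); }
```

Note that sumsq returns the squared norm; a caller that needs the L2 norm itself has to take the square root.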

Scaling the data

public:
  // Scale the blob data by a constant factor.
  void scale_data(Dtype scale_factor);
  // Scale the blob diff by a constant factor.
  void scale_diff(Dtype scale_factor);
  void Blob<Dtype>::scale_data(Dtype scale_factor) {
    Dtype* data;
    if (!data_) { return; }
    switch (data_->head()) {
    case SyncedMemory::HEAD_AT_CPU:
      data = mutable_cpu_data();
      caffe_scal(count_, scale_factor, data);
      return;
    case SyncedMemory::HEAD_AT_GPU:
    case SyncedMemory::SYNCED:
  #ifndef CPU_ONLY
      data = mutable_gpu_data();
      caffe_gpu_scal(count_, scale_factor, data);
      return;
  #else
      NO_GPU;
  #endif
    case SyncedMemory::UNINITIALIZED:
      return;
    default:
      LOG(FATAL) << "Unknown SyncedMemory head state: " << data_->head();
    }
  }




Reading and writing Blob objects from protobuf

public:
  // Copy a blob from protobuf
  void FromProto(const BlobProto& proto, bool reshape = true);
  // Write the blob into protobuf
  void ToProto(BlobProto* proto, bool write_diff = false) const;