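In part 1 of this Caffe source-code series, we saw that caffe.cpp reads the user-defined solver.prototxt file to train a network. The training entry point is the train() function, which in turn calls Solve() to optimize the network parameters. Logically, the next file to analyze would be solver.cpp, but it is not that simple. Open solver.cpp and you will find that it depends on Net; Net in turn depends on Layer, and Layer depends on Blob. So we should start with Blob to understand Caffe's underlying data structure, then move up to Layer, and then to Net and Solver. Climbing layer by layer is the way to understand Caffe as a whole.
Here is a link first: https://www.zhihu.com/question/27982282. Having read the answers there, I find that reading the Caffe source in the order "blob->layer->net->solver->putting it together->other features" is an efficient way to learn Caffe, and I recommend that readers follow the same order.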
Let's start with blob.hpp and blob.cpp!
As usual, the annotated code comes first.
First, blob.hpp:
#ifndef CAFFE_BLOB_HPP_
#define CAFFE_BLOB_HPP_
#include <algorithm>
#include <string>
#include <vector>
#include "caffe/common.hpp"
#include "caffe/proto/caffe.pb.h"
#include "caffe/syncedmem.hpp"
const int kMaxBlobAxes = 32;
namespace caffe {
/**
* @brief A wrapper around SyncedMemory holders serving as the basic
* computational unit through which Layer%s, Net%s, and Solver%s
* interact.
*
* TODO(dox): more thorough description.
*/
template <typename Dtype>
class Blob {
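/* The Blob class starts with a default constructor that initializes its basic members; the four members initialized here (data_, diff_, count_, capacity_) are declared in the protected section. */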
public:
Blob()
: data_(), diff_(), count_(0), capacity_(0) {}
/// @brief Deprecated; use <code>Blob(const vector<int>& shape)</code>.
explicit Blob(const int num, const int channels, const int height,
const int width);// Initialize the Blob from four integers (num, channels, height, width)
explicit Blob(const vector<int>& shape);// Initialize the Blob from a shape vector
/// @brief Deprecated; use <code>Reshape(const vector<int>& shape)</code>.
void Reshape(const int num, const int channels, const int height,
const int width);// Change the Blob's shape using four integers (num, channels, height, width)
/**
* @brief Change the dimensions of the blob, allocating new memory if
* necessary.
*
* This function can be called both to create an initial allocation
* of memory, and to adjust the dimensions of a top blob during Layer::Reshape
* or Layer::Forward. When changing the size of blob, memory will only be
* reallocated if sufficient memory does not already exist, and excess memory
* will never be freed.
*
* Note that reshaping an input blob and immediately calling Net::Backward is
* an error; either Net::Forward or Net::Reshape need to be called to
* propagate the new input shape to higher layers.
*/
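void Reshape(const vector<int>& shape);// Change the Blob's shape using a shape vector
void Reshape(const BlobShape& shape);// Change the Blob's shape using a BlobShape message
void ReshapeLike(const Blob& other);// Reshape this Blob to the shape of other (no data is copied)
inline string shape_string() const {// Inline function that returns the Blob's shape as a string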
ostringstream stream;
for (int i = 0; i < shape_.size(); ++i) {
stream << shape_[i] << " ";
}
stream << "(" << count_ << ")";
return stream.str();
}
inline const vector<int>& shape() const { return shape_; }// Inline function that returns the shape vector
/**
* @brief Returns the dimension of the index-th axis (or the negative index-th
* axis from the end, if index is negative).
*
* @param index the axis index, which may be negative as it will be
* "canonicalized" using CanonicalAxisIndex.
* Dies on out of range index.
*/
/* Return the dimension at the given axis index. Negative indices are supported: for example, if the shape is (N,C,H,W), then shape(0) returns N, shape(-1) returns W, and shape(-2) returns H. */
inline int shape(int index) const {
return shape_[CanonicalAxisIndex(index)];
}
inline int num_axes() const { return shape_.size(); }// Return the number of axes (dimensions) of the Blob
inline int count() const { return count_; }// Return the total number of elements, i.e. N*C*H*W for a 4-D shape
/**
* @brief Compute the volume of a slice; i.e., the product of dimensions
* among a range of axes.
*
* @param start_axis The first axis to include in the slice.
*
* @param end_axis The first axis to exclude from the slice.
*/
/* A count overload that returns the product of the dimensions from start_axis (inclusive) to end_axis (exclusive); e.g. count(0,2) returns N*C. */
inline int count(int start_axis, int end_axis) const {
CHECK_LE(start_axis, end_axis);
CHECK_GE(start_axis, 0);
CHECK_GE(end_axis, 0);
CHECK_LE(start_axis, num_axes());
CHECK_LE(end_axis, num_axes());
int count = 1;
for (int i = start_axis; i < end_axis; ++i) {
count *= shape(i);
}
return count;
}
/**
* @brief Compute the volume of a slice spanning from a particular first
* axis to the final axis.
*
* @param start_axis The first axis to include in the slice.
*/
/* A count overload that returns the product of the dimensions from start_axis (inclusive) through the last axis. */
inline int count(int start_axis) const {
return count(start_axis, num_axes());
}
/**
* @brief Returns the 'canonical' version of a (usually) user-specified axis,
* allowing for negative indexing (e.g., -1 for the last axis).
*
* @param axis_index the axis index.
* If 0 <= index < num_axes(), return index.
* If -num_axes <= index <= -1, return (num_axes() - (-index)),
* e.g., the last axis index (num_axes() - 1) if index == -1,
* the second to last if index == -2, etc.
* Dies on out of range index.
*/
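/* Inline function that converts a possibly negative axis index into the equivalent non-negative index, after checking that the index is in range. */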
inline int CanonicalAxisIndex(int axis_index) const {
CHECK_GE(axis_index, -num_axes())
<< "axis " << axis_index << " out of range for " << num_axes()
<< "-D Blob with shape " << shape_string();
CHECK_LT(axis_index, num_axes())
<< "axis " << axis_index << " out of range for " << num_axes()
<< "-D Blob with shape " << shape_string();
if (axis_index < 0) {
return axis_index + num_axes();// Map a negative index to its non-negative equivalent
}
return axis_index;
}
/* The following four accessors are deprecated. They return the four legacy shape values; use shape(0), shape(1), shape(2), and shape(3) instead. */
/// @brief Deprecated legacy shape accessor num: use shape(0) instead.
inline int num() const { return LegacyShape(0); }
/// @brief Deprecated legacy shape accessor channels: use shape(1) instead.
inline int channels() const { return LegacyShape(1); }
/// @brief Deprecated legacy shape accessor height: use shape(2) instead.
inline int height() const { return LegacyShape(2); }
/// @brief Deprecated legacy shape accessor width: use shape(3) instead.
inline int width() const { return LegacyShape(3); }
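inline int LegacyShape(int index) const {// Inline function that returns the dimension at index; for a Blob with fewer than 4 axes, the missing axes are reported as 1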
CHECK_LE(num_axes(), 4)
<< "Cannot use legacy accessors on Blobs with > 4 axes.";
CHECK_LT(index, 4);
CHECK_GE(index, -4);
if (index >= num_axes() || index < -num_axes()) {
// Axis is out of range, but still in [0, 3] (or [-4, -1] for reverse
// indexing) -- this special case simulates the one-padding used to fill
// extraneous axes of legacy blobs.
return 1;
}
return shape(index);
}
/* The next two inline functions return an offset into the Blob. For a Blob of shape (N,C,H,W), element (n,c,h,w) is at offset ((n*C+c)*H+h)*W+w. */
inline int offset(const int n, const int c = 0, const int h = 0,
const int w = 0) const {
CHECK_GE(n, 0);
CHECK_LE(n, num());
CHECK_GE(channels(), 0);
CHECK_LE(c, channels());
CHECK_GE(height(), 0);
CHECK_LE(h, height());
CHECK_GE(width(), 0);
CHECK_LE(w, width());
return ((n * channels() + c) * height() + h) * width() + w;
}
/* The offset can also be computed from a vector of indices. */
inline int offset(const vector<int>& indices) const {
CHECK_LE(indices.size(), num_axes());
int offset = 0;
for (int i = 0; i < num_axes(); ++i) {
offset *= shape(i);
if (indices.size() > i) {
CHECK_GE(indices[i], 0);
CHECK_LT(indices[i], shape(i));
offset += indices[i];
}
}
return offset;
}
/**
* @brief Copy from a source Blob.
*
* @param source the Blob to copy from
* @param copy_diff if false, copy the data; if true, copy the diff
* @param reshape if false, require this Blob to be pre-shaped to the shape
* of other (and die otherwise); if true, Reshape this Blob to other's
* shape if necessary
*/
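/* CopyFrom copies from the source Blob. copy_diff selects whether the data or the diff is copied, and reshape controls whether this Blob may be reshaped to match the source; see the .cpp file for details. */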
void CopyFrom(const Blob<Dtype>& source, bool copy_diff = false,
bool reshape = false);
// Return the data value stored on the CPU at (n, c, h, w) (used in the forward pass)
inline Dtype data_at(const int n, const int c, const int h,
const int w) const {
return cpu_data()[offset(n, c, h, w)];
}
// Return the diff value stored on the CPU at (n, c, h, w) (used in the backward pass)
inline Dtype diff_at(const int n, const int c, const int h,
const int w) const {
return cpu_diff()[offset(n, c, h, w)];
}
// Return the data value stored on the CPU at the given index vector
inline Dtype data_at(const vector<int>& index) const {
return cpu_data()[offset(index)];
}
// Return the diff value stored on the CPU at the given index vector
inline Dtype diff_at(const vector<int>& index) const {
return cpu_diff()[offset(index)];
}
// Return the SyncedMemory that holds the Blob's data (it covers both the CPU and the GPU copy)
inline const shared_ptr<SyncedMemory>& data() const {
CHECK(data_);
return data_;
}
// Return the SyncedMemory that holds the Blob's diff, i.e. its gradient (it covers both the CPU and the GPU copy)
inline const shared_ptr<SyncedMemory>& diff() const {
CHECK(diff_);
return diff_;
}
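const Dtype* cpu_data() const;// Read-only pointer to the data stored on the CPU
void set_cpu_data(Dtype* data);// Manually set the CPU data pointer
const int* gpu_shape() const;// Read-only pointer to the shape stored on the GPU
const Dtype* gpu_data() const;// Read-only pointer to the data stored on the GPU
const Dtype* cpu_diff() const;// Read-only pointer to the gradient stored on the CPU
const Dtype* gpu_diff() const;// Read-only pointer to the gradient stored on the GPU
Dtype* mutable_cpu_data();// Writable pointer to the data on the CPU; call it before modifying the data
Dtype* mutable_gpu_data();// Writable pointer to the data on the GPU; call it before modifying the data
Dtype* mutable_cpu_diff();// Writable pointer to the gradient on the CPU; call it before modifying the gradient
Dtype* mutable_gpu_diff();// Writable pointer to the gradient on the GPU; call it before modifying the gradient
void Update();// Update the data using the diff (data = data - diff)
void FromProto(const BlobProto& proto, bool reshape = true);// Read the shape, data, and diff from a BlobProto into the Blob
void ToProto(BlobProto* proto, bool write_diff = false) const;// Write the Blob's shape and data (and optionally its diff) into a BlobProto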
/// @brief Compute the sum of absolute values (L1 norm) of the data.
Dtype asum_data() const;// L1 norm of the data
/// @brief Compute the sum of absolute values (L1 norm) of the diff.
Dtype asum_diff() const;// L1 norm of the gradient
/// @brief Compute the sum of squares (L2 norm squared) of the data.
Dtype sumsq_data() const;// Squared L2 norm of the data
/// @brief Compute the sum of squares (L2 norm squared) of the diff.
Dtype sumsq_diff() const;// Squared L2 norm of the gradient
/// @brief Scale the blob data by a constant factor.
void scale_data(Dtype scale_factor);// Multiply the data by a constant factor
/// @brief Scale the blob diff by a constant factor.
void scale_diff(Dtype scale_factor);// Multiply the gradient by a constant factor
/**
* @brief Set the data_ shared_ptr to point to the SyncedMemory holding the
* data_ of Blob other -- useful in Layer%s which simply perform a copy
* in their Forward pass.
*
* This deallocates the SyncedMemory holding this Blob's data_, as
* shared_ptr calls its destructor when reset with the "=" operator.
*/
void ShareData(const Blob& other);// Share the data of another Blob
/**
* @brief Set the diff_ shared_ptr to point to the SyncedMemory holding the
* diff_ of Blob other -- useful in Layer%s which simply perform a copy
* in their Forward pass.
*
* This deallocates the SyncedMemory holding this Blob's diff_, as
* shared_ptr calls its destructor when reset with the "=" operator.
*/
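void ShareDiff(const Blob& other);// Share the gradient of another Blob
bool ShapeEquals(const BlobProto& other);// Check whether this Blob's shape equals the shape stored in a BlobProto
protected:
shared_ptr<SyncedMemory> data_;// The Blob's data
shared_ptr<SyncedMemory> diff_;// The Blob's gradient
shared_ptr<SyncedMemory> shape_data_;// Copy of the shape kept in SyncedMemory so that it can be read on the GPU through gpu_shape()
vector<int> shape_;// The Blob's shape, e.g. (N,C,H,W)
int count_;// Total number of elements, e.g. N*C*H*W
int capacity_;// Number of elements currently allocated; it can exceed count_ because the Blob's shape may change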
DISABLE_COPY_AND_ASSIGN(Blob);
}; // class Blob
} // namespace caffe
#endif // CAFFE_BLOB_HPP_
Next, blob.cpp:
#include <climits>
#include <vector>
#include "caffe/blob.hpp"
#include "caffe/common.hpp"
#include "caffe/syncedmem.hpp"
#include "caffe/util/math_functions.hpp"
namespace caffe {
template <typename Dtype>// Reshape overload that sets the Blob's shape from four integers
void Blob<Dtype>::Reshape(const int num, const int channels, const int height,
const int width) {
vector<int> shape(4);
shape[0] = num;
shape[1] = channels;
shape[2] = height;
shape[3] = width;
Reshape(shape);// Delegates to the vector overload of Reshape defined right below
}
template <typename Dtype>
void Blob<Dtype>::Reshape(const vector<int>& shape) {
CHECK_LE(shape.size(), kMaxBlobAxes);
count_ = 1;
shape_.resize(shape.size());
if (!shape_data_ || shape_data_->size() < shape.size() * sizeof(int)) {
shape_data_.reset(new SyncedMemory(shape.size() * sizeof(int)));
}
int* shape_data = static_cast<int*>(shape_data_->mutable_cpu_data());
for (int i = 0; i < shape.size(); ++i) {
CHECK_GE(shape[i], 0);
if (count_ != 0) {
CHECK_LE(shape[i], INT_MAX / count_) << "blob size exceeds INT_MAX";
}
count_ *= shape[i];
shape_[i] = shape[i];// Store the shape here; it ends up in both shape_ and shape_data_
shape_data[i] = shape[i];
}
if (count_ > capacity_) {
capacity_ = count_;// Memory is reallocated only when the new count exceeds the current capacity; capacity_ never shrinks when the count goes down
data_.reset(new SyncedMemory(capacity_ * sizeof(Dtype)));
diff_.reset(new SyncedMemory(capacity_ * sizeof(Dtype)));
}
}
template <typename Dtype>
void Blob<Dtype>::Reshape(const BlobShape& shape) {
CHECK_LE(shape.dim_size(), kMaxBlobAxes);
vector<int> shape_vec(shape.dim_size());// This overload first converts the BlobShape into a vector<int>, then calls the vector overload
for (int i = 0; i < shape.dim_size(); ++i) {
shape_vec[i] = shape.dim(i);
}
Reshape(shape_vec);
}
template <typename Dtype>
void Blob<Dtype>::ReshapeLike(const Blob<Dtype>& other) {
Reshape(other.shape());// Reshape this Blob to the shape of the other Blob
}
template <typename Dtype>
Blob<Dtype>::Blob(const int num, const int channels, const int height,
const int width)
// capacity_ must be initialized before calling Reshape
: capacity_(0) {
Reshape(num, channels, height, width);// capacity_ is initialized first, then the shape is set from four integers
}
template <typename Dtype>
Blob<Dtype>::Blob(const vector<int>& shape)
// capacity_ must be initialized before calling Reshape
: capacity_(0) {
Reshape(shape);// capacity_ is initialized first, then the shape is set from the vector<int> shape
}
template <typename Dtype>
const int* Blob<Dtype>::gpu_shape() const {
CHECK(shape_data_);
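/* The gpu_data() called here is not Blob::gpu_data() from this file but SyncedMemory::gpu_data(), which is explained in the SyncedMemory walkthrough. The const return type tells us that this is a read-only pointer to the shape on the GPU. */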
return (const int*)shape_data_->gpu_data();
}
template <typename Dtype>
const Dtype* Blob<Dtype>::cpu_data() const {
CHECK(data_);
return (const Dtype*)data_->cpu_data();// SyncedMemory::cpu_data() returns a read-only pointer to the data on the CPU
}
template <typename Dtype>
void Blob<Dtype>::set_cpu_data(Dtype* data) {
CHECK(data);
data_->set_cpu_data(data);// SyncedMemory::set_cpu_data() points the CPU data pointer at data, so CPU data is read starting from data
}
template <typename Dtype>// Read-only pointer to the data stored on the GPU
const Dtype* Blob<Dtype>::gpu_data() const {
CHECK(data_);
return (const Dtype*)data_->gpu_data();
}
template <typename Dtype>// Read-only pointer to the gradient stored on the CPU
const Dtype* Blob<Dtype>::cpu_diff() const {
CHECK(diff_);
return (const Dtype*)diff_->cpu_data();
}
template <typename Dtype>// Read-only pointer to the gradient stored on the GPU
const Dtype* Blob<Dtype>::gpu_diff() const {
CHECK(diff_);
return (const Dtype*)diff_->gpu_data();
}
template <typename Dtype>// Writable pointer to the data on the CPU, normally obtained before modifying that data; implemented with SyncedMemory::mutable_cpu_data()
Dtype* Blob<Dtype>::mutable_cpu_data() {
CHECK(data_);
return static_cast<Dtype*>(data_->mutable_cpu_data());
}
template <typename Dtype>// Writable pointer to the data on the GPU, normally obtained before modifying that data; implemented with SyncedMemory::mutable_gpu_data()
Dtype* Blob<Dtype>::mutable_gpu_data() {
CHECK(data_);
return static_cast<Dtype*>(data_->mutable_gpu_data());
}
template <typename Dtype>// Writable pointer to the gradient on the CPU, normally obtained before modifying that gradient; again SyncedMemory::mutable_cpu_data(), but called on diff_
Dtype* Blob<Dtype>::mutable_cpu_diff() {
CHECK(diff_);
return static_cast<Dtype*>(diff_->mutable_cpu_data());
}
template <typename Dtype>// Writable pointer to the gradient on the GPU, normally obtained before modifying that gradient; again SyncedMemory::mutable_gpu_data(), but called on diff_
Dtype* Blob<Dtype>::mutable_gpu_diff() {
CHECK(diff_);
return static_cast<Dtype*>(diff_->mutable_gpu_data());
}
template <typename Dtype>// Share data with another Blob
void Blob<Dtype>::ShareData(const Blob& other) {
CHECK_EQ(count_, other.count());
data_ = other.data();// Assign the shared_ptr directly; both Blobs now refer to the same memory
}
template <typename Dtype>// Share the gradient with another Blob
void Blob<Dtype>::ShareDiff(const Blob& other) {
CHECK_EQ(count_, other.count());
diff_ = other.diff();// Assign the shared_ptr directly; both Blobs now refer to the same memory
}
// The "update" method is used for parameter blobs in a Net, which are stored
// as Blob<float> or Blob<double> -- hence we do not define it for
// Blob<int> or Blob<unsigned int>.
template <> void Blob<unsigned int>::Update() { NOT_IMPLEMENTED; }
template <> void Blob<int>::Update() { NOT_IMPLEMENTED; }
template <typename Dtype>
void Blob<Dtype>::Update() {
// We will perform update based on where the data is located.
switch (data_->head()) {
case SyncedMemory::HEAD_AT_CPU:
// perform computation on CPU
caffe_axpy<Dtype>(count_, Dtype(-1),// caffe_axpy is defined in math_functions.cpp; it computes Y = alpha * X + Y
static_cast<const Dtype*>(diff_->cpu_data()),// Read the gradient first
static_cast<Dtype*>(data_->mutable_cpu_data()));// Then update the data: data = data - diff
break;
case SyncedMemory::HEAD_AT_GPU:
case SyncedMemory::SYNCED:
#ifndef CPU_ONLY
// perform computation on GPU
caffe_gpu_axpy<Dtype>(count_, Dtype(-1),
static_cast<const Dtype*>(diff_->gpu_data()),
static_cast<Dtype*>(data_->mutable_gpu_data()));
#else
NO_GPU;
#endif
break;
default:
LOG(FATAL) << "Syncedmem not initialized.";
}
}
template <> unsigned int Blob<unsigned int>::asum_data() const {
NOT_IMPLEMENTED;
return 0;
}
template <> int Blob<int>::asum_data() const {
NOT_IMPLEMENTED;
return 0;
}
template <typename Dtype>
Dtype Blob<Dtype>::asum_data() const {
if (!data_) { return 0; }
switch (data_->head()) {
case SyncedMemory::HEAD_AT_CPU:
return caffe_cpu_asum(count_, cpu_data());// caffe_cpu_asum, defined in math_functions.cpp, computes the L1 norm of the data
case SyncedMemory::HEAD_AT_GPU:
case SyncedMemory::SYNCED:
#ifndef CPU_ONLY
{
Dtype asum;
caffe_gpu_asum(count_, gpu_data(), &asum);
return asum;
}
#else
NO_GPU;
#endif
case SyncedMemory::UNINITIALIZED:
return 0;
default:
LOG(FATAL) << "Unknown SyncedMemory head state: " << data_->head();
}
return 0;
}
template <> unsigned int Blob<unsigned int>::asum_diff() const {
NOT_IMPLEMENTED;
return 0;
}
template <> int Blob<int>::asum_diff() const {
NOT_IMPLEMENTED;
return 0;
}
template <typename Dtype>
Dtype Blob<Dtype>::asum_diff() const {
if (!diff_) { return 0; }
switch (diff_->head()) {
case SyncedMemory::HEAD_AT_CPU:
return caffe_cpu_asum(count_, cpu_diff());// Same as for the data, applied to the gradient
case SyncedMemory::HEAD_AT_GPU:
case SyncedMemory::SYNCED:
#ifndef CPU_ONLY
{
Dtype asum;
caffe_gpu_asum(count_, gpu_diff(), &asum);
return asum;
}
#else
NO_GPU;
#endif
case SyncedMemory::UNINITIALIZED:
return 0;
default:
LOG(FATAL) << "Unknown SyncedMemory head state: " << diff_->head();
}
return 0;
}
template <> unsigned int Blob<unsigned int>::sumsq_data() const {
NOT_IMPLEMENTED;
return 0;
}
template <> int Blob<int>::sumsq_data() const {
NOT_IMPLEMENTED;
return 0;
}
template <typename Dtype>
Dtype Blob<Dtype>::sumsq_data() const {
Dtype sumsq;
const Dtype* data;
if (!data_) { return 0; }
switch (data_->head()) {
case SyncedMemory::HEAD_AT_CPU:
data = cpu_data();
sumsq = caffe_cpu_dot(count_, data, data);// caffe_cpu_dot, defined in math_functions.cpp, is used here to get the sum of squares (squared L2 norm) of the data
break;
case SyncedMemory::HEAD_AT_GPU:
case SyncedMemory::SYNCED:
#ifndef CPU_ONLY
data = gpu_data();
caffe_gpu_dot(count_, data, data, &sumsq);
#else
NO_GPU;
#endif
break;
case SyncedMemory::UNINITIALIZED:
return 0;
default:
LOG(FATAL) << "Unknown SyncedMemory head state: " << data_->head();
}
return sumsq;
}
template <> unsigned int Blob<unsigned int>::sumsq_diff() const {
NOT_IMPLEMENTED;
return 0;
}
template <> int Blob<int>::sumsq_diff() const {
NOT_IMPLEMENTED;
return 0;
}
template <typename Dtype>
Dtype Blob<Dtype>::sumsq_diff() const {
Dtype sumsq;
const Dtype* diff;
if (!diff_) { return 0; }
switch (diff_->head()) {
case SyncedMemory::HEAD_AT_CPU:
diff = cpu_diff();
sumsq = caffe_cpu_dot(count_, diff, diff);// Same as for the data, applied to the gradient
break;
case SyncedMemory::HEAD_AT_GPU:
case SyncedMemory::SYNCED:
#ifndef CPU_ONLY
diff = gpu_diff();
caffe_gpu_dot(count_, diff, diff, &sumsq);
break;
#else
NO_GPU;
#endif
case SyncedMemory::UNINITIALIZED:
return 0;
default:
LOG(FATAL) << "Unknown SyncedMemory head state: " << data_->head();
}
return sumsq;
}
template <> void Blob<unsigned int>::scale_data(unsigned int scale_factor) {
NOT_IMPLEMENTED;
}
template <> void Blob<int>::scale_data(int scale_factor) {
NOT_IMPLEMENTED;
}
template <typename Dtype>
void Blob<Dtype>::scale_data(Dtype scale_factor) {
Dtype* data;
if (!data_) { return; }
switch (data_->head()) {
case SyncedMemory::HEAD_AT_CPU:
data = mutable_cpu_data();
caffe_scal(count_, scale_factor, data);// caffe_scal scales the data by scale_factor
return;
case SyncedMemory::HEAD_AT_GPU:
case SyncedMemory::SYNCED:
#ifndef CPU_ONLY
data = mutable_gpu_data();
caffe_gpu_scal(count_, scale_factor, data);
return;
#else
NO_GPU;
#endif
case SyncedMemory::UNINITIALIZED:
return;
default:
LOG(FATAL) << "Unknown SyncedMemory head state: " << data_->head();
}
}
template <> void Blob<unsigned int>::scale_diff(unsigned int scale_factor) {
NOT_IMPLEMENTED;
}
template <> void Blob<int>::scale_diff(int scale_factor) {
NOT_IMPLEMENTED;
}
template <typename Dtype>
void Blob<Dtype>::scale_diff(Dtype scale_factor) {
Dtype* diff;
if (!diff_) { return; }
switch (diff_->head()) {
case SyncedMemory::HEAD_AT_CPU:
diff = mutable_cpu_diff();
caffe_scal(count_, scale_factor, diff);// caffe_scal scales the gradient by scale_factor
return;
case SyncedMemory::HEAD_AT_GPU:
case SyncedMemory::SYNCED:
#ifndef CPU_ONLY
diff = mutable_gpu_diff();
caffe_gpu_scal(count_, scale_factor, diff);
return;
#else
NO_GPU;
#endif
case SyncedMemory::UNINITIALIZED:
return;
default:
LOG(FATAL) << "Unknown SyncedMemory head state: " << diff_->head();
}
}
template <typename Dtype>
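bool Blob<Dtype>::ShapeEquals(const BlobProto& other) {// Check whether this Blob's shape matches the shape in a BlobProto; for legacy protos, num, channels, height, and width are compared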
if (other.has_num() || other.has_channels() ||
other.has_height() || other.has_width()) {
// Using deprecated 4D Blob dimensions --
// shape is (num, channels, height, width).
// Note: we do not use the normal Blob::num(), Blob::channels(), etc.
// methods as these index from the beginning of the blob shape, where legacy
// parameter blobs were indexed from the end of the blob shape (e.g., bias
// Blob shape (1 x 1 x 1 x N), IP layer weight Blob shape (1 x 1 x M x N)).
return shape_.size() <= 4 &&
LegacyShape(-4) == other.num() &&
LegacyShape(-3) == other.channels() &&
LegacyShape(-2) == other.height() &&
LegacyShape(-1) == other.width();
}
vector<int> other_shape(other.shape().dim_size());
for (int i = 0; i < other.shape().dim_size(); ++i) {
other_shape[i] = other.shape().dim(i);
}
return shape_ == other_shape;
}
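/* This function copies from the source blob. The bool reshape controls whether this blob may be reshaped to the source blob's shape when the two differ; copy_diff selects what is copied: the diff (gradient) if true, the data if false. */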
template <typename Dtype>
void Blob<Dtype>::CopyFrom(const Blob& source, bool copy_diff, bool reshape) {
if (source.count() != count_ || source.shape() != shape_) {
if (reshape) {
ReshapeLike(source);
} else {
LOG(FATAL) << "Trying to copy blobs of different sizes.";
}
}
switch (Caffe::mode()) {
case Caffe::GPU:
if (copy_diff) {
caffe_copy(count_, source.gpu_diff(),
static_cast<Dtype*>(diff_->mutable_gpu_data()));
} else {
caffe_copy(count_, source.gpu_data(),
static_cast<Dtype*>(data_->mutable_gpu_data()));
}
break;
case Caffe::CPU:
if (copy_diff) {
caffe_copy(count_, source.cpu_diff(),
static_cast<Dtype*>(diff_->mutable_cpu_data()));
} else {
caffe_copy(count_, source.cpu_data(),
static_cast<Dtype*>(data_->mutable_cpu_data()));
}
break;
default:
LOG(FATAL) << "Unknown caffe mode.";
}
}
template <typename Dtype>
void Blob<Dtype>::FromProto(const BlobProto& proto, bool reshape) {
if (reshape) {// FromProto copies the shape, data, and diff from a BlobProto into the Blob
vector<int> shape;
if (proto.has_num() || proto.has_channels() ||
proto.has_height() || proto.has_width()) {
// Using deprecated 4D Blob dimensions --
// shape is (num, channels, height, width).
shape.resize(4);
shape[0] = proto.num();
shape[1] = proto.channels();
shape[2] = proto.height();
shape[3] = proto.width();
} else {
shape.resize(proto.shape().dim_size());
for (int i = 0; i < proto.shape().dim_size(); ++i) {
shape[i] = proto.shape().dim(i);
}
}
Reshape(shape);
} else {
CHECK(ShapeEquals(proto)) << "shape mismatch (reshape not set)";
}
// copy data
Dtype* data_vec = mutable_cpu_data();
if (proto.double_data_size() > 0) {
CHECK_EQ(count_, proto.double_data_size());
for (int i = 0; i < count_; ++i) {
data_vec[i] = proto.double_data(i);
}
} else {
CHECK_EQ(count_, proto.data_size());
for (int i = 0; i < count_; ++i) {
data_vec[i] = proto.data(i);
}
}
if (proto.double_diff_size() > 0) {
CHECK_EQ(count_, proto.double_diff_size());
Dtype* diff_vec = mutable_cpu_diff();
for (int i = 0; i < count_; ++i) {
diff_vec[i] = proto.double_diff(i);
}
} else if (proto.diff_size() > 0) {
CHECK_EQ(count_, proto.diff_size());
Dtype* diff_vec = mutable_cpu_diff();
for (int i = 0; i < count_; ++i) {
diff_vec[i] = proto.diff(i);
}
}
}
template <>
void Blob<double>::ToProto(BlobProto* proto, bool write_diff) const {
proto->clear_shape();
for (int i = 0; i < shape_.size(); ++i) {
proto->mutable_shape()->add_dim(shape_[i]);
}
proto->clear_double_data();
proto->clear_double_diff();
const double* data_vec = cpu_data();
for (int i = 0; i < count_; ++i) {
proto->add_double_data(data_vec[i]);
}
if (write_diff) {
const double* diff_vec = cpu_diff();
for (int i = 0; i < count_; ++i) {
proto->add_double_diff(diff_vec[i]);
}
}
}
template <>// ToProto writes the Blob's contents into a BlobProto
void Blob<float>::ToProto(BlobProto* proto, bool write_diff) const {
proto->clear_shape();
for (int i = 0; i < shape_.size(); ++i) {
proto->mutable_shape()->add_dim(shape_[i]);
}
proto->clear_data();
proto->clear_diff();
const float* data_vec = cpu_data();// Calls Blob::cpu_data() defined earlier in this file
for (int i = 0; i < count_; ++i) {
proto->add_data(data_vec[i]);
}
if (write_diff) {
const float* diff_vec = cpu_diff();// Calls Blob::cpu_diff() defined earlier in this file
for (int i = 0; i < count_; ++i) {
proto->add_diff(diff_vec[i]);
}
}
}
INSTANTIATE_CLASS(Blob);
template class Blob<int>;
template class Blob<unsigned int>;
} // namespace caffe
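That is the complete code with annotations. Overall, the Blob code is very clearly structured: most functions are self-explanatory, the low-level details are well encapsulated, and the class offers a good solution for moving and storing data in a network.
Let's first summarize what a Blob contains, or in other words, what a Blob actually is.
As the unit of data storage in a network, a Blob holds five kinds of members. The first is the data, held by the data_ pointer, which takes part in forward propagation. The second is the gradient, held by the diff_ pointer, which is central to backward propagation. The third is shape_, which for a typical 4-D blob holds four values: N (num), C (channels), H (height), and W (width). These describe the Blob's dimensions and determine how many elements it holds. The fourth is count_, the number of elements in the Blob, equal to N*C*H*W. The last is capacity_, the number of elements for which memory has been allocated; Reshape raises it to count_ whenever count_ exceeds it, and never lowers it.
Now let's look at what the Blob class can do.
First come the functions for initialization and shape changes: the constructors and Reshape. Their main job is to initialize or change the Blob's shape. Reshape shows up again in Layer: because the dimensions of the data differ from layer to layer, Reshape is used heavily as data propagates through the network. We will not go into that here.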
Blob()
: data_(), diff_(), count_(0), capacity_(0) {}
explicit Blob(const int num, const int channels, const int height,
const int width);
explicit Blob(const vector<int>& shape);
void Reshape(const int num, const int channels, const int height,
const int width);
void Reshape(const vector<int>& shape);
void Reshape(const BlobShape& shape);
void ReshapeLike(const Blob& other);
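To make the relation between count_ and capacity_ concrete, here is a minimal sketch of the Reshape logic. MiniBlob is a hypothetical stand-in I made up for illustration: it uses plain std::vector<float> buffers instead of SyncedMemory, leaves out the INT_MAX overflow check and shape_data_, and adds an allocations() counter that the real class does not have.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Blob that tracks only shape, count, and capacity.
// Buffers are plain std::vector<float>; the real class uses SyncedMemory.
class MiniBlob {
 public:
  MiniBlob() : count_(0), capacity_(0), allocations_(0) {}

  void Reshape(const std::vector<int>& shape) {
    count_ = 1;
    shape_ = shape;
    for (std::size_t i = 0; i < shape.size(); ++i) {
      assert(shape[i] >= 0);  // plays the role of CHECK_GE(shape[i], 0)
      count_ *= shape[i];
    }
    // Allocate only when the new element count exceeds the current capacity.
    if (count_ > capacity_) {
      capacity_ = count_;
      data_.assign(capacity_, 0.0f);
      diff_.assign(capacity_, 0.0f);
      ++allocations_;  // illustration only: counts how often memory was (re)allocated
    }
  }

  int count() const { return count_; }
  int capacity() const { return capacity_; }
  int allocations() const { return allocations_; }

 private:
  std::vector<int> shape_;
  std::vector<float> data_, diff_;
  int count_, capacity_, allocations_;
};
```

In this sketch, reshaping from (2,3,4,5) to (1,3,4,5) lowers count() from 120 to 60 but leaves capacity() at 120 with no new allocation; only a reshape to something larger, such as (4,3,4,5), allocates again.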
Next come several count functions, which compute the number of elements in the whole Blob or in a range of its axes:
inline int count() const { return count_; }
inline int count(int start_axis, int end_axis) const
inline int count(int start_axis) const
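The slice versions of count, together with negative axis indexing, can be tried out without Caffe. The free functions below are my own standalone rewrite of the member functions shown earlier, taking the shape as an explicit argument and using assert in place of the CHECK_* macros; they are a sketch, not the library's API.

```cpp
#include <cassert>
#include <vector>

// Map a possibly negative axis index to its non-negative equivalent.
int CanonicalAxisIndex(const std::vector<int>& shape, int axis_index) {
  const int num_axes = static_cast<int>(shape.size());
  assert(axis_index >= -num_axes && axis_index < num_axes);
  return axis_index < 0 ? axis_index + num_axes : axis_index;
}

// Product of the dimensions in the half-open range [start_axis, end_axis).
int Count(const std::vector<int>& shape, int start_axis, int end_axis) {
  const int num_axes = static_cast<int>(shape.size());
  assert(0 <= start_axis && start_axis <= end_axis && end_axis <= num_axes);
  int count = 1;
  for (int i = start_axis; i < end_axis; ++i) {
    count *= shape[i];
  }
  return count;
}

// Product of the dimensions from start_axis through the last axis.
int Count(const std::vector<int>& shape, int start_axis) {
  return Count(shape, start_axis, static_cast<int>(shape.size()));
}
```

For a shape of (2,3,4,5), Count(shape, 0, 2) gives N*C = 6, Count(shape, 1) gives C*H*W = 60, and an empty range such as Count(shape, 2, 2) gives 1.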
Then come the functions used to access data during the forward and backward passes:
inline int offset(const vector<int>& indices) const
inline Dtype data_at(const int n, const int c, const int h,
const int w) const
inline Dtype diff_at(const int n, const int c, const int h,
const int w) const
inline Dtype data_at(const vector<int>& index) const
inline Dtype diff_at(const vector<int>& index) const
inline const shared_ptr<SyncedMemory>& data() const
inline const shared_ptr<SyncedMemory>& diff() const
const Dtype* cpu_data() const;
void set_cpu_data(Dtype* data);
const int* gpu_shape() const;
const Dtype* gpu_data() const;
const Dtype* cpu_diff() const;
const Dtype* gpu_diff() const;
Dtype* mutable_cpu_data();
Dtype* mutable_gpu_data();
Dtype* mutable_cpu_diff();
Dtype* mutable_gpu_diff();
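The offset function returns the position of an element within the Blob, and data_at and diff_at use that position to return the data value and the gradient value stored there. data() and diff() return the Blob's data_ and diff_ pointers directly, and set_cpu_data sets the CPU data pointer. gpu_shape() returns a pointer to the Blob's shape as stored on the GPU. That leaves eight functions, four without mutable and four with it. The four without mutable are read-only: they return const pointers to the data/diff on the CPU/GPU. The four with mutable return writable data/diff pointers and are normally called before modifying the data/gradient on the CPU/GPU. Note that all eight rely on methods of the SyncedMemory class, so they cannot be fully understood from Blob alone; SyncedMemory is left for the next post.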
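The offset arithmetic is easy to check by hand. The sketch below restates both offset overloads as standalone functions (Offset4D and OffsetND are names I chose for this illustration), assuming a row-major buffer as implied by the formula ((n*C+c)*H+h)*W+w.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Offset of element (n, c, h, w) in a row-major buffer of shape (N, C, H, W).
int Offset4D(const std::vector<int>& shape, int n, int c, int h, int w) {
  assert(shape.size() == 4);
  return ((n * shape[1] + c) * shape[2] + h) * shape[3] + w;
}

// General form: indices may be shorter than shape, in which case the
// missing trailing indices are treated as 0.
int OffsetND(const std::vector<int>& shape, const std::vector<int>& indices) {
  assert(indices.size() <= shape.size());
  int offset = 0;
  for (std::size_t i = 0; i < shape.size(); ++i) {
    offset *= shape[i];
    if (i < indices.size()) {
      assert(indices[i] >= 0 && indices[i] < shape[i]);
      offset += indices[i];
    }
  }
  return offset;
}
```

With shape (2,3,4,5), element (1,2,3,4) is the last one, at offset ((1*3+2)*4+3)*5+4 = 119, and passing only the index {1} to OffsetND gives 60, the start of the second item in the batch.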
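Here is an example from the official documentation, which also illustrates how the SyncedMemory class keeps the CPU and GPU copies in sync:
// Assume the data is initialized on the CPU, and we have a blob
const Dtype* foo;
Dtype* bar;
foo = blob.gpu_data(); // data copied from CPU to GPU
foo = blob.cpu_data(); // no data copied, both sides are up to date
bar = blob.mutable_gpu_data(); // no data copied
// ... some operations ...
bar = blob.mutable_gpu_data(); // still on the GPU, no data copied
foo = blob.cpu_data(); // the GPU has modified the values, so data is copied from GPU to CPU
foo = blob.gpu_data(); // no data copied, both sides are up to date
bar = blob.mutable_cpu_data(); // still no data copied
bar = blob.mutable_gpu_data(); // data copied from CPU to GPU
bar = blob.mutable_cpu_data(); // data copied from GPU to CPU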
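The copy pattern in that example can be reproduced with a small state machine. The sketch below is only my simplified model: the state names come from the SyncedMemory enum that blob.cpp switches on, but the transition rules are an assumption on my part, checked against nothing more than the comments in the example above. No real memory is moved; the class just counts the copies that would happen.

```cpp
#include <cassert>

// Simplified model of head-state bookkeeping. HeadModel is a hypothetical
// class for illustration; the transition rules are assumed, not taken from
// the SyncedMemory source.
class HeadModel {
 public:
  enum Head { UNINITIALIZED, HEAD_AT_CPU, HEAD_AT_GPU, SYNCED };

  explicit HeadModel(Head initial)
      : head_(initial), cpu_to_gpu_(0), gpu_to_cpu_(0) {}

  // Read-only access: bring the requested side up to date.
  void cpu_data() { ToCpu(); }
  void gpu_data() { ToGpu(); }
  // Writable access: bring the side up to date, then mark it as the only
  // valid copy, because the caller is about to modify it.
  void mutable_cpu_data() { ToCpu(); head_ = HEAD_AT_CPU; }
  void mutable_gpu_data() { ToGpu(); head_ = HEAD_AT_GPU; }

  Head head() const { return head_; }
  int cpu_to_gpu_copies() const { return cpu_to_gpu_; }
  int gpu_to_cpu_copies() const { return gpu_to_cpu_; }

 private:
  void ToCpu() {
    if (head_ == UNINITIALIZED) {
      head_ = HEAD_AT_CPU;  // first touch: nothing to copy yet
    } else if (head_ == HEAD_AT_GPU) {
      ++gpu_to_cpu_;        // the CPU copy is stale: copy GPU -> CPU
      head_ = SYNCED;
    }
  }
  void ToGpu() {
    if (head_ == UNINITIALIZED) {
      head_ = HEAD_AT_GPU;  // first touch: nothing to copy yet
    } else if (head_ == HEAD_AT_CPU) {
      ++cpu_to_gpu_;        // the GPU copy is stale: copy CPU -> GPU
      head_ = SYNCED;
    }
  }

  Head head_;
  int cpu_to_gpu_;
  int gpu_to_cpu_;
};
```

Starting from HEAD_AT_CPU and replaying the nine calls of the official example, this model reports two CPU-to-GPU copies and two GPU-to-CPU copies, at the same steps where the comments say a copy takes place.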
void Update();
This function reads the gradient and uses it to update the data in place: data = data - diff.
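As a rough illustration of the CPU branch of Update(), the sketch below writes the axpy operation as a plain loop over std::vector<float>. Axpy and Update here are my own standalone helpers, not the caffe_axpy signature, which works on raw pointers and an element count.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// y = alpha * x + y, the operation that caffe_axpy performs, written as a loop.
void Axpy(float alpha, const std::vector<float>& x, std::vector<float>* y) {
  assert(x.size() == y->size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    (*y)[i] += alpha * x[i];
  }
}

// Mirrors the CPU branch of Blob::Update(): alpha is fixed at -1,
// so the result is data = data - diff.
void Update(const std::vector<float>& diff, std::vector<float>* data) {
  Axpy(-1.0f, diff, data);
}
```

Note that the code shown in blob.cpp applies no learning rate at this point; any such scaling would have to be folded into diff before Update() is called.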
There are also functions that let a Blob exchange data with a Proto:
void FromProto(const BlobProto& proto, bool reshape = true);
void ToProto(BlobProto* proto, bool write_diff = false) const;
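Finally, there are some utility functions: CopyFrom for copying data, ShapeEquals for comparing a Blob's shape with the shape in a BlobProto, ShareData and ShareDiff for sharing data and gradients between Blobs, asum_data, asum_diff, sumsq_data, and sumsq_diff for computing the L1 norm and the squared L2 norm of the data and gradient, and scale_data and scale_diff for multiplying the data and gradient by a constant factor.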
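The difference between sharing and copying is worth a small experiment. In the sketch below, SharedBlob is a hypothetical stand-in whose data_ is a std::shared_ptr to a plain float buffer rather than to SyncedMemory; it only models the pointer semantics of ShareData and the value semantics of CopyFrom with copy_diff set to false.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for Blob: data_ is a shared_ptr to a float buffer.
struct SharedBlob {
  std::shared_ptr<std::vector<float> > data_;

  explicit SharedBlob(std::size_t count)
      : data_(std::make_shared<std::vector<float> >(count, 0.0f)) {}

  // Like ShareData: point at the other blob's buffer. Our previous buffer
  // is released if nobody else holds it.
  void ShareData(const SharedBlob& other) {
    assert(data_->size() == other.data_->size());  // like CHECK_EQ on count
    data_ = other.data_;
  }

  // Like CopyFrom with copy_diff == false: copy the values, keep our buffer.
  void CopyFrom(const SharedBlob& source) {
    assert(data_->size() == source.data_->size());
    *data_ = *source.data_;
  }
};
```

After b.ShareData(a), a later write through a is visible through b, since both refer to one buffer; after c.CopyFrom(a), c keeps the values it had at the time of the copy.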
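In short, Blob is the most important class at the bottom of Caffe's data handling. Once Blob is understood, analyzing the higher-level Layer and Net becomes much easier!
Finally, here are the blogs I consulted while studying Blob; many thanks to their authors!
楼燚航 (Lou Yihang)'s blog: http://www.cnblogs.com/louyihang-loves-baiyan/
junmuzi's blog: http://blog.csdn.net/junmuzi
You are welcome to read my follow-up posts on the Caffe source code. Your support and encouragement are my greatest motivation!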
written by jiong
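Aim for the highest and you reach the middle; aim for the middle and you reach the lowest; aim for the lowest and you achieve nothing.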