Caffe-classification


Things I use often are worth writing down, so I don't have to work through them from scratch every time.
This post walks through the inference flow of doing classification with Caffe (the code is essentially the stock examples/cpp_classification sample).

The main line of classification:
int main(int argc, char** argv) {
  if (argc != 6) {
    std::cerr << "Usage: " << argv[0]
              << " deploy.prototxt network.caffemodel"
              << " mean.binaryproto labels.txt img.jpg" << std::endl;
    return 1;
  }

  ::google::InitGoogleLogging(argv[0]);

  string model_file   = argv[1];
  string trained_file = argv[2];
  string mean_file    = argv[3];
  string label_file   = argv[4];
  /* 1. Build the network */
  Classifier classifier(model_file, trained_file, mean_file, label_file);

  string file = argv[5];

  std::cout << "---------- Prediction for "
            << file << " ----------" << std::endl;
  /* 2. Read the input image */
  cv::Mat img = cv::imread(file, -1);
  CHECK(!img.empty()) << "Unable to decode image " << file;
  /* 3. Predict */
  std::vector<Prediction> predictions = classifier.Classify(img);
  /* 4. Output the results */
  /* Print the top N predictions. */
  for (size_t i = 0; i < predictions.size(); ++i) {
    Prediction p = predictions[i];
    std::cout << std::fixed << std::setprecision(4) << p.second << " - \""
              << p.first << "\"" << std::endl;
  }
}

As the code above shows, the main line of classification is: build the network → read the image → Predict → output the results.

The Classifier class that does the classification:
/* Pair (label, confidence) representing a prediction. */
/* Holds one prediction result: label + confidence score */
typedef std::pair<string, float> Prediction;
/* The class used for classification; also a useful starting point if you want to do detection instead */
class Classifier {
 public:
  Classifier(const string& model_file,        /* deploy.prototxt */
             const string& trained_file,      /* network.caffemodel */
             const string& mean_file,         /* mean.binaryproto */
             const string& label_file);       /* labels.txt */
  /* Classification entry point: takes an image and returns the top-N inference results */
  std::vector<Prediction> Classify(const cv::Mat& img, int N = 5);

 private:
  /* Computes the mean image cv::Mat mean_ from the binaryproto mean file;
     mean_ is then used in Preprocess() */
  void SetMean(const string& mean_file);
  /* The core inference function: takes an image and returns the confidence scores
     of all classes; the class order matches the label indices used during training */
  std::vector<float> Predict(const cv::Mat& img);
  /* WrapInputLayer() wraps the data of each channel of net_input_blobs_[0]
     (i.e. input_layer) in a cv::Mat pushed into std::vector<cv::Mat>* input_channels,
     so that when Preprocess() writes the channels of const cv::Mat& img into
     input_channels, the data lands directly in net_input_blobs_[0] */
  void WrapInputLayer(std::vector<cv::Mat>* input_channels);
  /* Preprocess() does all the work needed before the image enters the network:
     color conversion, resizing, mean subtraction and channel layout. For example,
     if net_ expects a single-channel grayscale input and the image to run inference
     on is color, it is converted to grayscale, and vice versa; the input
     const cv::Mat& img is split by channel, turning the interleaved BGR,BGR,...,BGR
     layout into planar B,B,...,B,G,G,...,G,R,R,...,R data pushed into input_channels
     (a sketch of these functions appears near the end of this post) */
  void Preprocess(const cv::Mat& img,
                  std::vector<cv::Mat>* input_channels);

 private:
  /* net_ here is for inference only, so every layer used only during training has been filtered out */
  shared_ptr<Net<float> > net_;
  /* Input image width and height the network expects */
  cv::Size input_geometry_;
  /* Number of input channels the network expects */
  int num_channels_;
  /* Mean image computed from the binaryproto mean file */
  cv::Mat mean_;
  /* Class labels; their order must correspond to the class indices used during
     training. For example, the training list of a cat/dog/fish three-class task
     might be:
     fish1.jpg 1
     cat1.jpg 0
     dog1.jpg 2
     dog2.jpg 2
     fish2.jpg 1
     cat2.jpg 0
     in which case labels.txt should contain:
     cat
     fish
     dog */
  std::vector<string> labels_;
};
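
The member-function bodies are not all listed in this post, so for reference here is a minimal sketch of Classify() in the spirit of the stock examples/cpp_classification sample: Predict() returns one score per class, and an Argmax helper keeps the indices of the N highest scores (requires <algorithm>, <functional>, <utility>).

/* Sketch of Classify(): run Predict(), then pair the top-N scores with labels. */
static std::vector<int> Argmax(const std::vector<float>& v, int N) {
  std::vector<std::pair<float, int> > pairs;
  for (size_t i = 0; i < v.size(); ++i)
    pairs.push_back(std::make_pair(v[i], static_cast<int>(i)));
  /* Sort descending by score; only the first N elements need to be ordered. */
  std::partial_sort(pairs.begin(), pairs.begin() + N, pairs.end(),
                    std::greater<std::pair<float, int> >());
  std::vector<int> result;
  for (int i = 0; i < N; ++i)
    result.push_back(pairs[i].second);
  return result;
}

std::vector<Prediction> Classifier::Classify(const cv::Mat& img, int N) {
  std::vector<float> output = Predict(img);
  N = std::min<int>(labels_.size(), N);
  std::vector<int> maxN = Argmax(output, N);
  std::vector<Prediction> predictions;
  for (int i = 0; i < N; ++i) {
    int idx = maxN[i];
    predictions.push_back(std::make_pair(labels_[idx], output[idx]));
  }
  return predictions;
}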
The Classifier constructor:
Classifier::Classifier(const string& model_file,
                       const string& trained_file,
                       const string& mean_file,
                       const string& label_file) {
/* Select the compute mode; if the goal is to get familiar with Caffe, CPU mode makes it easier to debug and trace the data flow */
#ifdef CPU_ONLY
  Caffe::set_mode(Caffe::CPU);
#else
  Caffe::set_mode(Caffe::GPU);
#endif

  /* Load the network. */
  /* new an inference Net from deploy.prototxt and assign it to net_ */
  net_.reset(new Net<float>(model_file, TEST));
  /* Copy the trained parameters from the .caffemodel into net_ */
  net_->CopyTrainedLayersFrom(trained_file);

  CHECK_EQ(net_->num_inputs(), 1) << "Network should have exactly one input.";
  CHECK_EQ(net_->num_outputs(), 1) << "Network should have exactly one output.";

  Blob<float>* input_layer = net_->input_blobs()[0];
  num_channels_ = input_layer->channels();
  CHECK(num_channels_ == 3 || num_channels_ == 1)
    << "Input layer should have 1 or 3 channels.";
  input_geometry_ = cv::Size(input_layer->width(), input_layer->height());

  /* Load the binaryproto mean file. */
  SetMean(mean_file);

  /* Load labels. */
  std::ifstream labels(label_file.c_str());
  CHECK(labels) << "Unable to open labels file " << label_file;
  string line;
  while (std::getline(labels, line))
    labels_.push_back(string(line));

  Blob<float>* output_layer = net_->output_blobs()[0];
  CHECK_EQ(labels_.size(), output_layer->channels())
    << "Number of labels is different from the output layer dimension.";
}
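
SetMean(), called by the constructor above, is also not listed; here is a sketch along the lines of the stock example, which collapses the binaryproto into a constant mean image of the network's input size:

/* Sketch of SetMean(): load the binaryproto, wrap each channel of the mean
   blob in a cv::Mat, merge them, then build a constant mean image of the
   input size from the per-channel means. */
void Classifier::SetMean(const string& mean_file) {
  BlobProto blob_proto;
  ReadProtoFromBinaryFileOrDie(mean_file.c_str(), &blob_proto);

  /* Convert from BlobProto to Blob<float>. */
  Blob<float> mean_blob;
  mean_blob.FromProto(blob_proto);
  CHECK_EQ(mean_blob.channels(), num_channels_)
    << "Number of channels of mean file doesn't match input layer.";

  /* The mean file stores planar 32-bit float BGR or grayscale data. */
  std::vector<cv::Mat> channels;
  float* data = mean_blob.mutable_cpu_data();
  for (int i = 0; i < num_channels_; ++i) {
    /* Extract an individual channel. */
    cv::Mat channel(mean_blob.height(), mean_blob.width(), CV_32FC1, data);
    channels.push_back(channel);
    data += mean_blob.height() * mean_blob.width();
  }

  /* Merge the separate channels into a single image. */
  cv::Mat mean;
  cv::merge(channels, mean);

  /* Compute the global per-channel mean and create a mean image
     filled with those values. */
  cv::Scalar channel_mean = cv::mean(mean);
  mean_ = cv::Mat(input_geometry_, mean.type(), channel_mean);
}

Using the per-channel average (cv::mean) rather than the raw per-pixel mean image keeps mean_ valid even when input_geometry_ differs from the mean file's spatial size.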
net_.reset(new Net<float>(model_file, TEST))

Network creation happens mainly in the Net constructor:

template <typename Dtype>
Net<Dtype>::Net(const string& param_file, Phase phase,
    const int level, const vector<string>* stages,
    const Net* root_net)
    : root_net_(root_net) {
  NetParameter param;
  /* Read each layer from param_file (deploy.prototxt) into param */
  ReadNetParamsFromTextFileOrDie(param_file, &param);
  // Set phase, stages and level
  param.mutable_state()->set_phase(phase);
  if (stages != NULL) {
    for (int i = 0; i < stages->size(); i++) {
      param.mutable_state()->add_stage((*stages)[i]);
    }
  }
  param.mutable_state()->set_level(level);
  /* Init() fills the newly created net's members from the contents of param,
     so the heart of the constructor is the member function Init() */
  Init(param);
}
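
The level and stages arguments let one prototxt serve several configurations: a layer's include/exclude rules are matched against the NetState set here, and FilterNet() (below) drops the layers that don't match. A hypothetical usage sketch, where the stage name "deploy" is just an arbitrary label that the prototxt's include rules would have to reference:

/* Hypothetical: keep only layers whose include rules accept stage "deploy". */
vector<string> stages;
stages.push_back("deploy");
Net<float> net("deploy.prototxt", TEST, /*level=*/0, &stages);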
void Net<Dtype>::Init(const NetParameter& in_param)

Init() carries out the entire construction of the network.

template <typename Dtype>
/* Assign the parameters read from deploy.prototxt into in_param to the net's members one by one */
void Net<Dtype>::Init(const NetParameter& in_param) {
  CHECK(Caffe::root_solver() || root_net_)
      << "root_net_ needs to be set for all non-root solvers";
  // Set phase from the state.
  phase_ = in_param.state().phase();
  // Filter layers based on their include/exclude rules and
  // the current NetState.
  NetParameter filtered_param;
  /* Drop the layers in the .prototxt that are useless for inference, e.g. layers used only during training */
  FilterNet(in_param, &filtered_param);
  LOG_IF(INFO, Caffe::root_solver())
      << "Initializing net from parameters: " << std::endl
      << filtered_param.DebugString();
  // Create a copy of filtered_param with splits added where necessary.
  NetParameter param;
  /* If one blob feeds several layers (e.g. a data layer's label top feeding both
     the accuracy layer and the loss layer), InsertSplits() inserts a Split layer
     after the producer so that each consumer gets its own copy */
  InsertSplits(filtered_param, &param);
  // Basically, build all the layers and set up their connections.
  name_ = param.name();
  map<string, int> blob_name_to_idx;
  set<string> available_blobs;
  memory_used_ = 0;
  // For each layer, set up its input and output
  bottom_vecs_.resize(param.layer_size());
  top_vecs_.resize(param.layer_size());
  bottom_id_vecs_.resize(param.layer_size());
  param_id_vecs_.resize(param.layer_size());
  top_id_vecs_.resize(param.layer_size());
  bottom_need_backward_.resize(param.layer_size());
  for (int layer_id = 0; layer_id < param.layer_size(); ++layer_id) {
    // For non-root solvers, whether this layer is shared from root_net_.
    bool share_from_root = !Caffe::root_solver()
        && root_net_->layers_[layer_id]->ShareInParallel();
    // Inherit phase from net if unset.
    if (!param.layer(layer_id).has_phase()) {
      param.mutable_layer(layer_id)->set_phase(phase_);
    }
    // Setup layer.
    const LayerParameter& layer_param = param.layer(layer_id);
    if (layer_param.propagate_down_size() > 0) {
      CHECK_EQ(layer_param.propagate_down_size(),
          layer_param.bottom_size())
          << "propagate_down param must be specified "
          << "either 0 or bottom_size times ";
    }
    if (share_from_root) {
      LOG(INFO) << "Sharing layer " << layer_param.name() << " from root net";
      layers_.push_back(root_net_->layers_[layer_id]);
      layers_[layer_id]->SetShared(true);
    } else {
      layers_.push_back(LayerRegistry<Dtype>::CreateLayer(layer_param));
    }
    layer_names_.push_back(layer_param.name());
    LOG_IF(INFO, Caffe::root_solver())
        << "Creating Layer " << layer_param.name();
    bool need_backward = false;

    // Figure out this layer's input and output
    for (int bottom_id = 0; bottom_id < layer_param.bottom_size();
         ++bottom_id) {
      const int blob_id = AppendBottom(param, layer_id, bottom_id,
                                       &available_blobs, &blob_name_to_idx);
      // If a blob needs backward, this layer should provide it.
      need_backward |= blob_need_backward_[blob_id];
    }
    int num_top = layer_param.top_size();
    for (int top_id = 0; top_id < num_top; ++top_id) {
      AppendTop(param, layer_id, top_id, &available_blobs, &blob_name_to_idx);
      // Collect Input layer tops as Net inputs.
      if (layer_param.type() == "Input") {
        const int blob_id = blobs_.size() - 1;
        net_input_blob_indices_.push_back(blob_id);
        net_input_blobs_.push_back(blobs_[blob_id].get());
      }
    }
    // If the layer specifies that AutoTopBlobs() -> true and the LayerParameter
    // specified fewer than the required number (as specified by
    // ExactNumTopBlobs() or MinTopBlobs()), allocate them here.
    Layer<Dtype>* layer = layers_[layer_id].get();
    if (layer->AutoTopBlobs()) {
      const int needed_num_top =
          std::max(layer->MinTopBlobs(), layer->ExactNumTopBlobs());
      for (; num_top < needed_num_top; ++num_top) {
        // Add "anonymous" top blobs -- do not modify available_blobs or
        // blob_name_to_idx as we don't want these blobs to be usable as input
        // to other layers.
        AppendTop(param, layer_id, num_top, NULL, NULL);
      }
    }
    // After this layer is connected, set it up.
    if (share_from_root) {
      // Set up size of top blobs using root_net_
      const vector<Blob<Dtype>*>& base_top = root_net_->top_vecs_[layer_id];
      const vector<Blob<Dtype>*>& this_top = this->top_vecs_[layer_id];
      for (int top_id = 0; top_id < base_top.size(); ++top_id) {
        this_top[top_id]->ReshapeLike(*base_top[top_id]);
        LOG(INFO) << "Created top blob " << top_id << " (shape: "
            << this_top[top_id]->shape_string() <<  ") for shared layer "
            << layer_param.name();
      }
    } else {
      layers_[layer_id]->SetUp(bottom_vecs_[layer_id], top_vecs_[layer_id]);
    }
    LOG_IF(INFO, Caffe::root_solver())
        << "Setting up " << layer_names_[layer_id];
    for (int top_id = 0; top_id < top_vecs_[layer_id].size(); ++top_id) {
      if (blob_loss_weights_.size() <= top_id_vecs_[layer_id][top_id]) {
        blob_loss_weights_.resize(top_id_vecs_[layer_id][top_id] + 1, Dtype(0));
      }
      blob_loss_weights_[top_id_vecs_[layer_id][top_id]] = layer->loss(top_id);
      LOG_IF(INFO, Caffe::root_solver())
          << "Top shape: " << top_vecs_[layer_id][top_id]->shape_string();
      if (layer->loss(top_id)) {
        LOG_IF(INFO, Caffe::root_solver())
            << "    with loss weight " << layer->loss(top_id);
      }
      memory_used_ += top_vecs_[layer_id][top_id]->count();
    }
    LOG_IF(INFO, Caffe::root_solver())
        << "Memory required for data: " << memory_used_ * sizeof(Dtype);
    const int param_size = layer_param.param_size();
    const int num_param_blobs = layers_[layer_id]->blobs().size();
    CHECK_LE(param_size, num_param_blobs)
        << "Too many params specified for layer " << layer_param.name();
    ParamSpec default_param_spec;
    for (int param_id = 0; param_id < num_param_blobs; ++param_id) {
      const ParamSpec* param_spec = (param_id < param_size) ?
          &layer_param.param(param_id) : &default_param_spec;
      const bool param_need_backward = param_spec->lr_mult() != 0;
      need_backward |= param_need_backward;
      layers_[layer_id]->set_param_propagate_down(param_id,
                                                  param_need_backward);
    }
    for (int param_id = 0; param_id < num_param_blobs; ++param_id) {
      AppendParam(param, layer_id, param_id);
    }
    // Finally, set the backward flag
    layer_need_backward_.push_back(need_backward);
    if (need_backward) {
      for (int top_id = 0; top_id < top_id_vecs_[layer_id].size(); ++top_id) {
        blob_need_backward_[top_id_vecs_[layer_id][top_id]] = true;
      }
    }
  }
  // Go through the net backwards to determine which blobs contribute to the
  // loss.  We can skip backward computation for blobs that don't contribute
  // to the loss.
  // Also checks if all bottom blobs don't need backward computation (possible
  // because of the skip_propagate_down param) and so we can skip backward
  // computation for the entire layer
  set<string> blobs_under_loss;
  set<string> blobs_skip_backp;
  for (int layer_id = layers_.size() - 1; layer_id >= 0; --layer_id) {
    bool layer_contributes_loss = false;
    bool layer_skip_propagate_down = true;
    for (int top_id = 0; top_id < top_vecs_[layer_id].size(); ++top_id) {
      const string& blob_name = blob_names_[top_id_vecs_[layer_id][top_id]];
      if (layers_[layer_id]->loss(top_id) ||
          (blobs_under_loss.find(blob_name) != blobs_under_loss.end())) {
        layer_contributes_loss = true;
      }
      if (blobs_skip_backp.find(blob_name) == blobs_skip_backp.end()) {
        layer_skip_propagate_down = false;
      }
      if (layer_contributes_loss && !layer_skip_propagate_down)
        break;
    }
    // If this layer can skip backward computation, also all his bottom blobs
    // don't need backpropagation
    if (layer_need_backward_[layer_id] && layer_skip_propagate_down) {
      layer_need_backward_[layer_id] = false;
      for (int bottom_id = 0; bottom_id < bottom_vecs_[layer_id].size();
               ++bottom_id) {
        bottom_need_backward_[layer_id][bottom_id] = false;
      }
    }
    if (!layer_contributes_loss) { layer_need_backward_[layer_id] = false; }
    if (Caffe::root_solver()) {
      if (layer_need_backward_[layer_id]) {
        LOG(INFO) << layer_names_[layer_id] << " needs backward computation.";
      } else {
        LOG(INFO) << layer_names_[layer_id]
            << " does not need backward computation.";
      }
    }
    for (int bottom_id = 0; bottom_id < bottom_vecs_[layer_id].size();
         ++bottom_id) {
      if (layer_contributes_loss) {
        const string& blob_name =
            blob_names_[bottom_id_vecs_[layer_id][bottom_id]];
        blobs_under_loss.insert(blob_name);
      } else {
        bottom_need_backward_[layer_id][bottom_id] = false;
      }
      if (!bottom_need_backward_[layer_id][bottom_id]) {
        const string& blob_name =
                   blob_names_[bottom_id_vecs_[layer_id][bottom_id]];
        blobs_skip_backp.insert(blob_name);
      }
    }
  }
  // Handle force_backward if needed.
  if (param.force_backward()) {
    for (int layer_id = 0; layer_id < layers_.size(); ++layer_id) {
      layer_need_backward_[layer_id] = true;
      for (int bottom_id = 0;
           bottom_id < bottom_need_backward_[layer_id].size(); ++bottom_id) {
        bottom_need_backward_[layer_id][bottom_id] =
            bottom_need_backward_[layer_id][bottom_id] ||
            layers_[layer_id]->AllowForceBackward(bottom_id);
        blob_need_backward_[bottom_id_vecs_[layer_id][bottom_id]] =
            blob_need_backward_[bottom_id_vecs_[layer_id][bottom_id]] ||
            bottom_need_backward_[layer_id][bottom_id];
      }
      for (int param_id = 0; param_id < layers_[layer_id]->blobs().size();
           ++param_id) {
        layers_[layer_id]->set_param_propagate_down(param_id, true);
      }
    }
  }
  // In the end, all remaining blobs are considered output blobs.
  for (set<string>::iterator it = available_blobs.begin();
      it != available_blobs.end(); ++it) {
    LOG_IF(INFO, Caffe::root_solver())
        << "This network produces output " << *it;
    net_output_blobs_.push_back(blobs_[blob_name_to_idx[*it]].get());
    net_output_blob_indices_.push_back(blob_name_to_idx[*it]);
  }
  for (size_t blob_id = 0; blob_id < blob_names_.size(); ++blob_id) {
    blob_names_index_[blob_names_[blob_id]] = blob_id;
  }
  for (size_t layer_id = 0; layer_id < layer_names_.size(); ++layer_id) {
    layer_names_index_[layer_names_[layer_id]] = layer_id;
  }
  ShareWeights();
  debug_info_ = param.debug_info();
  LOG_IF(INFO, Caffe::root_solver()) << "Network initialization done.";
}
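
Once Init() returns, the net is fully wired, and the accessors Classifier relies on simply hand out what was collected above, e.g.:

/* After Init(): net_input_blobs_ / net_output_blobs_ are directly accessible. */
shared_ptr<Net<float> > net(new Net<float>("deploy.prototxt", TEST));
Blob<float>* input  = net->input_blobs()[0];   /* collected in AppendTop() for "Input" layers */
Blob<float>* output = net->output_blobs()[0];  /* a blob left over in available_blobs */
LOG(INFO) << "input: " << input->shape_string()
          << " output: " << output->shape_string();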
net_->CopyTrainedLayersFrom(trained_file)

This function dispatches on the weights file type; for a binary .caffemodel it lands in CopyTrainedLayersFromBinaryProto().
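
The dispatch itself is short (sketched here after BVLC Caffe, where H5Fis_hdf5() tells HDF5 snapshots apart from protobuf ones):

template <typename Dtype>
void Net<Dtype>::CopyTrainedLayersFrom(const string trained_filename) {
  /* HDF5 snapshots go one way, binary .caffemodel files the other. */
  if (H5Fis_hdf5(trained_filename.c_str())) {
    CopyTrainedLayersFromHDF5(trained_filename);
  } else {
    CopyTrainedLayersFromBinaryProto(trained_filename);
  }
}

CopyTrainedLayersFromBinaryProto() then simply parses the weights file into a NetParameter and hands it on: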

template <typename Dtype>
void Net<Dtype>::CopyTrainedLayersFromBinaryProto(
    const string trained_filename) {
  NetParameter param;
  ReadNetParamsFromBinaryFileOrDie(trained_filename, &param);
  CopyTrainedLayersFrom(param);
}

CopyTrainedLayersFrom() implements the actual parameter copy:

template <typename Dtype>
void Net<Dtype>::CopyTrainedLayersFrom(const NetParameter& param) {
  /* Number of layers in the source .caffemodel. Note that net_'s own layer count
     can exceed the layer count of deploy.prototxt, because InsertSplits() in
     net.cpp may add layers */
  int num_source_layers = param.layer_size();
  /* The for loop walks every layer of the .caffemodel and copies its parameters:
     each layer name from the .caffemodel is matched against the layer names of
     net_ (built from deploy.prototxt); on a match, that layer's parameters are copied over */
  for (int i = 0; i < num_source_layers; ++i) {
    /* source_layer is a layer from the .caffemodel */
    const LayerParameter& source_layer = param.layer(i);
    const string& source_layer_name = source_layer.name();
    int target_layer_id = 0;
    /* The .caffemodel was produced from train.prototxt during training, so layers
       that exist only for training have no counterpart in net_ and are skipped here */
    while (target_layer_id != layer_names_.size() &&
        layer_names_[target_layer_id] != source_layer_name) {
      ++target_layer_id;
    }
    if (target_layer_id == layer_names_.size()) {
      LOG(INFO) << "Ignoring source layer " << source_layer_name;
      continue;
    }
    LOG(INFO) << "Copying source layer " << source_layer_name;
    /* target_blobs are the parameter blobs of the matching layer in net_ (built from deploy.prototxt) */
    vector<shared_ptr<Blob<Dtype> > >& target_blobs =
        layers_[target_layer_id]->blobs();
    CHECK_EQ(target_blobs.size(), source_layer.blobs_size())
        << "Incompatible number of blobs for layer " << source_layer_name;
    /* Layers with parameters are copied in the for loop below; for parameter-free
       layers target_blobs.size() == 0 and the loop body never runs.
       For a conv layer the size is 2, i.e. two kinds of parameters, w and b:
       j = 0 copies w, j = 1 copies b */
    for (int j = 0; j < target_blobs.size(); ++j) {
      if (!target_blobs[j]->ShapeEquals(source_layer.blobs(j))) {
        Blob<Dtype> source_blob;
        const bool kReshape = true;
        source_blob.FromProto(source_layer.blobs(j), kReshape);
        LOG(FATAL) << "Cannot copy param " << j << " weights from layer '"
            << source_layer_name << "'; shape mismatch.  Source param shape is "
            << source_blob.shape_string() << "; target param shape is "
            << target_blobs[j]->shape_string() << ". "
            << "To learn this layer's parameters from scratch rather than "
            << "copying from a saved net, rename the layer.";
      }
      const bool kReshape = false;
      /* Copy the parameters one by one.
         For example, with 3 input channels, a 3*3 kernel and 64 output channels,
         the conv layer's w holds 3*3*3*64 = 1728 parameters, so for j = 0 the loop
         inside FromProto() runs 1728 times, assigning them one at a time; the conv
         layer also has 64 biases, so for j = 1 the loop inside FromProto() runs 64 times */
      target_blobs[j]->FromProto(source_layer.blobs(j), kReshape);
    }
  }
}
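
To close the loop on the inference path described in the Classifier class comments, here is a sketch of Predict(), WrapInputLayer() and Preprocess() in the spirit of the stock example:

std::vector<float> Classifier::Predict(const cv::Mat& img) {
  Blob<float>* input_layer = net_->input_blobs()[0];
  input_layer->Reshape(1, num_channels_,
                       input_geometry_.height, input_geometry_.width);
  /* Forward dimension change to all layers. */
  net_->Reshape();

  std::vector<cv::Mat> input_channels;
  WrapInputLayer(&input_channels);

  Preprocess(img, &input_channels);

  net_->Forward();

  /* Copy the output layer to a std::vector. */
  Blob<float>* output_layer = net_->output_blobs()[0];
  const float* begin = output_layer->cpu_data();
  const float* end = begin + output_layer->channels();
  return std::vector<float>(begin, end);
}

/* Wrap each channel of the input blob in a cv::Mat; writing into these
   cv::Mat objects writes straight into net_input_blobs_[0]. */
void Classifier::WrapInputLayer(std::vector<cv::Mat>* input_channels) {
  Blob<float>* input_layer = net_->input_blobs()[0];
  int width = input_layer->width();
  int height = input_layer->height();
  float* input_data = input_layer->mutable_cpu_data();
  for (int i = 0; i < input_layer->channels(); ++i) {
    cv::Mat channel(height, width, CV_32FC1, input_data);
    input_channels->push_back(channel);
    input_data += width * height;
  }
}

void Classifier::Preprocess(const cv::Mat& img,
                            std::vector<cv::Mat>* input_channels) {
  /* Convert the input image to the input format of the network. */
  cv::Mat sample;
  if (img.channels() == 3 && num_channels_ == 1)
    cv::cvtColor(img, sample, cv::COLOR_BGR2GRAY);
  else if (img.channels() == 4 && num_channels_ == 1)
    cv::cvtColor(img, sample, cv::COLOR_BGRA2GRAY);
  else if (img.channels() == 4 && num_channels_ == 3)
    cv::cvtColor(img, sample, cv::COLOR_BGRA2BGR);
  else if (img.channels() == 1 && num_channels_ == 3)
    cv::cvtColor(img, sample, cv::COLOR_GRAY2BGR);
  else
    sample = img;

  cv::Mat sample_resized;
  if (sample.size() != input_geometry_)
    cv::resize(sample, sample_resized, input_geometry_);
  else
    sample_resized = sample;

  cv::Mat sample_float;
  if (num_channels_ == 3)
    sample_resized.convertTo(sample_float, CV_32FC3);
  else
    sample_resized.convertTo(sample_float, CV_32FC1);

  cv::Mat sample_normalized;
  cv::subtract(sample_float, mean_, sample_normalized);

  /* This writes the planar BGR channels of sample_normalized directly into
     the input layer of the network, thanks to WrapInputLayer(). */
  cv::split(sample_normalized, *input_channels);
}

The final cv::split() is where WrapInputLayer()'s aliasing pays off: splitting into the wrapped channels fills net_input_blobs_[0] without any extra copy.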

