Original post: https://blog.csdn.net/wwww1244/article/details/81034045
Multi-label classification has many practical applications. For example, given an input image, predict a person's age, gender, and whether they wear glasses. In that case, the dataset's label file should have this format:
- 000001.jpg 22 1 0
- 000002.jpg 30 1 1
- 000003.jpg 44 0 1
- 000004.jpg 17 0 0
Assume the first number is the age, the second 0/1 means female/male, and the third 0/1 means without/with glasses.
Likewise, regression problems have many applications in CNNs. For example, in object detection (networks such as Faster R-CNN, SSD, and YOLO), the input is an image and the output includes the translation/scaling parameters of a bounding box (in practice it is a bit more complicated than this). In that case, the label file should have this format:
- 000001.jpg -0.01 -0.29 -0.05 0.02
- 000002.jpg 0.25 -0.04 0.02 -0.07
- 000003.jpg 0.23 0.05 -0.22 0.15
- 000004.jpg 0.20 -0.12 0.02 -0.20
The four numbers are the offsets of x, y, w, and h, respectively.
(These lines are actually the bounding-box regression portion extracted from MTCNN face-detection training data; the MTCNN training-data format will be covered in the next post.)
--------------------------------------------------------------------------------
Caffe supports single-label classification well, but its support for multi-label classification and regression is poor: build/tools/convert_imageset, the executable that generates lmdb datasets, only accepts a single integer label per image. Multi-label classification and regression clearly require the program to accept multiple floating-point labels, and that is the work to be done here.
Note that reading classification labels in as floats does not affect the classification result.
I want to keep using lmdb datasets, which means modifying some Caffe source code. The changes are somewhat involved, but they are essentially a one-time effort, so they are worth making.
Reference: the blog post 用 caffe 做回归 (上) ("Regression with Caffe, part 1"). That post solves the problem almost entirely; I have consolidated its modifications here. In total, the following files need to be changed:
- tools/convert_imageset.cpp: generates the lmdb dataset. Copy this file, rename the copy to tools/convert_imageset_multi.cpp, and edit the copy.
- include/caffe/util/io.hpp: declares the new function used by the tool above
- src/caffe/util/io.cpp: implements that function
- src/caffe/layers/data_layer.cpp: loads the lmdb dataset
- src/caffe/proto/caffe.proto: adds the label_num parameter
To be clear up front: I am using the BVLC version of Caffe (as of July 2018): https://github.com/BVLC/caffe
If your Caffe version differs, these files will likely not work as-is. The modified versions follow; //### marks the changed parts. The main idea is to turn the int label into a vector<float>:
1. tools/convert_imageset_multi.cpp:
```cpp
// This program converts a set of images to a lmdb/leveldb by storing them
// as Datum proto buffers.
// Usage:
//   convert_imageset [FLAGS] ROOTFOLDER/ LISTFILE DB_NAME
//
// where ROOTFOLDER is the root folder that holds all the images, and LISTFILE
// should be a list of files as well as their labels, in the format as
//   subfolder1/file1.JPEG 7
//   ....

#include <algorithm>
#include <fstream>  // NOLINT(readability/streams)
#include <string>
#include <utility>
#include <vector>

#include "boost/scoped_ptr.hpp"
#include "gflags/gflags.h"
#include "glog/logging.h"

#include "caffe/proto/caffe.pb.h"
#include "caffe/util/db.hpp"
#include "caffe/util/format.hpp"
#include "caffe/util/io.hpp"
#include "caffe/util/rng.hpp"

#include <iostream>  //###
#include <boost/tokenizer.hpp>  //###

using namespace caffe;  // NOLINT(build/namespaces)
using std::pair;
using boost::scoped_ptr;

DEFINE_bool(gray, false,
    "When this option is on, treat images as grayscale ones");
DEFINE_bool(shuffle, false,
    "Randomly shuffle the order of images and their labels");
DEFINE_string(backend, "lmdb",
    "The backend {lmdb, leveldb} for storing the result");
DEFINE_int32(resize_width, 0, "Width images are resized to");
DEFINE_int32(resize_height, 0, "Height images are resized to");
DEFINE_bool(check_size, false,
    "When this option is on, check that all the datum have the same size");
DEFINE_bool(encoded, false,
    "When this option is on, the encoded image will be save in datum");
DEFINE_string(encode_type, "",
    "Optional: What type should we encode the image as ('png','jpg',...).");

int main(int argc, char** argv) {
#ifdef USE_OPENCV
  ::google::InitGoogleLogging(argv[0]);
  // Print output to stderr (while still logging)
  FLAGS_alsologtostderr = 1;

#ifndef GFLAGS_GFLAGS_H_
  namespace gflags = google;
#endif

  gflags::SetUsageMessage("Convert a set of images to the leveldb/lmdb\n"
        "format used as input for Caffe.\n"
        "Usage:\n"
        "    convert_imageset [FLAGS] ROOTFOLDER/ LISTFILE DB_NAME\n"
        "The ImageNet dataset for the training demo is at\n"
        "    http://www.image-net.org/download-images\n");
  gflags::ParseCommandLineFlags(&argc, &argv, true);

  if (argc < 4) {
    gflags::ShowUsageWithFlagsRestrict(argv[0], "tools/convert_imageset");
    return 1;
  }

  const bool is_color = !FLAGS_gray;
  const bool check_size = FLAGS_check_size;
  const bool encoded = FLAGS_encoded;
  const string encode_type = FLAGS_encode_type;

  std::ifstream infile(argv[2]);
  // std::vector<std::pair<std::string, int> > lines;
  std::vector<std::pair<std::string, std::vector<float> > > lines;  //### int -> vector<float>
  std::string line;
  size_t pos;
  // int label; //###
  std::vector<float> labels;  //###
  while (std::getline(infile, line)) {
    //###
    // pos = line.find_last_of(' ');
    // label = atoi(line.substr(pos + 1).c_str());
    // lines.push_back(std::make_pair(line.substr(0, pos), label));

    //###
    std::vector<std::string> tokens;
    boost::char_separator<char> sep(" ");
    boost::tokenizer<boost::char_separator<char> > tok(line, sep);
    tokens.clear();
    std::copy(tok.begin(), tok.end(), std::back_inserter(tokens));

    for (int i = 1; i < tokens.size(); ++i)
      labels.push_back(atof(tokens.at(i).c_str()));
    lines.push_back(std::make_pair(tokens.at(0), labels));
    labels.clear();
  }
  if (FLAGS_shuffle) {
    // randomly shuffle data
    LOG(INFO) << "Shuffling data";
    shuffle(lines.begin(), lines.end());
  }
  LOG(INFO) << "A total of " << lines.size() << " images.";

  if (encode_type.size() && !encoded)
    LOG(INFO) << "encode_type specified, assuming encoded=true.";

  int resize_height = std::max<int>(0, FLAGS_resize_height);
  int resize_width = std::max<int>(0, FLAGS_resize_width);

  // Create new DB
  scoped_ptr<db::DB> db(db::GetDB(FLAGS_backend));
  db->Open(argv[3], db::NEW);
  scoped_ptr<db::Transaction> txn(db->NewTransaction());

  // Storing to db
  std::string root_folder(argv[1]);
  Datum datum;
  int count = 0;
  int data_size = 0;
  bool data_size_initialized = false;

  for (int line_id = 0; line_id < lines.size(); ++line_id) {
    bool status;

    std::string enc = encode_type;
    if (encoded && !enc.size()) {
      // Guess the encoding type from the file name
      string fn = lines[line_id].first;
      size_t p = fn.rfind('.');
      if ( p == fn.npos )
        LOG(WARNING) << "Failed to guess the encoding of '" << fn << "'";
      enc = fn.substr(p+1);
      std::transform(enc.begin(), enc.end(), enc.begin(), ::tolower);
    }
    status = ReadImageToDatum(root_folder + lines[line_id].first,  //### unchanged, but this now calls the new overload
        lines[line_id].second, resize_height, resize_width, is_color,
        enc, &datum);
    if (status == false) continue;
    if (check_size) {
      if (!data_size_initialized) {
        data_size = datum.channels() * datum.height() * datum.width();
        data_size_initialized = true;
      } else {
        const std::string& data = datum.data();
        CHECK_EQ(data.size(), data_size) << "Incorrect data field size "
            << data.size();
      }
    }
    // sequential
    string key_str = caffe::format_int(line_id, 8) + "_" + lines[line_id].first;

    // Put in db
    string out;
    CHECK(datum.SerializeToString(&out));
    txn->Put(key_str, out);

    if (++count % 1000 == 0) {
      // Commit db
      txn->Commit();
      txn.reset(db->NewTransaction());
      LOG(INFO) << "Processed " << count << " files.";
    }
  }
  // write the last batch
  if (count % 1000 != 0) {
    txn->Commit();
    LOG(INFO) << "Processed " << count << " files.";
  }
#else
  LOG(FATAL) << "This tool requires OpenCV; compile with USE_OPENCV.";
#endif  // USE_OPENCV
  return 0;
}
```
2. include/caffe/util/io.hpp (excerpt):
Find the ReadImageToDatum function and add a vector<float> overload below it (do not delete the original function):
```cpp
bool ReadImageToDatum(const string& filename, const int label,
    const int height, const int width, const bool is_color,
    const std::string & encoding, Datum* datum);

bool ReadImageToDatum(const string& filename, const vector<float> labels,  //###
    const int height, const int width, const bool is_color,
    const std::string & encoding, Datum* datum);
```
3. src/caffe/util/io.cpp (excerpt):
Add the implementation of the overload declared above. (The commented-out encoding block deals with image-encoding formats; since encoding defaults to '', it can simply be commented out.)
```cpp
bool ReadImageToDatum(const string& filename, const int label,
    const int height, const int width, const bool is_color,
    const std::string & encoding, Datum* datum) {
  cv::Mat cv_img = ReadImageToCVMat(filename, height, width, is_color);
  if (cv_img.data) {
    if (encoding.size()) {
      if ( (cv_img.channels() == 3) == is_color && !height && !width &&
          matchExt(filename, encoding) )
        return ReadFileToDatum(filename, label, datum);
      std::vector<uchar> buf;
      cv::imencode("."+encoding, cv_img, buf);
      datum->set_data(std::string(reinterpret_cast<char*>(&buf[0]),
                      buf.size()));
      datum->set_label(label);
      datum->set_encoded(true);
      return true;
    }
    CVMatToDatum(cv_img, datum);
    datum->set_label(label);
    return true;
  } else {
    return false;
  }
}

bool ReadImageToDatum(const string& filename, const vector<float> labels,
    const int height, const int width, const bool is_color,
    const std::string & encoding, Datum* datum) {
  cv::Mat cv_img = ReadImageToCVMat(filename, height, width, is_color);
  if (cv_img.data) {
    // if (encoding.size()) {
    //   if ( (cv_img.channels() == 3) == is_color && !height && !width &&
    //       matchExt(filename, encoding) )
    //     return ReadFileToDatum(filename, label, datum);
    //   std::vector<uchar> buf;
    //   cv::imencode("."+encoding, cv_img, buf);
    //   datum->set_data(std::string(reinterpret_cast<char*>(&buf[0]),
    //                   buf.size()));
    //   datum->set_label(label);
    //   datum->set_encoded(true);
    //   return true;
    // }
    CVMatToDatum(cv_img, datum);
    // datum->set_label(label);

    //###
    for (int i = 0; i < labels.size(); ++i)
      datum->add_float_data(labels.at(i));

    return true;
  } else {
    return false;
  }
}
```
4. src/caffe/layers/data_layer.cpp:
```cpp
#ifdef USE_OPENCV
#include <opencv2/core/core.hpp>
#endif  // USE_OPENCV
#include <stdint.h>

#include <vector>

#include "caffe/data_transformer.hpp"
#include "caffe/layers/data_layer.hpp"
#include "caffe/util/benchmark.hpp"

namespace caffe {

template <typename Dtype>
DataLayer<Dtype>::DataLayer(const LayerParameter& param)
  : BasePrefetchingDataLayer<Dtype>(param),
    offset_() {
  db_.reset(db::GetDB(param.data_param().backend()));
  db_->Open(param.data_param().source(), db::READ);
  cursor_.reset(db_->NewCursor());
}

template <typename Dtype>
DataLayer<Dtype>::~DataLayer() {
  this->StopInternalThread();
}

template <typename Dtype>
void DataLayer<Dtype>::DataLayerSetUp(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
  const int batch_size = this->layer_param_.data_param().batch_size();
  // Read a data point, and use it to initialize the top blob.
  Datum datum;
  datum.ParseFromString(cursor_->value());

  // Use data_transformer to infer the expected blob shape from datum.
  vector<int> top_shape = this->data_transformer_->InferBlobShape(datum);
  this->transformed_data_.Reshape(top_shape);
  // Reshape top[0] and prefetch_data according to the batch_size.
  top_shape[0] = batch_size;
  top[0]->Reshape(top_shape);
  for (int i = 0; i < this->prefetch_.size(); ++i) {
    this->prefetch_[i]->data_.Reshape(top_shape);
  }
  LOG_IF(INFO, Caffe::root_solver())
      << "output data size: " << top[0]->num() << ","
      << top[0]->channels() << "," << top[0]->height() << ","
      << top[0]->width();

  //###
  // label
  // if (this->output_labels_) {
  //   vector<int> label_shape(1, batch_size);
  //   top[1]->Reshape(label_shape);
  //   for (int i = 0; i < this->prefetch_.size(); ++i) {
  //     this->prefetch_[i]->label_.Reshape(label_shape);
  //   }
  // }

  //###
  const int label_num = this->layer_param_.data_param().label_num();
  if (this->output_labels_) {
    vector<int> label_shape;
    label_shape.push_back(batch_size);
    label_shape.push_back(label_num);
    label_shape.push_back(1);
    label_shape.push_back(1);
    top[1]->Reshape(label_shape);
    for (int i = 0; i < this->prefetch_.size(); ++i) {
      this->prefetch_[i]->label_.Reshape(label_shape);
    }
  }
}

template <typename Dtype>
bool DataLayer<Dtype>::Skip() {
  int size = Caffe::solver_count();
  int rank = Caffe::solver_rank();
  bool keep = (offset_ % size) == rank ||
              // In test mode, only rank 0 runs, so avoid skipping
              this->layer_param_.phase() == TEST;
  return !keep;
}

template<typename Dtype>
void DataLayer<Dtype>::Next() {
  cursor_->Next();
  if (!cursor_->valid()) {
    LOG_IF(INFO, Caffe::root_solver())
        << "Restarting data prefetching from start.";
    cursor_->SeekToFirst();
  }
  offset_++;
}

// This function is called on prefetch thread
template<typename Dtype>
void DataLayer<Dtype>::load_batch(Batch<Dtype>* batch) {
  CPUTimer batch_timer;
  batch_timer.Start();
  double read_time = 0;
  double trans_time = 0;
  CPUTimer timer;
  CHECK(batch->data_.count());
  CHECK(this->transformed_data_.count());
  const int batch_size = this->layer_param_.data_param().batch_size();

  Datum datum;
  for (int item_id = 0; item_id < batch_size; ++item_id) {
    timer.Start();
    while (Skip()) {
      Next();
    }
    datum.ParseFromString(cursor_->value());
    read_time += timer.MicroSeconds();

    if (item_id == 0) {
      // Reshape according to the first datum of each batch
      // on single input batches allows for inputs of varying dimension.
      // Use data_transformer to infer the expected blob shape from datum.
      vector<int> top_shape = this->data_transformer_->InferBlobShape(datum);
      this->transformed_data_.Reshape(top_shape);
      // Reshape batch according to the batch_size.
      top_shape[0] = batch_size;
      batch->data_.Reshape(top_shape);
    }

    // Apply data transformations (mirror, scale, crop...)
    timer.Start();
    int offset = batch->data_.offset(item_id);
    Dtype* top_data = batch->data_.mutable_cpu_data();
    this->transformed_data_.set_cpu_data(top_data + offset);
    this->data_transformer_->Transform(datum, &(this->transformed_data_));

    //###
    // Copy label.
    // if (this->output_labels_) {
    //   Dtype* top_label = batch->label_.mutable_cpu_data();
    //   top_label[item_id] = datum.label();
    // }

    //###
    const int label_num = this->layer_param_.data_param().label_num();
    if (this->output_labels_) {
      Dtype* top_label = batch->label_.mutable_cpu_data();
      for (int i = 0; i < label_num; i++){
        top_label[item_id * label_num + i] = datum.float_data(i);  // read float labels
      }
    }

    trans_time += timer.MicroSeconds();
    Next();
  }
  timer.Stop();
  batch_timer.Stop();
  DLOG(INFO) << "Prefetch batch: " << batch_timer.MilliSeconds() << " ms.";
  DLOG(INFO) << "     Read time: " << read_time / 1000 << " ms.";
  DLOG(INFO) << "Transform time: " << trans_time / 1000 << " ms.";
}

INSTANTIATE_CLASS(DataLayer);
REGISTER_LAYER_CLASS(Data);

}  // namespace caffe
```
5. src/caffe/proto/caffe.proto (excerpt):
Add a label_num field to DataParameter:
```protobuf
message DataParameter {
  enum DB {
    LEVELDB = 0;
    LMDB = 1;
  }
  // Specify the data source.
  optional string source = 1;
  // Specify the batch size.
  optional uint32 batch_size = 4;
  // The rand_skip variable is for the data layer to skip a few data points
  // to avoid all asynchronous sgd clients to start at the same point. The skip
  // point would be set as rand_skip * rand(0,1). Note that rand_skip should not
  // be larger than the number of keys in the database.
  // DEPRECATED. Each solver accesses a different subset of the database.
  optional uint32 rand_skip = 7 [default = 0];
  optional DB backend = 8 [default = LEVELDB];
  // DEPRECATED. See TransformationParameter. For data pre-processing, we can do
  // simple scaling and subtracting the data mean, if provided. Note that the
  // mean subtraction is always carried out before scaling.
  optional float scale = 2 [default = 1];
  optional string mean_file = 3;
  // DEPRECATED. See TransformationParameter. Specify if we would like to randomly
  // crop an image.
  optional uint32 crop_size = 5 [default = 0];
  // DEPRECATED. See TransformationParameter. Specify if we want to randomly mirror
  // data.
  optional bool mirror = 6 [default = false];
  // Force the encoded image to have 3 color channels
  optional bool force_encoded_color = 9 [default = false];
  // Prefetch queue (Increase if data feeding bandwidth varies, within the
  // limit of device memory for GPU training)
  optional uint32 prefetch = 10 [default = 4];
  //### For multi-task training
  optional uint32 label_num = 11 [default = 1];
}
```
That completes the modifications. Rebuild Caffe and a new executable, build/tools/convert_imageset_multi, will be produced; it can read multi-label list files of the kind shown at the beginning.
How to invoke it will be covered in the next post.
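For reference, here is a rough sketch of how the new label_num parameter could appear in a training prototxt's Data layer (the layer name, lmdb path, and batch size are placeholders, not values from this post):

```prototxt
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  data_param {
    source: "train_multilabel_lmdb"  # placeholder path to the lmdb
    batch_size: 64                   # placeholder batch size
    backend: LMDB
    label_num: 3                     # number of labels per image, e.g. age/gender/glasses
  }
}
```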