Implementation details of data_reader, internalthread, and blocking_queue

hqtgyj
Copyright notice: this is the author's original article and may not be reposted without the author's permission. https://blog.csdn.net/xizero00/article/details/50901204
 
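(1) data_reader.cpp

First, a brief introduction to boost::weak_ptr.

Weak references exist to solve the problem that memory is never released when shared_ptr objects reference each other in a cycle.

A weak reference does not keep the object it refers to alive; it is only a reference to the object for as long as that object exists. It does not modify the object's reference count, which means it takes no part in managing the object's memory. Functionally it resembles a raw pointer, with one important difference: a weak reference can detect whether the object has already been released, which avoids access to invalid memory.

Because a weak reference leaves the reference count unchanged, just like a raw pointer, making one side of a reference cycle weak is enough to break the cycle.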
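The behaviour described above can be tried out with a small sketch. It uses std::shared_ptr and std::weak_ptr from the C++11 standard library rather than the Boost versions (an assumption made to keep the example dependency-free; the two behave alike for this purpose). Parent, Child, and parent_is_released are hypothetical names for illustration and are not part of Caffe.

```cpp
#include <cassert>
#include <memory>

// Hypothetical types for illustration only; not part of Caffe.
struct Child;

struct Parent {
  std::shared_ptr<Child> child;  // owning reference to the child
};

struct Child {
  std::weak_ptr<Parent> parent;  // non-owning back reference breaks the cycle
};

// Returns true if the Parent is destroyed as soon as the last external
// shared_ptr to it goes away, i.e. if the cycle did not leak.
bool parent_is_released() {
  std::weak_ptr<Parent> observer;
  {
    std::shared_ptr<Parent> parent = std::make_shared<Parent>();
    parent->child = std::make_shared<Child>();
    parent->child->parent = parent;      // weak: the use count is unchanged
    observer = parent;                   // weak: the use count is unchanged
    assert(parent.use_count() == 1);     // `parent` is the only owner
    assert(observer.lock() != nullptr);  // lock() succeeds while the object lives
  }
  // The scope has ended. A weak reference can tell that the object is gone.
  return observer.expired() && observer.lock() == nullptr;
}
```

If Child::parent were a shared_ptr instead, Parent and Child would keep each other alive and the function would return false.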
1) Member variables of the DataReader class:

    shared_ptr<Body> body_;
    static map<const string, boost::weak_ptr<DataReader::Body> > bodies_;
    const shared_ptr<QueuePair> queue_pair_;
      
2) The constructor:

    explicit DataReader(const LayerParameter& param);

and the inline accessors:

    inline BlockingQueue<Datum*>& free() const {
      return queue_pair_->free_;
    }
    inline BlockingQueue<Datum*>& full() const {
      return queue_pair_->full_;
    }
      
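3) In addition:

The class defines a nested Body class, which inherits from InternalThread.

It also defines a nested QueuePair class, which holds the free_ and full_ queues used to share data between the body and its readers.

(2) DataReader also relies on another class, BlockingQueue, declared in caffe/util/blocking_queue.hpp.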
  
1) BlockingQueue has the following member functions:

    void push(const T& t);
    bool try_pop(T* t);
    T pop(const string& log_on_wait = "");
    bool try_peek(T* t);
    T peek();
      
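2) The class also contains a nested sync class that bundles the mutual-exclusion and synchronization primitives. It is defined as follows:

    template<typename T>
    class BlockingQueue<T>::sync {
     public:
      mutable boost::mutex mutex_;
      boost::condition_variable condition_;
    };

It holds a mutex, mutex_, and a condition variable, condition_.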
 
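3) The member variables are:

    std::queue<T> queue_;
    shared_ptr<sync> sync_;

BlockingQueue::push is implemented as follows:

    template<typename T>
    void BlockingQueue<T>::push(const T& t) {
      boost::mutex::scoped_lock lock(sync_->mutex_);  // locks are explained in detail below
      queue_.push(t);
      lock.unlock();
      sync_->condition_.notify_one();
    }

It first acquires the lock, pushes the element onto the queue (queue_ is a std::queue<T>), releases the lock, and then notifies one thread waiting on the condition variable.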
 
BlockingQueue::try_pop is implemented as follows:

    template<typename T>
    bool BlockingQueue<T>::try_pop(T* t) {
      boost::mutex::scoped_lock lock(sync_->mutex_);
      if (queue_.empty()) {
        return false;  // never blocks: an empty queue is reported as failure
      }
      *t = queue_.front();
      queue_.pop();
      return true;
    }
      
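A short digression on mutex locks:

The code above uses scoped_lock, which is declared as

    typedef unique_lock<mutex> scoped_lock;

So scoped_lock is a unique_lock<mutex>. The documentation describes unique_lock like this:

std::unique_lock<std::mutex> is the tool of choice when your locking needs are more complex than a simple lock at the beginning followed unconditionally by an unlock at the end.

In other words, the simple case is to lock at the start of a scope and unconditionally unlock at the end; once your needs go beyond that, unique_lock (and therefore scoped_lock) is the tool to use.

Reference:

http://web.archive.org/web/20140531071228/http://home.roadrunner.com/~hinnant/mutexes/locking.html

To see why this kind of lock is necessary, consider the following example: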
 
    class A
    {
        mutable std::mutex  mut_;
        std::vector<double> data_;

    public:
        // ...
        A& operator=(const A& rhs)
        {
            if (this != &rhs)
            {
                std::unique_lock<std::mutex> lhs_lock(mut_);
                std::unique_lock<std::mutex> rhs_lock(rhs.mut_);  // may deadlock
                // assign data ...
                data_ = rhs.data_;
            }
            return *this;
        }
        // ...
    };
      
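Suppose there are two objects of type A, a1 and a2. Thread 1 executes:

    a1 = a2;

while, at the same moment, thread 2 executes:

    a2 = a1;

Thread 1 locks a1.mut_ and then waits for a2.mut_, while thread 2 has locked a2.mut_ and is waiting for a1.mut_. Neither thread can make progress: a deadlock.

Fortunately there is a fix. The code can be rewritten as: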
  
    class A
    {
        mutable std::mutex  mut_;
        std::vector<double> data_;

    public:
        // ...
        A& operator=(const A& rhs)
        {
            if (this != &rhs)
            {
                // std::defer_lock is a value of the empty tag type defer_lock_t.
                // Passing it to the unique_lock constructor associates the
                // mutex with the lock without locking it.
                std::unique_lock<std::mutex> lhs_lock(mut_, std::defer_lock);
                std::unique_lock<std::mutex> rhs_lock(rhs.mut_, std::defer_lock);
                std::lock(lhs_lock, rhs_lock);
                // assign data ...
                data_ = rhs.data_;
            }
            return *this;
        }
        // ...
    };
       
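Locking both mutexes at once with std::lock prevents the deadlock.

Why does the new code avoid the problem?

a) lhs_lock and rhs_lock do not lock anything when they are constructed: because of the std::defer_lock argument, each unique_lock refers to its mutex without owning it yet.

b) std::lock(lhs_lock, rhs_lock); locks both mutexes together without deadlocking; that is exactly what it is for.

c) lock_guard cannot be used here, because a lock_guard has no state in which it refers to a mutex without owning it; an attempt to construct a lock_guard with std::defer_lock does not compile.

Summary: when two mutexes have to be held at the same time, first construct both locks in the unlocked state and then lock them together. The broken code locks one mutex first and then the other.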
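A runnable version of the fixed class, exercised by two threads that assign in opposite directions, might look like the sketch below. The constructor, snapshot(), and cross_assign_finishes() are additions for the demonstration and are not part of the referenced example.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

class A {
  mutable std::mutex mut_;
  std::vector<double> data_;

 public:
  // Constructor added for the demonstration.
  explicit A(std::vector<double> data) : data_(std::move(data)) {}

  A& operator=(const A& rhs) {
    if (this != &rhs) {
      // Associate both mutexes without locking, then lock them together.
      std::unique_lock<std::mutex> lhs_lock(mut_, std::defer_lock);
      std::unique_lock<std::mutex> rhs_lock(rhs.mut_, std::defer_lock);
      std::lock(lhs_lock, rhs_lock);
      data_ = rhs.data_;
    }
    return *this;
  }

  // Helper added for the demonstration: returns a copy of the data.
  std::vector<double> snapshot() const {
    std::lock_guard<std::mutex> lock(mut_);
    return data_;
  }
};

// Two threads repeatedly assign in opposite directions. With std::lock both
// loops run to completion; every assignment leaves a1 and a2 equal, so they
// are equal once both threads have been joined.
bool cross_assign_finishes() {
  A a1(std::vector<double>{1.0, 2.0});
  A a2(std::vector<double>{3.0});
  std::thread t1([&] { for (int i = 0; i < 10000; ++i) a1 = a2; });
  std::thread t2([&] { for (int i = 0; i < 10000; ++i) a2 = a1; });
  t1.join();
  t2.join();
  return a1.snapshot() == a2.snapshot();
}
```

With the first version of operator= (two plain unique_lock constructions), the same test can hang, because each thread may end up holding one mutex while waiting for the other.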
   
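Another digression, this time on condition variables:

A condition variable is a mechanism that lets one thread wait for a notification from another thread that some condition has become true. The usual pattern is: a thread locks a mutex and calls wait; each time it wakes up, it checks whether the condition it is waiting for (a predicate guarded by the mutex) holds. If it does, the thread proceeds; otherwise it goes back to sleep.

The following example illustrates this: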
   
    boost::condition_variable cond;
    boost::mutex mut;
    bool data_ready;

    void process_data();

    void wait_for_data_to_process() {
      boost::unique_lock<boost::mutex> lock(mut);
      while (!data_ready) {  // lock protects data_ready
        cond.wait(lock);
      }
      process_data();
    }
        
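The code first constructs a lock. Note that it is a unique_lock bound to the mutex, so access to data_ready is mutually exclusive even if there are several processing threads. It then calls wait on the condition variable, which releases the lock and puts the thread to sleep; when the thread is woken up, wait reacquires the lock before returning.

Meanwhile, another thread prepares the data: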
   
    void retrieve_data();
    void prepare_data();

    void prepare_data_for_processing() {
      retrieve_data();
      prepare_data();
      {
        boost::lock_guard<boost::mutex> lock(mut);
        data_ready = true;  // lock protects data_ready
      }
      cond.notify_one();
    }
        
Once the preparing thread has finished getting the data ready, it sends the notification, and the waiting thread wakes up and gets to work.
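The two halves of the example can be combined into one self-contained sketch. It uses the std:: equivalents of the Boost types (an assumption made so that it builds without Boost); the variable data and the function wait_for_value are hypothetical names added for the demonstration.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Starts a waiting thread, publishes a value from the calling thread, and
// returns the value that the waiting thread observed.
int wait_for_value() {
  std::mutex mut;
  std::condition_variable cond;
  bool data_ready = false;  // the condition, protected by mut
  int data = 0;             // the payload, protected by mut
  int seen = -1;

  std::thread consumer([&] {
    std::unique_lock<std::mutex> lock(mut);
    while (!data_ready) {  // re-check after every wake-up
      cond.wait(lock);     // releases the lock while sleeping
    }
    seen = data;
  });

  {
    std::lock_guard<std::mutex> lock(mut);
    data = 42;
    data_ready = true;
  }
  cond.notify_one();

  consumer.join();  // after join(), reading `seen` is safe
  return seen;
}
```

The while loop matters in both orderings: if the value is published before the consumer reaches wait, the loop body is skipped entirely; if the consumer wakes up spuriously, it simply goes back to waiting.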
   
Now back to the BlockingQueue implementation.

BlockingQueue::pop is implemented as follows:

    template<typename T>
    T BlockingQueue<T>::pop(const string& log_on_wait) {
      boost::mutex::scoped_lock lock(sync_->mutex_);  // acquire the lock
      while (queue_.empty()) {
        if (!log_on_wait.empty()) {
          LOG_EVERY_N(INFO, 1000) << log_on_wait;
        }
        sync_->condition_.wait(lock);  // keep waiting for as long as the queue is empty
      }
      T t = queue_.front();  // otherwise take the front element
      queue_.pop();
      return t;
    }
      
BlockingQueue::try_peek is implemented as follows.

It checks whether the queue has a front element and, if so, copies it to *t without removing it:

    template<typename T>
    bool BlockingQueue<T>::try_peek(T* t) {
      boost::mutex::scoped_lock lock(sync_->mutex_);
      if (queue_.empty()) {
        return false;
      }
      *t = queue_.front();
      return true;
    }
      
BlockingQueue::peek is implemented as follows.

It returns the front element without removing it and, like pop, uses the condition variable to block until the queue is non-empty:

    template<typename T>
    T BlockingQueue<T>::peek() {
      boost::mutex::scoped_lock lock(sync_->mutex_);
      while (queue_.empty()) {
        sync_->condition_.wait(lock);
      }
      return queue_.front();
    }
      
BlockingQueue::size is implemented as follows:

    template<typename T>
    size_t BlockingQueue<T>::size() const {
      boost::mutex::scoped_lock lock(sync_->mutex_);
      return queue_.size();
    }
      
Finally, the file explicitly instantiates the BlockingQueue template for the element types that Caffe uses:

    template class BlockingQueue<Batch<float>*>;
    template class BlockingQueue<Batch<double>*>;
    template class BlockingQueue<Datum*>;
    template class BlockingQueue<shared_ptr<DataReader::QueuePair> >;
    template class BlockingQueue<P2PSync<float>*>;
    template class BlockingQueue<P2PSync<double>*>;
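To experiment with these semantics outside Caffe, the following is a minimal sketch of a queue with the same push / try_pop / pop / size behaviour. MiniBlockingQueue and sum_through_queue are hypothetical names; the sketch uses std::mutex and std::condition_variable instead of the Boost types and leaves out the logging and the peek functions.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical, simplified stand-in for caffe::BlockingQueue.
template <typename T>
class MiniBlockingQueue {
 public:
  void push(const T& t) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      queue_.push(t);
    }  // unlock before notifying, as the Caffe version does
    condition_.notify_one();
  }

  // Non-blocking: returns false when the queue is empty.
  bool try_pop(T* t) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (queue_.empty()) return false;
    *t = queue_.front();
    queue_.pop();
    return true;
  }

  // Blocking: sleeps until an element is available.
  T pop() {
    std::unique_lock<std::mutex> lock(mutex_);
    while (queue_.empty()) condition_.wait(lock);
    T t = queue_.front();
    queue_.pop();
    return t;
  }

  std::size_t size() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return queue_.size();
  }

 private:
  mutable std::mutex mutex_;
  std::condition_variable condition_;
  std::queue<T> queue_;
};

// A producer thread pushes 1..n while the caller pops n values and sums them.
int sum_through_queue(int n) {
  MiniBlockingQueue<int> q;
  std::thread producer([&] { for (int i = 1; i <= n; ++i) q.push(i); });
  int sum = 0;
  for (int i = 0; i < n; ++i) sum += q.pop();  // blocks whenever q is empty
  producer.join();
  return sum;
}
```

The consumer never needs to poll: pop sleeps on the condition variable until the producer's push wakes it up.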
      
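With BlockingQueue covered, we move on to the QueuePair class nested in DataReader.

Here is its definition:

    class QueuePair {
     public:
      explicit QueuePair(int size);
      ~QueuePair();

      BlockingQueue<Datum*> free_;
      BlockingQueue<Datum*> full_;

    DISABLE_COPY_AND_ASSIGN(QueuePair);
    };

The definition declares two blocking queues, free_ and full_. Having just analyzed BlockingQueue, this should no longer be confusing.

Now for the implementation.

What does the constructor do?

It creates size instances of Datum (the definition of this data structure is given at the end of this article) and pushes them onto free_.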
 
    DataReader::QueuePair::QueuePair(int size) {
      // Initialize the free queue with requested number of datums
      for (int i = 0; i < size; ++i) {
        free_.push(new Datum());
      }
    }
      
What does the destructor do?

It deletes every Datum object remaining in the free_ and full_ queues.

    DataReader::QueuePair::~QueuePair() {
      Datum* datum;
      while (free_.try_pop(&datum)) {
        delete datum;
      }
      while (full_.try_pop(&datum)) {
        delete datum;
      }
    }
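The point of keeping two queues is that buffers are recycled instead of being allocated per item: the reader takes an empty buffer from free_, fills it, and pushes it onto full_; the consumer does the reverse. The sketch below illustrates that hand-off under simplifying assumptions: TinyQueue, Buffer, and stream_sum are hypothetical names, Buffer stands in for Datum, and the buffers live in a std::vector instead of being allocated with new and deleted in a destructor as Caffe does.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical minimal blocking queue (push and blocking pop only).
template <typename T>
class TinyQueue {
 public:
  void push(const T& t) {
    {
      std::lock_guard<std::mutex> lock(m_);
      q_.push(t);
    }
    cv_.notify_one();
  }
  T pop() {
    std::unique_lock<std::mutex> lock(m_);
    cv_.wait(lock, [this] { return !q_.empty(); });
    T t = q_.front();
    q_.pop();
    return t;
  }

 private:
  std::mutex m_;
  std::condition_variable cv_;
  std::queue<T> q_;
};

struct Buffer { int value = 0; };  // stands in for Datum

// Streams the values 1..items through pool_size reusable buffers and
// returns their sum as seen by the consumer.
long stream_sum(int pool_size, int items) {
  TinyQueue<Buffer*> free_q, full_q;
  std::vector<Buffer> pool(pool_size);
  for (Buffer& b : pool) free_q.push(&b);  // initially every buffer is free

  std::thread reader([&] {
    for (int i = 1; i <= items; ++i) {
      Buffer* b = free_q.pop();  // blocks if the consumer has fallen behind
      b->value = i;              // "read" one record into the buffer
      full_q.push(b);
    }
  });

  long sum = 0;
  for (int i = 0; i < items; ++i) {
    Buffer* b = full_q.pop();  // blocks until the reader has produced data
    sum += b->value;
    free_q.push(b);            // hand the buffer back for reuse
  }
  reader.join();
  return sum;
}
```

The number of buffers bounds how far the reader can run ahead of the consumer, which is why DataReader sizes its QueuePair from the prefetch and batch-size parameters.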
      
Next, the Body class, which inherits from InternalThread:

    class Body : public InternalThread {
     public:
      explicit Body(const LayerParameter& param);
      virtual ~Body();

     protected:
      void InternalThreadEntry();
      void read_one(db::Cursor* cursor, QueuePair* qp);

      const LayerParameter param_;
      BlockingQueue<shared_ptr<QueuePair> > new_queue_pairs_;

      friend class DataReader;

    DISABLE_COPY_AND_ASSIGN(Body);
    };

Body overrides InternalThread's InternalThreadEntry function and adds a read_one function.

It also declares DataReader as a friend and holds a BlockingQueue<shared_ptr<QueuePair> > named new_queue_pairs_.
 
To understand what Body actually does, we need to know what the InternalThread class does.

InternalThread is essentially a wrapper around boost::thread.

Here is the class definition:
 
    namespace caffe {

    class InternalThread {
     public:
      // Constructor and destructor
      InternalThread() : thread_() {}
      virtual ~InternalThread();

      /**
       * Caffe's thread local state will be initialized using the current
       * thread values, e.g. device id, solver index etc. The random seed
       * is initialized using caffe_rng_rand.
       */
      void StartInternalThread();

      /** Will not return until the internal thread has exited. */
      void StopInternalThread();

      // Whether the thread has been started
      bool is_started() const;

     protected:
      /* Implement this method in your subclass
         with the code you want your thread to run. */
      // A virtual function with an empty default body; subclasses override it
      virtual void InternalThreadEntry() {}

      /* Should be tested when running loops to exit when requested. */
      bool must_stop();

     private:
      void entry(int device, Caffe::Brew mode, int rand_seed, int solver_count,
          bool root_solver);

      // The only data member
      shared_ptr<boost::thread> thread_;
    };

    }  // namespace caffe
      
Having read the annotated class definition, let's look at the implementation:
      
    namespace caffe {

    // Destructor: stops the internal thread
    InternalThread::~InternalThread() {
      StopInternalThread();
    }

    // Whether the thread has been started
    bool InternalThread::is_started() const {
      // thread_ must be non-null and the thread must be joinable
      return thread_ && thread_->joinable();
    }

    bool InternalThread::must_stop() {
      // interruption_requested() returns true if interruption has been
      // requested for the thread, false otherwise (see the boost documentation)
      return thread_ && thread_->interruption_requested();
    }

    // Captures the calling thread's state, then launches the internal thread
    void InternalThread::StartInternalThread() {
      CHECK(!is_started()) << "Threads should persist and not be restarted.";

      int device = 0;
    #ifndef CPU_ONLY
      CUDA_CHECK(cudaGetDevice(&device));
    #endif
      Caffe::Brew mode = Caffe::mode();
      int rand_seed = caffe_rng_rand();
      int solver_count = Caffe::solver_count();
      bool root_solver = Caffe::root_solver();

      try {
        // Create a new boost::thread that runs entry() and store it in thread_
        thread_.reset(new boost::thread(&InternalThread::entry, this, device, mode,
              rand_seed, solver_count, root_solver));
      } catch (std::exception& e) {
        LOG(FATAL) << "Thread exception: " << e.what();
      }
    }

    // The function executed by the new thread
    void InternalThread::entry(int device, Caffe::Brew mode, int rand_seed,
        int solver_count, bool root_solver) {
    #ifndef CPU_ONLY
      CUDA_CHECK(cudaSetDevice(device));
    #endif
      Caffe::set_mode(mode);
      Caffe::set_random_seed(rand_seed);
      Caffe::set_solver_count(solver_count);
      Caffe::set_root_solver(root_solver);

      InternalThreadEntry();
    }

    // Stops the thread
    void InternalThread::StopInternalThread() {
      if (is_started()) {      // if the thread has been started
        thread_->interrupt();  // request interruption
        try {
          thread_->join();     // wait for the thread to finish
        } catch (boost::thread_interrupted&) {
          // Interrupted: nothing to do, since we requested the interruption ourselves
        } catch (std::exception& e) {
          // Any other error is logged
          LOG(FATAL) << "Thread exception: " << e.what();
        }
      }
    }

    }  // namespace caffe
      
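To summarize: the class queries the thread's state, starts and stops the thread, and defines the thread entry function InternalThread::entry. The entry function is the interesting part: before it calls the virtual function InternalThreadEntry, it sets up the new thread's state on the user's behalf (CUDA device, CPU or GPU mode, random seed, solver count, and the root-solver flag).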
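The same shape (a base class that owns the thread, a virtual entry point, and a stop request that the loop polls) can be sketched with the standard library alone. This is only an analogue under stated assumptions: std::thread has no interruption mechanism, so an atomic flag stands in for boost's interrupt(), and the copying of Caffe's per-thread state is omitted. SimpleInternalThread, Counter, and worker_runs_and_stops are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical, simplified analogue of caffe::InternalThread.
class SimpleInternalThread {
 public:
  SimpleInternalThread() : stop_(false) {}
  virtual ~SimpleInternalThread() { StopInternalThread(); }

  void StartInternalThread() {
    assert(!is_started());  // threads are not meant to be restarted
    stop_ = false;
    thread_ = std::thread(&SimpleInternalThread::entry, this);
  }

  // Does not return until the internal thread has exited.
  void StopInternalThread() {
    if (is_started()) {
      stop_ = true;    // stands in for thread_->interrupt()
      thread_.join();
    }
  }

  bool is_started() const { return thread_.joinable(); }

 protected:
  virtual void InternalThreadEntry() {}
  bool must_stop() const { return stop_.load(); }

 private:
  // Caffe's entry() first restores device, mode, seed, and solver state here.
  void entry() { InternalThreadEntry(); }

  std::atomic<bool> stop_;
  std::thread thread_;
};

// A subclass in the style of DataReader::Body: it starts the thread in its
// constructor and stops it in its destructor.
class Counter : public SimpleInternalThread {
 public:
  Counter() : ticks_(0) { StartInternalThread(); }
  ~Counter() override { StopInternalThread(); }
  long ticks() const { return ticks_.load(); }

 protected:
  void InternalThreadEntry() override {
    while (!must_stop()) {  // poll the stop request in the main loop
      ++ticks_;
      std::this_thread::yield();
    }
  }

 private:
  std::atomic<long> ticks_;
};

// Starts a worker, waits until it has made progress, then stops it.
bool worker_runs_and_stops() {
  Counter c;
  while (c.ticks() == 0) std::this_thread::yield();
  c.StopInternalThread();
  long after_stop = c.ticks();
  return after_stop > 0 && !c.is_started() && c.ticks() == after_stop;
}
```

The subclass stops the thread in its own destructor, as Body does, so the worker is no longer running the overridden function by the time the derived object is torn down.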
 
With that digression over, back to the Body class whose declaration was shown above. As expected, it overrides InternalThread's virtual function InternalThreadEntry.

Here is its implementation:
 
    // Body constructor: stores the layer parameters and starts the internal thread
    DataReader::Body::Body(const LayerParameter& param)
        : param_(param),
          new_queue_pairs_() {
      // Sets up the thread state and creates a new thread that runs
      // the virtual function InternalThreadEntry
      StartInternalThread();
    }

    // Destructor: stops the thread
    DataReader::Body::~Body() {
      StopInternalThread();
    }

    // The function run by the thread: opens the database, creates a cursor,
    // and collects the QueuePair pointers
    void DataReader::Body::InternalThreadEntry() {
      // Get a DB object for the configured backend type
      shared_ptr<db::DB> db(db::GetDB(param_.data_param().backend()));
      // Open the DB at the location given in the layer parameters
      db->Open(param_.data_param().source(), db::READ);
      // Create a cursor
      shared_ptr<db::Cursor> cursor(db->NewCursor());
      // Container of QueuePair pointers; each QueuePair holds the two
      // blocking queues free_ and full_
      vector<shared_ptr<QueuePair> > qps;
      try {
        // solver_count depends on the phase given in the layer parameters
        int solver_count = param_.phase() == TRAIN ? Caffe::solver_count() : 1;

        // To ensure deterministic runs, only start running once all solvers
        // are ready. But solvers need to peek on one item during initialization,
        // so read one item, then wait for the next solver.
        for (int i = 0; i < solver_count; ++i) {
          shared_ptr<QueuePair> qp(new_queue_pairs_.pop());
          read_one(cursor.get(), qp.get());  // read one item
          qps.push_back(qp);                 // remember this queue pair
        }
        // Main loop
        while (!must_stop()) {
          for (int i = 0; i < solver_count; ++i) {
            read_one(cursor.get(), qps[i].get());
          }
          // Check no additional readers have been created. This can happen if
          // more than one net is trained at a time per process, whether single
          // or multi solver. It might also happen if two data layers have same
          // name and same source.
          CHECK_EQ(new_queue_pairs_.size(), 0);
        }
      } catch (boost::thread_interrupted&) {
        // Interrupted exception is expected on shutdown
      }
    }

    // Reads one item from the database
    void DataReader::Body::read_one(db::Cursor* cursor, QueuePair* qp) {
      // Pop an empty Datum from the QueuePair's free_ queue
      Datum* datum = qp->free_.pop();
      // TODO deserialize in-place instead of copy?
      // Parse the value at the cursor into it
      datum->ParseFromString(cursor->value());
      // Push it onto the QueuePair's full_ queue
      qp->full_.push(datum);

      // go to the next iter
      cursor->Next();
      if (!cursor->valid()) {
        DLOG(INFO) << "Restarting data prefetching from start.";
        cursor->SeekToFirst();  // past the end: wrap around to the first record
      }
    }
      
That leaves the rest of the DataReader class. I will take a shortcut here and simply paste the annotated code of data_reader.cpp.
  
    #include <boost/thread.hpp>
    #include <map>
    #include <string>
    #include <vector>

    #include "caffe/common.hpp"
    #include "caffe/data_reader.hpp"
    #include "caffe/layers/data_layer.hpp"
    #include "caffe/proto/caffe.pb.h"

    namespace caffe {

    // weak_ptr solves the memory-release problem of shared_ptr reference cycles
    using boost::weak_ptr;

    map<const string, weak_ptr<DataReader::Body> > DataReader::bodies_;
    static boost::mutex bodies_mutex_;

    // Constructor: takes the layer parameters and initializes queue_pair_
    // (which holds the two blocking queues free_ and full_)
    DataReader::DataReader(const LayerParameter& param)
        : queue_pair_(new QueuePair(
            param.data_param().prefetch() * param.data_param().batch_size())) {
      // Get or create a body
      boost::mutex::scoped_lock lock(bodies_mutex_);
      string key = source_key(param);       // derive the key from the layer parameters
      weak_ptr<Body>& weak = bodies_[key];  // bodies_ maps a string key to a Body
      body_ = weak.lock();
      if (!body_) {                            // no live Body exists for this key
        body_.reset(new Body(param));          // create a new Body
        bodies_[key] = weak_ptr<Body>(body_);  // and register it in bodies_
      }
      // Hand queue_pair_ to the body through its new_queue_pairs_ queue
      body_->new_queue_pairs_.push(queue_pair_);
    }

    // Destructor
    DataReader::~DataReader() {
      string key = source_key(body_->param_);
      body_.reset();  // release our reference to the Body first
      boost::mutex::scoped_lock lock(bodies_mutex_);  // acquire the lock
      if (bodies_[key].expired()) {
        bodies_.erase(key);  // no reader uses this Body any more: drop the map entry
      }
    }

    // The definitions of DataReader::QueuePair::QueuePair, ~QueuePair,
    // DataReader::Body::Body, ~Body, InternalThreadEntry, and read_one follow
    // here in the file; they are identical to the listings shown earlier.

    }  // namespace caffe
      
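Summary: the data reader uses Caffe's DB wrapper layer to read the data; beyond that, Caffe provides a thin wrapper around the Boost thread library and its own blocking queue.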
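The get-or-create logic in the DataReader constructor, together with the cleanup in its destructor, is a reusable pattern: a map of weak_ptr lets several readers share one body without the map itself keeping the body alive. A minimal sketch of the pattern, using std:: types and hypothetical names (Worker stands in for Body; Registry, GetOrCreate, and Release are not Caffe APIs):

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for DataReader::Body.
struct Worker {
  explicit Worker(std::string s) : source(std::move(s)) {}
  std::string source;
};

class Registry {
 public:
  // Returns the live Worker for `key`, creating one if none exists.
  std::shared_ptr<Worker> GetOrCreate(const std::string& key) {
    std::lock_guard<std::mutex> lock(mutex_);
    std::weak_ptr<Worker>& weak = workers_[key];
    std::shared_ptr<Worker> worker = weak.lock();  // empty if expired or new
    if (!worker) {
      worker = std::make_shared<Worker>(key);
      weak = worker;  // the map observes the Worker but does not own it
    }
    return worker;
  }

  // Drops the map entry for `key` if its Worker has already been destroyed.
  void Release(const std::string& key) {
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = workers_.find(key);
    if (it != workers_.end() && it->second.expired()) workers_.erase(it);
  }

  std::size_t size() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return workers_.size();
  }

 private:
  mutable std::mutex mutex_;
  std::map<std::string, std::weak_ptr<Worker>> workers_;
};

// Two requests for the same key share one Worker; once both owners let go,
// the Worker is destroyed and its map entry can be removed.
bool registry_shares_and_cleans_up() {
  Registry registry;
  std::shared_ptr<Worker> a = registry.GetOrCreate("layer:db");
  std::shared_ptr<Worker> b = registry.GetOrCreate("layer:db");
  bool shared = (a == b);
  a.reset();
  b.reset();
  registry.Release("layer:db");
  return shared && registry.size() == 0;
}
```

If the map held shared_ptr values instead, a Worker would stay alive for as long as its entry existed, even after every reader had gone away.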
 
Finally, what exactly is a Datum?

Its definition can be found in caffe.proto:

    message Datum {
      optional int32 channels = 1;
      optional int32 height = 2;
      optional int32 width = 3;
      // the actual image data, in bytes
      optional bytes data = 4;
      optional int32 label = 5;
      // Optionally, the datum could also hold float data.
      repeated float float_data = 6;
      // If true data contains an encoded image that need to be decoded
      optional bool encoded = 7 [default = false];
    }
  
References:

[1] Boost background you will probably need

On unique_lock:

http://zh.cppreference.com/w/cpp/thread/unique_lock

file:///C:/Program%20Files/boost_1_60_0/doc/html/thread/synchronization.html#thread.synchronization.mutex_types.mutex

On synchronization (Handling mutexes in C++):

http://web.archive.org/web/20140531071228/http://home.roadrunner.com/~hinnant/mutexes/locking.html

[2] If you have the Boost documentation installed, the material on threads can be found here:

file:///C:/Program%20Files/boost_1_60_0/doc/html/thread/thread_management.html#thread.thread_management.this_thread.interruption_requested

http://blog.chinaunix.net/uid-23093301-id-86385.html

[3] On weak pointers

http://blog.csdn.net/mmzsyx/article/details/8090849

http://baike.baidu.com/link?url=-mb6Lc2iMwP0kzcAyszaJ1gugtcnlSLHeq2UT5SGdVXVgsg_ppDcin4PLTVrfAlsrm4t5focfsS9d9-Z-ZOWBq

http://www.cnblogs.com/TianFang/archive/2008/09/20/1294590.html