Caffe SGD shuffle mechanism

As we know, SGD needs the training data to be shuffled once per epoch (one full pass over the dataset) and then read sequentially, batch by batch. How does Caffe implement this shuffle mechanism?

Caffe accepts input as LevelDB, LMDB, an image file list, or HDF5. The corresponding data layers are in src/caffe/layers:

data_layer.cpp, image_data_layer.cpp, window_data_layer.cpp, hdf5_data_layer.cpp

The setup code to look at is:

void DataLayer<Dtype>::DataLayerSetUp()

1. For LevelDB and LMDB, see data_layer.cpp. Caffe assumes the records were already randomly shuffled when the database was built, so it does not shuffle them again and simply reads the database sequentially. When the read cursor reaches the end, it seeks back to the first record.
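The sequential read with wrap-around described above can be sketched roughly as follows. This is not Caffe's code: SequentialCursor and ReadBatch are hypothetical names, and an in-memory vector stands in for the LevelDB/LMDB database.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a database cursor: records are visited in
// stored order, and the cursor wraps to the first record after the last.
class SequentialCursor {
 public:
  explicit SequentialCursor(std::vector<std::string> records)
      : records_(std::move(records)), pos_(0) {
    assert(!records_.empty());
  }

  // Returns the current record, then advances the cursor.
  const std::string& Next() {
    const std::string& value = records_[pos_];
    ++pos_;
    if (pos_ >= records_.size()) {
      pos_ = 0;  // reached the end, seek back to the first record
    }
    return value;
  }

 private:
  std::vector<std::string> records_;
  std::size_t pos_;
};

// Reads batch_size records in stored order; nothing is shuffled here, so
// the randomness of a batch depends entirely on how the data was stored.
std::vector<std::string> ReadBatch(SequentialCursor& cursor,
                                   std::size_t batch_size) {
  std::vector<std::string> batch;
  batch.reserve(batch_size);
  for (std::size_t i = 0; i < batch_size; ++i) {
    batch.push_back(cursor.Next());
  }
  return batch;
}
```

With three stored records and a batch size of 2, the second batch holds the last record followed by the first one again, which is why the order in the database has to be random already.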

2. For an image file list, see image_data_layer.cpp. Caffe reads the list line by line (each line contains a file path and a label). If the data layer's shuffle option is set, it shuffles the list once at setup and again every time an epoch ends. The code is:

// During layer setup: shuffle once before the first epoch.
if (this->layer_param_.image_data_param().shuffle()) {
  // randomly shuffle data
  LOG(INFO) << "Shuffling data";
  const unsigned int prefetch_rng_seed = caffe_rng_rand();
  prefetch_rng_.reset(new Caffe::RNG(prefetch_rng_seed));
  ShuffleImages();
}

// In the prefetch loop: reshuffle whenever the end of the list is reached.
if (lines_id_ >= lines_size) {
  // We have reached the end. Restart from the first.
  DLOG(INFO) << "Restarting data prefetching from start.";
  lines_id_ = 0;
  if (this->layer_param_.image_data_param().shuffle()) {
    ShuffleImages();
  }
}
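To see the per-epoch reshuffle in isolation, here is a minimal standalone sketch of the same idea. It is an approximation under assumptions, not Caffe's implementation: ShuffledListReader and Line are hypothetical names, and std::mt19937 with std::shuffle are swapped in for Caffe's own RNG and ShuffleImages().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <string>
#include <utility>
#include <vector>

// One entry of the file list: (file path, label).
typedef std::pair<std::string, int> Line;

// Hypothetical reader that shuffles once at construction and again each
// time the end of the list is reached.
class ShuffledListReader {
 public:
  ShuffledListReader(std::vector<Line> lines, bool shuffle, unsigned int seed)
      : lines_(std::move(lines)), shuffle_(shuffle), rng_(seed), lines_id_(0) {
    assert(!lines_.empty());
    if (shuffle_) Shuffle();  // initial shuffle before the first epoch
  }

  // Returns the next entry. The reference stays valid until the next call.
  const Line& Next() {
    if (lines_id_ >= lines_.size()) {
      lines_id_ = 0;            // reached the end, restart from the first
      if (shuffle_) Shuffle();  // new order for the new epoch
    }
    return lines_[lines_id_++];
  }

 private:
  void Shuffle() { std::shuffle(lines_.begin(), lines_.end(), rng_); }

  std::vector<Line> lines_;
  bool shuffle_;
  std::mt19937 rng_;
  std::size_t lines_id_;
};
```

Because the whole list is permuted at once, every sample still appears exactly once per epoch; only the order changes between epochs.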

3. For HDF5, we need to shuffle the data ourselves before writing it to the HDF5 files.
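One way to do that pre-shuffle is to apply the same random permutation to the samples and their labels before writing them out. The sketch below assumes the data fits in memory; Dataset and ShuffleDataset are hypothetical names, and the actual HDF5 write step is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical in-memory dataset: data[i] is the sample for labels[i].
struct Dataset {
  std::vector<std::vector<float> > data;
  std::vector<int> labels;
};

// Returns a copy of the dataset with rows in random order. Samples and
// labels are permuted with the same index order so they stay paired.
Dataset ShuffleDataset(const Dataset& in, unsigned int seed) {
  assert(in.data.size() == in.labels.size());

  // Build the identity permutation 0..n-1, then shuffle it.
  std::vector<std::size_t> order(in.data.size());
  std::iota(order.begin(), order.end(), std::size_t(0));
  std::mt19937 rng(seed);
  std::shuffle(order.begin(), order.end(), rng);

  Dataset out;
  out.data.reserve(order.size());
  out.labels.reserve(order.size());
  for (std::size_t i = 0; i < order.size(); ++i) {
    out.data.push_back(in.data[order[i]]);
    out.labels.push_back(in.labels[order[i]]);
  }
  return out;  // write this shuffled copy to HDF5
}
```

Shuffling an index vector rather than the two arrays separately is what keeps each sample aligned with its label.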

