PyTorch: "Unable to open object (bad object header version number)" when num_workers > 1

Problem:

KeyError: 'Unable to open object (bad object header version number)'

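Cause:

The DataLoader runs with num_workers > 1 and the Dataset reads an HDF5 file through h5py. It is tempting to conclude that h5py files cannot be read in parallel, but that is not the real cause: the problem is a file object that was opened in the main process and then copied into every worker process.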

Solution:

The fix is simple and has two parts:

  1. Do not open the HDF5 file inside __init__.

  2. Open the HDF5 file at the first data iteration, i.e. on the first call to __getitem__.

Example:

import h5py
import torch


class LXRTDataLoader(torch.utils.data.Dataset):
    def __init__(self):
        """Do not open the HDF5 file here!"""

    def open_hdf5(self):
        # Called lazily, so each worker process opens its own file object.
        self.img_hdf5 = h5py.File('img.hdf5', 'r')
        self.dataset = self.img_hdf5['dataset']  # optional: keep a handle to the dataset

    def __getitem__(self, item: int):
        # The attribute is missing until this process has opened the file.
        if not hasattr(self, 'img_hdf5'):
            self.open_hdf5()
        img0 = self.img_hdf5['dataset'][0]  # do the loading here
        img1 = self.dataset[1]
        return img0, img1
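
The class above hard-codes the file name and the indices. As a rough, self-contained sketch of the same lazy-open pattern, the following writes a small temporary HDF5 file and reads it back through a DataLoader. The names LazyH5Dataset, make_demo_file, collect_rows and demo.hdf5 are hypothetical and chosen only for illustration; the dataset key 'dataset' follows the example above.

```python
import os
import tempfile

import h5py
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset


def make_demo_file(path, n=8):
    """Write n rows of 4 floats under the key 'dataset'; row i is filled with i."""
    with h5py.File(path, "w") as f:
        data = np.arange(n, dtype=np.float32).reshape(n, 1).repeat(4, axis=1)
        f.create_dataset("dataset", data=data)


class LazyH5Dataset(Dataset):
    def __init__(self, path, length):
        # Store only plain values here; no h5py objects yet.
        self.path = path
        self.length = length
        self.h5 = None

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        if self.h5 is None:  # first access in this process
            self.h5 = h5py.File(self.path, "r")
        return torch.from_numpy(self.h5["dataset"][index])


def collect_rows(path, num_workers, n=8):
    """Iterate the whole dataset once and return all rows as one tensor."""
    loader = DataLoader(LazyH5Dataset(path, n), batch_size=2, num_workers=num_workers)
    return torch.cat([batch for batch in loader])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.hdf5")
        make_demo_file(path)
        rows = collect_rows(path, num_workers=2)
        print(rows.shape)
```

Because the Dataset holds no h5py object until the first __getitem__ call, nothing HDF5-related exists in the main process at the moment the workers are started.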
 

Explanation:

The multi-processing actually happens when you create the data iterator (e.g., when calling "for datum in dataloader:").

In short, it creates multiple processes which "copy" the state of the current process. Thus, if we open the HDF5 file at the first data iteration, each subprocess gets its own dedicated file object.
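
One way to see which process actually serves the items is to return the process ID from __getitem__. This is only a sketch: PidDataset and serving_pids are hypothetical names, and the set of worker PIDs will differ from run to run.

```python
import os

from torch.utils.data import DataLoader, Dataset


class PidDataset(Dataset):
    """Hypothetical dataset whose items are the PID of the process that served them."""

    def __init__(self, length=8):
        self.length = length
        self.init_pid = os.getpid()  # recorded in the main process

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        return os.getpid()


def serving_pids(num_workers):
    """Return the set of PIDs that answered __getitem__ during one pass."""
    loader = DataLoader(PidDataset(), batch_size=1, num_workers=num_workers)
    return {int(batch[0]) for batch in loader}


if __name__ == "__main__":
    print("main process:", os.getpid())
    print("num_workers=0:", serving_pids(0))
    print("num_workers=2:", serving_pids(2))
```

With num_workers=0 every item comes from the main process. With workers enabled, the PIDs belong to the worker processes, which is why anything opened inside __getitem__ is private to each worker.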


If you somehow open an HDF5 file in __init__ and set num_workers > 0, it might cause two issues:

  1. The writing behavior is non-deterministic. (We do not need to write to the HDF5 file, so this issue can be ignored.)

  2. The state of the HDF5 file object is copied, so it might not faithfully reflect the current state of the file.

With the approach above, we bypass both issues.
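
The second issue can be illustrated without HDF5 at all. The sketch below uses only the standard library on a Unix-like system (it relies on os.fork) and shows a general property of forked processes: a file descriptor opened before the fork shares its read offset with the child, while a descriptor opened inside the child is independent. It says nothing about how the HDF5 library manages its own internal state; offset_seen_by_parent is a hypothetical helper name.

```python
import os
import tempfile


def offset_seen_by_parent(open_in_child):
    """Return the parent's file offset after a forked child reads 5 bytes."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"0123456789")
    path = tmp.name
    try:
        fd = os.open(path, os.O_RDONLY)  # opened before the fork
        pid = os.fork()
        if pid == 0:  # child process
            child_fd = os.open(path, os.O_RDONLY) if open_in_child else fd
            os.read(child_fd, 5)
            os._exit(0)  # leave immediately, skipping the parent's cleanup
        os.waitpid(pid, 0)
        offset = os.lseek(fd, 0, os.SEEK_CUR)  # where the parent's descriptor points now
        os.close(fd)
        return offset
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(offset_seen_by_parent(open_in_child=False))  # prints 5: the child moved the shared offset
    print(offset_seen_by_parent(open_in_child=True))   # prints 0: the child used its own descriptor
```

The parent never reads from the file, yet in the first case its descriptor no longer points at the start. Opening the file inside each worker avoids this kind of hidden coupling.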

 

Reference:

https://github.com/pytorch/pytorch/issues/11929
