Error message
Traceback (most recent call last):
  File "G:/EddyPredict/train.py", line 108, in <module>
    for batch in training_loader:
  File "D:\Miniconda3\envs\dl\lib\site-packages\torch\utils\data\dataloader.py", line 359, in __iter__
    return self._get_iterator()
  File "D:\Miniconda3\envs\dl\lib\site-packages\torch\utils\data\dataloader.py", line 305, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "D:\Miniconda3\envs\dl\lib\site-packages\torch\utils\data\dataloader.py", line 918, in __init__
    w.start()
  File "D:\Miniconda3\envs\dl\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "D:\Miniconda3\envs\dl\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "D:\Miniconda3\envs\dl\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "D:\Miniconda3\envs\dl\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
    reduction.dump(process_obj, to_child)
  File "D:\Miniconda3\envs\dl\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
  File "src\netCDF4\_netCDF4.pyx", line 5475, in netCDF4._netCDF4.Variable.__reduce__
NotImplementedError: Variable is not picklable
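Cause
When you use a custom Dataset with a DataLoader that has num_workers > 0,
the Dataset must be picklable at the moment the DataLoader starts its worker processes. On Windows the workers are started with the spawn method (see popen_spawn_win32.py in the traceback), so the Dataset object is pickled and sent to each worker.
I opened a netCDF4 file inside the Dataset. The reference to that file (a netCDF4 Variable) cannot be pickled, hence the error.
The same applies to lmdb.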
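The failure can be reproduced without netCDF4 or PyTorch. The following is a minimal sketch using only the standard library, assuming an ordinary open file handle behaves like the netCDF4 Variable for this purpose (both refuse to be pickled). EagerDataset and is_picklable are hypothetical names used for illustration only.

```python
import os
import pickle
import tempfile


class EagerDataset:
    """Hypothetical dataset that opens its file in __init__ (the failing pattern)."""

    def __init__(self, path):
        self.path = path
        self.handle = open(path, "rb")  # this attribute cannot be pickled


def is_picklable(obj):
    """Return True if pickle can serialize obj, which is what spawn requires."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, NotImplementedError, pickle.PicklingError):
        return False


# Create a small scratch file to stand in for the real data file.
fd, scratch_path = tempfile.mkstemp()
os.write(fd, b"0123456789")
os.close(fd)

eager = EagerDataset(scratch_path)
eager_ok = is_picklable(eager)        # the open handle blocks pickling
path_ok = is_picklable(scratch_path)  # a plain path string pickles fine

# Clean up the scratch file.
eager.handle.close()
os.remove(scratch_path)

print("dataset with open handle picklable:", eager_ok)
print("path string picklable:", path_ok)
```

Running a check like is_picklable on your own Dataset in the main process is a quick way to find out whether it will survive being sent to the workers.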
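Solution
- Do not open the file in __init__.
- Open the file on the first data access instead (the first __getitem__ call), so that each worker process opens its own handle.
For example: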
import lmdb
import torch

class LmdbDataset(torch.utils.data.Dataset):
    def __init__(self, lmdb_dir: str):
        # Do not open lmdb here! Store only the path, which is picklable.
        self.lmdb_dir = lmdb_dir

    def open_lmdb(self):
        # Called lazily, so each worker process opens its own env/txn.
        self.env = lmdb.open(self.lmdb_dir, readonly=True, create=False)
        self.txn = self.env.begin(buffers=True)

    def __getitem__(self, item: int):
        if not hasattr(self, 'txn'):
            self.open_lmdb()
        # Then do anything you want with env/txn here.
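One caveat with lazy opening: if the Dataset has already been indexed in the main process (for example, to inspect a sample), the handle exists by the time the workers are started and pickling fails again. A minimal sketch of one way around this is to drop the handle in __getstate__. It uses only the standard library and a plain binary file of fixed-size records as a stand-in for netCDF4 or lmdb. LazyRecordDataset, record_size, and the scratch file are hypothetical. I would expect the same pattern to carry over to a class that subclasses torch.utils.data.Dataset, but that is an assumption and is not tested here.

```python
import os
import pickle
import tempfile


class LazyRecordDataset:
    """Hypothetical dataset of fixed-size records that opens its file lazily."""

    def __init__(self, path, record_size):
        self.path = path
        self.record_size = record_size
        self._handle = None  # opened on first access, once per process

    def __len__(self):
        return os.path.getsize(self.path) // self.record_size

    def __getitem__(self, index):
        if self._handle is None:
            self._handle = open(self.path, "rb")
        self._handle.seek(index * self.record_size)
        return self._handle.read(self.record_size)

    def __getstate__(self):
        # Drop the handle so the object pickles even after it has been used.
        state = self.__dict__.copy()
        state["_handle"] = None
        return state

    def close(self):
        if self._handle is not None:
            self._handle.close()
            self._handle = None


# Scratch file with three 4-byte records.
fd, scratch_path = tempfile.mkstemp()
os.write(fd, b"aaaabbbbcccc")
os.close(fd)

dataset = LazyRecordDataset(scratch_path, record_size=4)
length = len(dataset)
first_read = dataset[1]  # opens the file in this process

# Pickling still works although the original now holds an open handle.
clone = pickle.loads(pickle.dumps(dataset))
clone_read = clone[2]    # the copy reopens the file on its own

dataset.close()
clone.close()
os.remove(scratch_path)
```

The copy produced by pickling starts with no handle and reopens the file on its first access, which is the same thing that happens inside each worker process.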