1. multiprocessing.pool.RemoteTraceback

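If you run into the error below, the data is most likely at fault. Note that the worker's traceback ends in ValueError: low >= high, raised by np.random.randint inside the dataset's transform.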

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/unaguo/anaconda3/envs/pt1.3-py3.6/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/home/unaguo/anaconda3/envs/pt1.3-py3.6/lib/python3.6/site-packages/mxnet/gluon/data/dataloader.py", line 429, in _worker_fn
    batch = batchify_fn([_worker_dataset[i] for i in samples])
  File "/home/unaguo/anaconda3/envs/pt1.3-py3.6/lib/python3.6/site-packages/mxnet/gluon/data/dataloader.py", line 429, in <listcomp>
    batch = batchify_fn([_worker_dataset[i] for i in samples])
  File "/data2/enducation/paper_recog_total/train-paper-recog/line_detect/data/paper_dataset.py", line 375, in __getitem__
    data_dict = self._transforms(data_dict)
  File "/data2/enducation/paper_recog_total/train-paper-recog/line_detect/data/transforms_paper.py", line 13, in __call__
    args = trans(args)
  File "/data2/enducation/paper_recog_total/train-paper-recog/line_detect/data/transforms_paper.py", line 468, in __call__
    dst_points = np.array([[rdw(), rdh()], [w-1-rdw(), rdh()], [w-1-rdw(), h-1-rdh()], [rdw(), h-1-rdh()]])
  File "/data2/enducation/paper_recog_total/train-paper-recog/line_detect/data/transforms_paper.py", line 466, in <lambda>
    rdh = lambda: np.random.randint(0, self.max_affine_xy_ratio * h)
  File "mtrand.pyx", line 746, in numpy.random.mtrand.RandomState.randint
  File "_bounded_integers.pyx", line 1254, in numpy.random._bounded_integers._rand_int64
ValueError: low >= high
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/data2/enducation/paper_recog_total/train-paper-recog/line_detect/scripts/train_gluon_testpaper.py", line 233, in <module>
    for batch_cnt, data_batch in enumerate(tqdm.tqdm(train_loader)):
  File "/home/unaguo/anaconda3/envs/pt1.3-py3.6/lib/python3.6/site-packages/tqdm/std.py", line 1178, in __iter__
    for obj in iterable:
  File "/home/unaguo/anaconda3/envs/pt1.3-py3.6/lib/python3.6/site-packages/mxnet/gluon/data/dataloader.py", line 484, in __next__
    batch = pickle.loads(ret.get(self._timeout))
  File "/home/unaguo/anaconda3/envs/pt1.3-py3.6/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
ValueError: low >= high
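
The last frames of the worker traceback show np.random.randint(0, self.max_affine_xy_ratio * h) failing. NumPy raises this error when the upper bound is not greater than the lower bound; one plausible cause here is a sample whose height h is so small that the product truncates to 0. The following is a minimal sketch of a guard under that assumption. The name safe_rand_offset is hypothetical and is not part of the original transforms_paper.py.

```python
import numpy as np


def safe_rand_offset(ratio, size):
    """Return a random int in [0, int(ratio * size)), or 0 if that range is empty."""
    upper = int(ratio * size)
    if upper < 1:
        # np.random.randint(0, 0) would raise "ValueError: low >= high"
        return 0
    return int(np.random.randint(0, upper))


# Reproduce the failure from the traceback with an empty range.
try:
    np.random.randint(0, 0)
except ValueError as err:
    print("reproduced:", err)

print(safe_rand_offset(0.1, 5))    # empty range, so the guard returns 0
print(safe_rand_offset(0.1, 100))  # some value in [0, 10)
```

Returning 0 for an empty range simply skips the random offset for that sample; whether that is acceptable, or whether such samples should be filtered out instead, depends on the dataset.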

Solution:
Set thread_pool=True in mx.gluon.data.DataLoader. What does this option do? The MXNet documentation describes it as follows:
If True, use threading pool instead of multiprocessing pool. Using threadpool can avoid shared memory usage. If DataLoader is more IO bounded or GIL is not a killing problem, threadpool version may achieve better performance than multiprocessing.
In other words, the workers run as threads inside the main process rather than as separate processes, so batches do not have to pass through shared memory.

import mxnet as mx

# thread_pool=True: load batches with a thread pool instead of worker processes
train_loader = mx.gluon.data.DataLoader(train_dataset, batch_size=config.TRAIN.batch_size,
                                        shuffle=True, num_workers=2, thread_pool=True,
                                        last_batch="discard", batchify_fn=batch_fn)
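
Since the error points at the data, it can also help to find out which samples fail. The sketch below indexes every sample in the current process, without any worker pool, and records the ones that raise. It assumes only that the dataset supports len() and integer indexing; find_bad_samples and ToyDataset are hypothetical names used for illustration, and ToyDataset stands in for the real dataset.

```python
def find_bad_samples(dataset, max_report=10):
    """Index every sample in this process and collect (index, message) for failures."""
    bad = []
    for i in range(len(dataset)):
        try:
            dataset[i]
        except Exception as err:
            bad.append((i, str(err)))
            if len(bad) >= max_report:
                break
    return bad


class ToyDataset:
    """Hypothetical stand-in: samples with height < 1 fail like the real transform."""

    def __init__(self, heights):
        self.heights = heights

    def __len__(self):
        return len(self.heights)

    def __getitem__(self, idx):
        h = self.heights[idx]
        if h < 1:
            raise ValueError("low >= high")
        return h


print(find_bad_samples(ToyDataset([32, 0, 48, 0])))
# prints [(1, 'low >= high'), (3, 'low >= high')]
```

If the transforms are random, a single pass may not trigger every failure, so an empty result does not prove the data is clean.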