PyTorch debug notes: MemoryError, BrokenPipeError, device-side assert triggered


Operating system: Windows 10

1. MemoryError and BrokenPipeError

error_message:

100%|██████████| 196718/196718 [00:23<00:00, 8383.44it/s]
Load Data Done
Initial model...
Initial model Done
Start Train...
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
MemoryError
Traceback (most recent call last):
  File "train.py", line 179, in <module>
    train(model, train_iter, optimizer, criterion, device)
  File "train.py", line 25, in train
    for i, batch in enumerate(iterator):
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\site-packages\torch\utils\data\dataloader.py", line 291, in __iter__
    return _MultiProcessingDataLoaderIter(self)
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\site-packages\torch\utils\data\dataloader.py", line 737, in __init__
    w.start()
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
    reduction.dump(process_obj, to_child)
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe

Solution

data.DataLoader(dataset=eval_dataset,
                batch_size=args.batch_size,
                shuffle=False,
                # num_workers=4,  # worker processes triggered the errors above
                num_workers=0,    # load batches in the main process instead
                collate_fn=pad,
                drop_last=True)

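Set num_workers=0 so that the DataLoader loads batches in the main process. On Windows, worker processes are started with spawn, which pickles the DataLoader (dataset included) and sends it to each worker through a pipe. In the traceback above, the child fails with MemoryError while unpickling, and the parent then gets BrokenPipeError when it writes to the pipe the child has closed.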
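As a minimal sketch of the same workaround, the snippet below picks the worker count by platform and iterates a toy dataset in the main process. The helper name `pick_num_workers` and the toy `TensorDataset` are assumptions made for illustration; they do not come from the original training script.

```python
import sys

import torch
from torch.utils import data


def pick_num_workers(requested, platform=sys.platform):
    """Return 0 on Windows (no worker processes), otherwise the requested count."""
    return 0 if platform.startswith("win") else requested


# Toy dataset standing in for eval_dataset (hypothetical data: the numbers 0..9).
toy_dataset = data.TensorDataset(torch.arange(10).unsqueeze(1))

# Pretend we are on Windows so the example behaves the same everywhere.
workers = pick_num_workers(4, platform="win32")

loader = data.DataLoader(dataset=toy_dataset,
                         batch_size=4,
                         shuffle=False,
                         num_workers=workers,  # 0: batches are built in the main process
                         drop_last=True)

# Each item yielded by the loader is a list with one tensor of shape (4, 1).
batches = [batch[0] for batch in loader]
print(workers, len(batches))
```

With num_workers=0 nothing is pickled or sent through a pipe, so this failure path is avoided, at the cost of data loading no longer running in parallel with training.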

References

PyTorch problem collection: BrokenPipeError: [Errno 32] Broken pipe
https://blog.csdn.net/weixin_43002433/article/details/104888766

BrokenPipeError: [Errno 32] Broken pipe
https://blog.csdn.net/qq_33666011/article/details/81873217

2. CUDA error: device-side assert triggered

Assertion srcIndex < srcSelectDimSize failed

error_message:
First run:

100%|██████████| 196718/196718 [00:18<00:00, 10806.56it/s]
Load Data Done
Initial model...
Initial model Done
Start Train...
2020-10-24T13:04:56.860248 step: 0, loss: 10809.17
2020-10-24T13:06:01.654685 step: 50, loss: 9954.39
2020-10-24T13:07:08.105712 step: 100, loss: 9925.17
2020-10-24T13:08:14.760986 step: 150, loss: 9730.56
2020-10-24T13:09:18.523595 step: 200, loss: 9885.58
C:/cb/pytorch_1000000000000/work/aten/src/THC/THCTensorIndex.cu:272: block: [195,0,0], thread: [0,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
(the same assertion repeats for threads [1,0,0] through [31,0,0] of block [195,0,0], threads [0,0,0] through [31,0,0] of block [182,0,0], threads [32,0,0] through [63,0,0] of block [375,0,0], and threads [64,0,0] through [95,0,0] of block [363,0,0])
Traceback (most recent call last):
  File "train.py", line 185, in <module>
    train(model, train_iter, optimizer, criterion, device)
  File "train.py", line 34, in train
    loss = model.neg_log_likelihood(x, y) # logits: (N, T, VOCAB), y: (N, T)
  File "D:\programing\Bert-BiLSTM-CRF-pytorch\Bert-BiLSTM-CRF-pytorch-bidder\crf.py", line 151, in neg_log_likelihood
    feats = self._get_lstm_features(sentence)  #[batch_size, max_len, 16]
  File "D:\programing\Bert-BiLSTM-CRF-pytorch\Bert-BiLSTM-CRF-pytorch-bidder\crf.py", line 160, in _get_lstm_features
    embeds = self._bert_enc(sentence)  # [8, 75, 768]
  File "D:\programing\Bert-BiLSTM-CRF-pytorch\Bert-BiLSTM-CRF-pytorch-bidder\crf.py", line 109, in _bert_enc
    encoded_layer, _  = self.bert(x)
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\site-packages\pytorch_pretrained_bert\modeling.py", line 733, in forward
    output_all_encoded_layers=output_all_encoded_layers)
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\site-packages\pytorch_pretrained_bert\modeling.py", line 406, in forward
    hidden_states = layer_module(hidden_states, attention_mask)
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\site-packages\pytorch_pretrained_bert\modeling.py", line 391, in forward
    attention_output = self.attention(hidden_states, attention_mask)
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\site-packages\pytorch_pretrained_bert\modeling.py", line 349, in forward
    self_output = self.self(input_tensor, attention_mask)
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\site-packages\pytorch_pretrained_bert\modeling.py", line 309, in forward
    attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))
RuntimeError: CUDA error: device-side assert triggered

Second run:

2020-10-24T18:56:04.041860 step: 3350, loss: 8741.92
2020-10-24T18:56:29.927092 step: 3400, loss: 8753.03
2020-10-24T18:56:55.600663 step: 3450, loss: 8723.60
2020-10-24T18:57:21.146033 step: 3500, loss: 8743.93
Traceback (most recent call last):
  File "train.py", line 187, in <module>
    train(model, train_iter, optimizer, criterion, device)
  File "train.py", line 35, in train
    loss = model.neg_log_likelihood(x, y) # logits: (N, T, VOCAB), y: (N, T)
  File "D:\programing\Bert-BiLSTM-CRF-pytorch\Bert-BiLSTM-CRF-pytorch-bidder\crf.py", line 152, in neg_log_likelihood
    feats = self._get_lstm_features(sentence)  #[batch_size, max_len, 16]
  File "D:\programing\Bert-BiLSTM-CRF-pytorch\Bert-BiLSTM-CRF-pytorch-bidder\crf.py", line 161, in _get_lstm_features
    embeds = self._bert_enc(sentence)  # [8, 75, 768]
  File "D:\programing\Bert-BiLSTM-CRF-pytorch\Bert-BiLSTM-CRF-pytorch-bidder\crf.py", line 110, in _bert_enc
    encoded_layer, _  = self.bert(x)
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\site-packages\pytorch_pretrained_bert\modeling.py", line 733, in forward
    output_all_encoded_layers=output_all_encoded_layers)
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\site-packages\pytorch_pretrained_bert\modeling.py", line 406, in forward
    hidden_states = layer_module(hidden_states, attention_mask)
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\site-packages\pytorch_pretrained_bert\modeling.py", line 391, in forward
    attention_output = self.attention(hidden_states, attention_mask)
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\site-packages\pytorch_pretrained_bert\modeling.py", line 349, in forward
    self_output = self.self(input_tensor, attention_mask)
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\main\Anaconda3\envs\Bert-BiLSTM-CRF-pytorch\lib\site-packages\pytorch_pretrained_bert\modeling.py", line 310, in forward
    attention_scores = attention_scores / math.sqrt(self.attention_head_size)
RuntimeError: CUDA error: device-side assert triggered

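Set the environment variable CUDA_LAUNCH_BLOCKING before the program makes any CUDA call, either in the shell that launches train.py or at the very start of the program:

CUDA_LAUNCH_BLOCKING="1"

This turns off asynchronous execution between host and device, so kernel launches run synchronously and the error output is more informative. Note that the two tracebacks above stop at different lines (the matmul at modeling.py line 309, then the division at line 310) although the failing assertion comes from an indexing kernel: with asynchronous launches, the Python line shown is only where the error was noticed, not where it happened.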
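A minimal sketch of setting the variable from inside Python. Assigning a plain Python variable named CUDA_LAUNCH_BLOCKING has no effect; the value has to go into the process environment, and it should be set before CUDA is initialized, which in practice means before the first CUDA call (placing it above `import torch` is the safe choice).

```python
import os

# Must run before any CUDA work; putting it ahead of `import torch` is the safest place.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# import torch  # import torch (and the rest of the training code) only after this line

print(os.environ["CUDA_LAUNCH_BLOCKING"])
```

Synchronous launches slow training down noticeably, so this setting is best kept for debugging runs only.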

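Analysis

The material I found says this assertion means an index is out of range. The assertion `srcIndex < srcSelectDimSize` is raised in THCTensorIndex.cu, the index-select kernel, which is what an embedding lookup runs on the GPU. The usual cause is therefore an index that is greater than or equal to the number of rows in the table being indexed, for example a token id that is not smaller than the vocabulary size.

Example 1: PyTorch error report
Example 2: [Solved] Assertion srcIndex < srcSelectDimSize failed.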
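One way to confirm the out-of-range hypothesis is to check each batch on the CPU before it reaches the model. The sketch below is an assumption-laden illustration: the helper name `find_out_of_range_ids`, the sample ids, and the vocabulary size 21128 are hypothetical and should be replaced with the values of the BERT model actually in use.

```python
import torch


def find_out_of_range_ids(token_ids, vocab_size):
    """Return the ids that an embedding table with vocab_size rows cannot look up."""
    ids = token_ids.detach().cpu()
    bad = (ids < 0) | (ids >= vocab_size)
    return ids[bad].tolist()


# Hypothetical batch: the second sentence contains an id beyond the vocabulary.
batch = torch.tensor([[101, 2769, 102],
                      [101, 30000, 102]])

print(find_out_of_range_ids(batch, vocab_size=21128))  # prints [30000]
```

The same kind of check applies to any other table indexed inside the model. With BERT, a sequence longer than the model's maximum number of position embeddings would hit the same assertion, so it is worth checking the padded sequence length as well. Running the failing batch on the CPU is another option, since the CPU raises an ordinary IndexError that names the offending operation.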

Attempts

Attempt 1:
Change batch_size and restart training.

References

CUDA error 59: Device-side assert triggered

Extra tip

The error messages you get when running into this error may not be very descriptive. To make sure you get the complete and useful stack trace, have this at the very beginning of your code and run it before anything else:

CUDA_LAUNCH_BLOCKING="1"

CUDA programming interface: concepts and API of asynchronous concurrent execution

CUDA ERROR: device-side assert triggered: troubleshooting approach

3. Expected all tensors to be on the same device, but found at least two devices

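All tensors passed to the model during training must live on the same device. This error appears when some of them are on the GPU and others are still on the CPU; moving the model and every input tensor to the GPU fixes it.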
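A minimal sketch of the fix, assuming a toy `nn.Linear` model and random inputs in place of the real network and batches: choose one device up front and move both the model and every input tensor to it.

```python
import torch
import torch.nn as nn

# Use the GPU when one is available; fall back to the CPU so the sketch runs anywhere.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2).to(device)          # toy model standing in for the real network
x = torch.randn(3, 4).to(device)            # inputs moved to the same device as the model
y = torch.randint(0, 2, (3,)).to(device)    # labels moved as well

logits = model(x)
loss = nn.functional.cross_entropy(logits, y)

print(next(model.parameters()).device, x.device, y.device)
```

Tensors created inside the model's forward pass (masks, initial hidden states, and so on) need the same treatment, for example by creating them with `device=x.device`.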
