Error:
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
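This happened because I used torch.nn.CrossEntropyLoss incorrectly: the number of classes implied by pred (the size of dimension 1) did not match the class indices in labels.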
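import torch
# pred is (N, C, H, W) with C = 1, so the only valid class index is 0
pred = torch.zeros((128, 1, 128, 128), device="cuda", requires_grad=True)
# labels are (N, H, W) class indices; the value 1 is out of range for C = 1
labels = torch.ones((128, 128, 128), dtype=torch.long, device="cuda")
loss_fn = torch.nn.CrossEntropyLoss()
loss = loss_fn(pred, labels)
loss.backward()  # <-- the error is reported here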
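One way to catch this kind of mismatch before it reaches the GPU is to compare the label range with the class dimension of pred. The following is only a minimal sketch: `check_class_indices` is a hypothetical helper (not part of PyTorch), and the small shapes are chosen so that it runs on the CPU.

```python
import torch


def check_class_indices(pred: torch.Tensor, labels: torch.Tensor,
                        ignore_index: int = -100) -> None:
    """Hypothetical helper: raise a readable error if labels do not fit pred's class dimension."""
    num_classes = pred.shape[1]  # CrossEntropyLoss treats dim 1 as the class dimension
    valid = labels[labels != ignore_index]  # entries equal to ignore_index are skipped by the loss
    if valid.numel() == 0:
        return
    lo, hi = int(valid.min()), int(valid.max())
    if lo < 0 or hi >= num_classes:
        raise ValueError(
            f"labels span [{lo}, {hi}] but pred has {num_classes} class channel(s); "
            f"valid indices are 0..{num_classes - 1}"
        )


pred = torch.zeros((4, 1, 8, 8))                   # C = 1
labels = torch.ones((4, 8, 8), dtype=torch.long)   # class index 1 does not exist
try:
    check_class_indices(pred, labels)
except ValueError as err:
    print(err)
```

Calling such a check right before the loss turns the opaque device-side assert into an ordinary Python exception. The default `ignore_index=-100` mirrors the default of torch.nn.CrossEntropyLoss.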
Solution:
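# SmoothL1Loss compares values element-wise, so there is no class dimension to match
pred = torch.zeros((128, 1, 128, 128), requires_grad=True)
labels = torch.zeros((128, 128, 128))
loss_fn = torch.nn.SmoothL1Loss()
loss = loss_fn(pred.view(128, 128, 128), labels)  # drop the singleton channel dimension
loss.backward()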
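This only illustrates where the problem lies; I am not suggesting that you actually replace CE with an L1 loss. What to change depends on your situation.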
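If cross-entropy is the loss you actually want, the other option is to keep it and make dimension 1 of pred equal to the number of classes, with integer labels in the range 0 to C-1. A minimal sketch, where `num_classes = 3` and the small shapes are assumed values for illustration only:

```python
import torch

num_classes = 3  # assumed value for illustration
# pred is (N, C, H, W): one channel of logits per class
pred = torch.zeros((4, num_classes, 8, 8), requires_grad=True)
# labels are (N, H, W): int64 class indices in [0, num_classes - 1]
labels = torch.randint(0, num_classes, (4, 8, 8))

loss_fn = torch.nn.CrossEntropyLoss()
loss = loss_fn(pred, labels)
loss.backward()
print(loss.item(), pred.grad.shape)
```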
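Also, this RuntimeError is not very informative, and my case is not the only one that triggers it. If you run into it in a different situation, feel free to discuss it with me.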
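Since the message itself says little, two common ways to narrow down the cause are to set CUDA_LAUNCH_BLOCKING=1, as the error text suggests, and to repeat the failing call on the CPU, where the check runs synchronously. The sketch below is only an illustration: `run_on_cpu` is a hypothetical helper, and the exact exception type and wording may differ between PyTorch versions.

```python
import os

# Must be set before CUDA is initialized; makes kernel launches synchronous
# so the stack trace points at the call that actually failed.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch


def run_on_cpu(loss_fn, pred, labels):
    """Hypothetical helper: repeat the loss call on CPU copies of the tensors."""
    return loss_fn(pred.detach().cpu(), labels.detach().cpu())


pred = torch.zeros((4, 1))                 # (N, C) with C = 1
labels = torch.ones(4, dtype=torch.long)   # class index 1 is out of range
message = None
try:
    run_on_cpu(torch.nn.CrossEntropyLoss(), pred, labels)
except (IndexError, RuntimeError) as err:
    message = f"{type(err).__name__}: {err}"
    print(message)
```

The same variable can also be set on the command line when launching the script, which avoids any doubt about whether it was set early enough.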