Pitfall 1: I ran into the following error, terminate called after throwing an instance of 'c10::Error':
terminate called after throwing an instance of 'c10::Error'
what(): CUDA error: device-side assert triggered (record at /pytorch/aten/src/ATen/cuda/CUDAEvent.h:116)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x33 (0x7fcf6c0ff193 in /home/stu/anaconda3/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x4595c64 (0x7fcf70ae7c64 in /home/stu/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #2: <unknown function> + 0x145922c (0x7fcf6d9ab22c in /home/stu/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x458bf0b (0x7fcf70addf0b in /home/stu/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #4: <unknown function> + 0x1c1e321 (0x7fcf6e170321 in /home/stu/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #5: at::native::lstm(at::Tensor const&, c10::ArrayRef<at::Tensor>, c10::ArrayRef<at::Tensor>, bool, long, double, bool, bool, bool) + 0x254 (0x7fcf6e1546d4 in /home/stu/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #6: <unknown function> + 0x2043fd9 (0x7fcf6e595fd9 in /home/stu/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #7: <unknown function> + 0x3dec0d5 (0x7fcf7033e0d5 in /home/stu/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #8: <unknown function> + 0x2048840 (0x7fcf6e59a840 in /home/stu/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #9: <unknown function> + 0x35f5f2 (0x7fcfb71f05f2 in /home/stu/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame #44: __libc_start_main + 0xe7 (0x7fcfca2fdb97 in /lib/x86_64-linux-gnu/libc.so.6)
Aborted (core dumped)
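The posts I found online about this error offered only one fix, with no explanation of why it works, so I'll just describe how I finally solved it.
It turned out that the size in my nn.Embedding(size, char_len) was too small. Some input indices were greater than or equal to size, so the embedding lookup went out of range and triggered the error.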
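A quick way to confirm this kind of problem is to check the index range before the lookup, on the CPU, where the failure is much easier to read than a device-side assert. The sketch below is only an illustration: the helper name check_embedding_indices and the sizes are hypothetical, not taken from my actual code.

```python
import torch
import torch.nn as nn


def check_embedding_indices(indices: torch.Tensor, embedding: nn.Embedding) -> None:
    """Raise a readable error if any index falls outside the embedding table."""
    lo = int(indices.min())
    hi = int(indices.max())
    if lo < 0 or hi >= embedding.num_embeddings:
        raise ValueError(
            f"index range [{lo}, {hi}] does not fit nn.Embedding with "
            f"num_embeddings={embedding.num_embeddings}"
        )


# Hypothetical sizes, for illustration only.
size, char_len = 10, 4
embedding = nn.Embedding(size, char_len)

# Valid indices are 0 .. size - 1, so this batch passes the check.
ok_batch = torch.tensor([[0, 3, 9]])
check_embedding_indices(ok_batch, embedding)
out = embedding(ok_batch)

# Index 10 is out of range for a table with 10 rows.
bad_batch = torch.tensor([[0, 3, 10]])
try:
    check_embedding_indices(bad_batch, embedding)
    caught = None
except ValueError as exc:
    caught = str(exc)

print(out.shape)
print(caught)
```

CUDA kernels run asynchronously, so the assert can surface in a later operation than the one that caused it, which may be why the trace above points at lstm rather than at the embedding. Running the failing batch on the CPU, or setting the environment variable CUDA_LAUNCH_BLOCKING=1, usually makes the real location visible.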
Pitfall 2: training has no effect
After training, the results did not change at all, even though parameter.requires_grad was True. In my case, what fixed it was changing
loss_func = nn.NLLLoss().to(device)
to
loss_func = nn.NLLLoss()
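When training seems to have no effect, one way to narrow it down is to check that every parameter receives a gradient and actually changes after a single optimizer step. The following is a minimal sketch under stated assumptions: the toy model, random data, and learning rate are hypothetical and stand in for the real network.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy classifier; NLLLoss expects log-probabilities as input.
model = nn.Sequential(nn.Linear(8, 3), nn.LogSoftmax(dim=1))
loss_func = nn.NLLLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Random stand-in data: 16 samples, 8 features, 3 classes.
inputs = torch.randn(16, 8)
targets = torch.randint(0, 3, (16,))

# Snapshot the parameters before the update.
before = {name: p.detach().clone() for name, p in model.named_parameters()}

optimizer.zero_grad()
loss = loss_func(model(inputs), targets)
loss.backward()

# Parameters that never received a gradient are cut off from the loss.
missing_grad = [name for name, p in model.named_parameters() if p.grad is None]

optimizer.step()

# Parameters identical to their snapshot were not updated by the optimizer.
unchanged = [
    name
    for name, p in model.named_parameters()
    if torch.equal(before[name], p.detach())
]

print("missing grad:", missing_grad)
print("unchanged:", unchanged)
```

If either list is non-empty on the real model, the usual suspects are parameters that were not passed to the optimizer, a detached tensor somewhere in the forward pass, or a learning rate that is effectively zero.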