Cross-entropy loss raises RuntimeError: CUDA error: an illegal memory access was encountered when training SwinTransformer on a custom dataset

Latest recommended article published on 2024-05-21 11:33:10

ヤ。。。＆♚


Tags: deep learning, machine learning, artificial intelligence, python, transformer, bug

Article link: https://blog.csdn.net/qq_52060635/article/details/134148448


While training SwinTransformer, I ran into the following error after switching to my own dataset:

RuntimeError: CUDA error: an illegal memory access was encountered

There are plenty of suggested fixes online, for example:

[debug] Error: RuntimeError: CUDA error: an illegal memory access was encountered (zy_destiny's blog, CSDN)

[Fully solved] CUDA error: an illegal memory access was encountered (CUDA illegal memory access error), Zhihu (zhihu.com)

How do you fix a strange PyTorch illegal memory access error? Zhihu (zhihu.com)

PyTorch error: CUDA error: an illegal memory access was encountered (CSDN blog)

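I tried all of them and none worked, so I went back to debugging the code. The problem turned out to be in the loss function. Stepping through line by line, I traced it to mmseg/models/losses, and more precisely to the call that ends with ignore_index=self.ignore_index). I then found a blog post that mentions the same issue: "[Featured] Notes on a RuntimeError encountered when computing the cross-entropy loss with mmseg, and how to solve it" (correct = correct[:, target != ignore_index], FALALILA's blog, CSDN).

That post attributes the error to the labels, so I went back and reprocessed my labels. Converting the images directly to grayscale solved the problem.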
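A quick way to confirm a label problem of this kind is to check that every pixel value in a label mask is either a valid class index or the ignore index. The following is only a minimal sketch: the helper name `find_invalid_labels`, `num_classes=2`, and `ignore_index=255` are illustrative assumptions, not values taken from my config.

```python
import numpy as np


def find_invalid_labels(mask, num_classes, ignore_index=255):
    """Return the sorted pixel values that are neither a class index nor ignore_index."""
    values = np.unique(np.asarray(mask))
    valid = (values >= 0) & (values < num_classes)
    return [int(v) for v in values[~valid & (values != ignore_index)]]


# A label stored as RGB has 3 dimensions (H x W x 3), whereas a
# class-index mask should be a 2-D H x W array.
rgb_label = np.zeros((4, 4, 3), dtype=np.uint8)
print(rgb_label.ndim)  # 3

# Hypothetical binary mask saved as 0/128: 128 is out of range for 2 classes.
bad_mask = np.array([[0, 128], [128, 255]], dtype=np.uint8)
print(find_invalid_labels(bad_mask, num_classes=2))  # [128]
```

An empty list together with a 2-D mask suggests the labels are not the cause. Anything else points to label values the cross-entropy loss cannot index.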

The specific operation is:

resized_image = image.resize((512, 512), 0)
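The line above looks like a Pillow call, where the second argument 0 is the nearest-neighbor filter. The sketch below combines the grayscale conversion with that resize. It assumes Pillow and NumPy, the helper name `prepare_label` is hypothetical, and it assumes the label's pixel values already equal the class indices (a gray image stored as RGB). Color-coded labels would need an explicit color-to-index mapping instead.

```python
import numpy as np
from PIL import Image


def prepare_label(image, size=(512, 512)):
    """Convert a label image to single-channel 8-bit and resize it with nearest-neighbor."""
    gray = image.convert("L")  # one value per pixel instead of three channels
    # Nearest-neighbor only copies existing pixel values; interpolating
    # filters can blend neighboring labels into values that are not class indices.
    return gray.resize(size, Image.NEAREST)


# Synthetic RGB label for demonstration: left half class 0, right half class 1.
arr = np.zeros((8, 8, 3), dtype=np.uint8)
arr[:, 4:, :] = 1
label = prepare_label(Image.fromarray(arr))
print(label.mode, label.size, np.unique(np.asarray(label)))
```

The processed label can then be saved back with `label.save(...)` in a lossless format such as PNG, so that the class indices are not altered by compression.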

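Problem solved! Writing this up took some effort, and I am grateful to the many bloggers whose excellent posts helped. If you found this useful, likes and bookmarks are welcome ━(*｀∀´*)ノ亻!

Full error output: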

Traceback (most recent call last):
  File "train.py", line 166, in <module>
    main()
  File "train.py", line 162, in main
    meta=meta)
  File "/home/sd4t/why/workplace/swinTransformer/Swin-Transformer-Semantic-Segmentation-main/mmseg/apis/train.py", line 116, in train_segmentor
    runner.run(data_loaders, cfg.workflow)
  File "/home/sd4t/why/anaconda3/envs/myswint/lib/python3.7/site-packages/mmcv/runner/iter_based_runner.py", line 133, in run
    iter_runner(iter_loaders[i], **kwargs)
  File "/home/sd4t/why/anaconda3/envs/myswint/lib/python3.7/site-packages/mmcv/runner/iter_based_runner.py", line 60, in train
    outputs = self.model.train_step(data_batch, self.optimizer, **kwargs)
  File "/home/sd4t/why/anaconda3/envs/myswint/lib/python3.7/site-packages/mmcv/parallel/data_parallel.py", line 67, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "/home/sd4t/why/workplace/swinTransformer/Swin-Transformer-Semantic-Segmentation-main/mmseg/models/segmentors/base.py", line 153, in train_step
    losses = self(**data_batch)
  File "/home/sd4t/why/anaconda3/envs/myswint/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/sd4t/why/anaconda3/envs/myswint/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 98, in new_func
    return old_func(*args, **kwargs)
  File "/home/sd4t/why/workplace/swinTransformer/Swin-Transformer-Semantic-Segmentation-main/mmseg/models/segmentors/base.py", line 123, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/home/sd4t/why/workplace/swinTransformer/Swin-Transformer-Semantic-Segmentation-main/mmseg/models/segmentors/encoder_decoder.py", line 158, in forward_train
    gt_semantic_seg)
  File "/home/sd4t/why/workplace/swinTransformer/Swin-Transformer-Semantic-Segmentation-main/mmseg/models/segmentors/encoder_decoder.py", line 102, in _decode_head_forward_train
    self.train_cfg)
  File "/home/sd4t/why/workplace/swinTransformer/Swin-Transformer-Semantic-Segmentation-main/mmseg/models/decode_heads/decode_head.py", line 187, in forward_train
    losses = self.losses(seg_logits, gt_semantic_seg)
  File "/home/sd4t/why/anaconda3/envs/myswint/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 186, in new_func
    return old_func(*args, **kwargs)
  File "/home/sd4t/why/workplace/swinTransformer/Swin-Transformer-Semantic-Segmentation-main/mmseg/models/decode_heads/decode_head.py", line 232, in losses
    ignore_index=self.ignore_index)
  File "/home/sd4t/why/anaconda3/envs/myswint/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/sd4t/why/workplace/swinTransformer/Swin-Transformer-Semantic-Segmentation-main/mmseg/models/losses/cross_entropy_loss.py", line 197, in forward
    **kwargs)
  File "/home/sd4t/why/workplace/swinTransformer/Swin-Transformer-Semantic-Segmentation-main/mmseg/models/losses/cross_entropy_loss.py", line 30, in cross_entropy
    loss, weight=weight, reduction=reduction, avg_factor=avg_factor)
  File "/home/sd4t/why/workplace/swinTransformer/Swin-Transformer-Semantic-Segmentation-main/mmseg/models/losses/utils.py", line 48, in weight_reduce_loss
    loss = reduce_loss(loss, reduction)
  File "/home/sd4t/why/workplace/swinTransformer/Swin-Transformer-Semantic-Segmentation-main/mmseg/models/losses/utils.py", line 22, in reduce_loss
    return loss.mean()
RuntimeError: CUDA error: an illegal memory access was encountered
terminate called after throwing an instance of 'c10::Error'
  what(): CUDA error: an illegal memory access was encountered
Exception raised from create_event_internal at /pytorch/c10/cuda/CUDACachingAllocator.cpp:733 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7fcc84a542f2 in /home/sd4t/why/anaconda3/envs/myswint/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x5b (0x7fcc84a5167b in /home/sd4t/why/anaconda3/envs/myswint/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #2: c10::cuda::CUDACachingAllocator::raw_delete(void*) + 0x809 (0x7fcc84e151f9 in /home/sd4t/why/anaconda3/envs/myswint/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
frame #3: c10::TensorImpl::release_resources() + 0x54 (0x7fcc84a3c3a4 in /home/sd4t/why/anaconda3/envs/myswint/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #4: <unknown function> + 0x6e9eca (0x7fcc23ae9eca in /home/sd4t/why/anaconda3/envs/myswint/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #5: <unknown function> + 0x6e9f71 (0x7fcc23ae9f71 in /home/sd4t/why/anaconda3/envs/myswint/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #6: python() [0x4cb472]
frame #7: python() [0x4a0a87]
frame #8: python() [0x4b5cfb]
frame #9: python() [0x4b5cfb]
frame #10: python() [0x4b0858]
frame #11: python() [0x4c5b50]
frame #12: python() [0x4c5b66]
frame #13: python() [0x4c5b66]
frame #14: python() [0x4c5b66]
frame #15: python() [0x4c5b66]
frame #16: python() [0x4c5b66]
frame #17: python() [0x4946f7]
<omitting python frames>
frame #21: python() [0x53fc79]
frame #23: <unknown function> + 0x29d90 (0x7fcc9e3a9d90 in /lib/x86_64-linux-gnu/libc.so.6)
frame #24: __libc_start_main + 0x80 (0x7fcc9e3a9e40 in /lib/x86_64-linux-gnu/libc.so.6)
frame #25: python() [0x53f9ee]
