Full error: Inconsistency detected by ld.so: ../sysdeps/x86_64/dl-machine.h: 541: elf_machine_rela_relative: Assertion `ELFW(R_TYPE) (reloc->r_info) == R_X86_64_RELATIVE' failed!
Dropping into Python to investigate, I found that the torch package was corrupted.
I couldn't find a reliable fix anywhere online; in the end this Stack Overflow post solved it.
In summary:
1. Reinstall torch
2. If you then hit an error like this:
Channels:
- pytorch
- nvidia
- defaults
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: failed
LibMambaUnsatisfiableError: Encountered problems while solving:
- package pytorch-cuda-12.1-ha16c6d3_5 requires libcublas >=12.1.0.26,<12.1.3.1, but none of the providers can be installed
Could not solve for environment specs
The following packages are incompatible
├─ libcublas 11.10.3.66.* is requested and can be installed;
└─ pytorch-cuda 12.1** is not installable because it requires
└─ libcublas >=12.1.0.26,<12.1.3.1 , which conflicts with any installable versions previously reported.
3. Fix it with: conda install nvidia/label/cuda-12.1.0::NAMEHERE --solver=libmamba
Here, replace NAMEHERE with the conflicting package named in the error output above (in this case libcublas). Then repeat steps 2 and 3; after a few rounds everything installs cleanly.
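Putting the steps above together, the whole loop can be sketched as a short shell session. This is a sketch under assumptions, not the exact commands from the post: the channels, the pytorch-cuda=12.1 pin, and the example package names (libcublas and the like) are taken from the error output above, and your solver may report different packages.

```shell
# Step 1: reinstall torch (channels and the CUDA pin are illustrative)
conda install pytorch pytorch-cuda=12.1 -c pytorch -c nvidia --solver=libmamba

# Step 2/3: if the solver rejects a pinned NVIDIA package, install that
# exact package from the matching CUDA label, e.g. for libcublas:
conda install nvidia/label/cuda-12.1.0::libcublas --solver=libmamba

# Re-run the torch install; repeat for whichever package the solver
# complains about next, until the environment solves cleanly.

# Finally, verify that torch imports without the ld.so assertion:
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```

The `nvidia/label/cuda-12.1.0::` prefix pins the package to the channel label for that exact CUDA release, which is what resolves the version-range conflict the solver reports.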