1 undefined symbol: iJIT_NotifyEvent
Traceback (most recent call last):
File "/mnt/data4/home/xxx/GPAvatar/inference.py", line 11, in <module>
import torch
File "/mnt/data/home/xxx/miniforge3/envs/GPAvatar/lib/python3.9/site-packages/torch/__init__.py", line 229, in <module>
from torch._C import * # noqa: F403
ImportError: /mnt/data/home/xxx/miniforge3/envs/GPAvatar/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so: undefined symbol: iJIT_NotifyEvent
This bug occurs on import torch in an environment installed with:
mamba install pytorch==2.0.0 torchvision==0.15.0 torchaudio==2.0.0 pytorch-cuda=11.8 -c pytorch -c nvidia
mamba install nvidia/label/cuda-11.8.0::cuda-toolkit -c nvidia/label/cuda-11.8.0
With this setup the installed version is mkl=2024.2.2. Downgrade it to mkl=2024.0:
mamba install mkl==2024.0
Reference: https://github.com/pytorch/pytorch/issues/123097
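A minimal sketch for deciding whether the mkl downgrade above applies, given a version string such as the one shown by mamba list mkl. The helper name mkl_needs_downgrade is hypothetical, and treating every version newer than 2024.0 as affected is an assumption based only on the two versions mentioned in these notes.

```python
def mkl_needs_downgrade(version: str, last_good: str = "2024.0") -> bool:
    """Return True if `version` is newer than `last_good`.

    `last_good` defaults to 2024.0, the version these notes downgrade to.
    Assumption: every newer mkl release is affected.
    """
    def as_tuple(v: str) -> tuple:
        # "2024.2.2" -> (2024, 2, 2); pad with zeros so that
        # "2024.0" and "2024.0.0" compare as equal.
        parts = [int(p) for p in v.split(".")]
        return tuple(parts + [0] * (3 - len(parts)))

    return as_tuple(version) > as_tuple(last_good)


print(mkl_needs_downgrade("2024.2.2"))  # True
print(mkl_needs_downgrade("2024.0"))    # False
```

The parser only handles purely numeric dotted versions, which is enough for the mkl versions seen here.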
2 CUDA error: CUBLAS_STATUS_NOT_SUPPORTED
Install pytorch=1.11.0 with cuda=11.3 using the official command below:
mamba install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch
Calling torch.einsum then fails with the following error:
File "/mnt/data4/home/wangjiawei/eg3d/dataset_preprocessing/ffhq/Deep3DFaceRecon_pytorch/models/bfm.py", line 96, in compute_shape
id_part = torch.einsum('ij,aj->ai', self.id_base, id_coeff)
File "/mnt/data/home/wangjiawei/miniforge3/envs/deep3d_pytorch/lib/python3.8/site-packages/torch/functional.py", line 330, in einsum
return _VF.einsum(equation, operands) # type: ignore[attr-defined]
RuntimeError: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling `cublasSgemmStridedBatched( handle, opa, opb, m, n, k, &alpha, a, lda, stridea, b, ldb, strideb, &beta, c, ldc, stridec, num_batches)`
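Solution:
Install pytorch with pip instead. (My guess is that the way Conda installs cudatoolkit for this older version is the problem. From pytorch v1.13.0, which supports CUDA 11.7, the Conda install uses pytorch-cuda instead, so it should be fine there.)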
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
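To check the reinstalled environment, a small smoke test can run the same einsum equation as bfm.py. This is only a sketch: the function name einsum_smoke_test and the tensor shapes are hypothetical stand-ins for the real id_base and id_coeff, and it is not verified that such small shapes reproduce the original CUBLAS failure.

```python
def einsum_smoke_test():
    # Imported inside the function so that a broken torch install
    # (such as the import error in section 1) surfaces when the check runs.
    import torch

    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Hypothetical small shapes; the real tensors in bfm.py are larger.
    id_base = torch.randn(3, 5, device=device)   # (i, j)
    id_coeff = torch.randn(4, 5, device=device)  # (a, j)

    # Same equation as in compute_shape in bfm.py.
    id_part = torch.einsum('ij,aj->ai', id_base, id_coeff)

    # 'ij,aj->ai' equals id_coeff @ id_base^T. The tolerance is loose
    # because GPU matmuls may use reduced precision (TF32).
    expected = id_coeff @ id_base.T
    assert torch.allclose(id_part, expected, rtol=1e-2, atol=1e-2)
    return tuple(id_part.shape)
```

In a working environment, einsum_smoke_test() returns (4, 3). If the CUDA build is still broken, the call is expected to raise a RuntimeError like the one above.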