AlphaFold build error summary
Problem 1 (Dockerfile)
The apt-get update step fails with:
Err:13 https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu1804/x86_64 InRelease
Temporary failure resolving 'developer.download.nvidia.cn'
Err:14 https://developer.download.nvidia.cn/compute/machine-learning/repos/ubuntu1804/x86_64 InRelease
Temporary failure resolving 'developer.download.nvidia.cn'
This is immediately followed by a chain of further errors:
W: Failed to fetch https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/InRelease Temporary failure resolving 'developer.download.nvidia.cn'
W: Failed to fetch https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/InRelease Temporary failure resolving 'developer.download.nvidia.cn'
W: Some index files failed to download. They have been ignored, or old ones used instead.
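Solution: change the DNS servers that Docker uses
Edit the Docker daemon configuration:
sudo nano /etc/docker/daemon.json
Add a dns entry to the top-level JSON object (separate it from any existing keys with a comma):
"dns": ["114.114.114.114", "8.8.8.8", "119.29.29.29", "223.5.5.5"]
Reload and restart Docker so the change takes effect:
sudo systemctl daemon-reload
sudo systemctl restart docker
Then rerun the build command: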
docker build -f docker/Dockerfile -t alphafold .
This resolves the problem.
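Since daemon.json must be a complete, valid JSON document, it may help to see the whole file rather than only the dns line. The following is a minimal sketch that assumes the file does not exist yet and needs no other settings; the function name write_minimal_daemon_json and the scratch path are hypothetical. It refuses to overwrite an existing file, because a file that already has settings has to be edited by hand instead.

```shell
# Write a minimal daemon.json that contains only the dns key.
# Assumption: no other daemon settings are needed. An existing file is
# never overwritten, since it may hold settings that must be kept.
write_minimal_daemon_json() {
    target="$1"
    if [ -e "$target" ]; then
        echo "refusing to overwrite existing file: $target" >&2
        return 1
    fi
    cat > "$target" <<'EOF'
{
  "dns": ["114.114.114.114", "8.8.8.8", "119.29.29.29", "223.5.5.5"]
}
EOF
}

# Example with a hypothetical scratch path; review the result, then copy it
# to /etc/docker/daemon.json with sudo:
# write_minimal_daemon_json /tmp/daemon.json
```

Writing to a scratch path first keeps the step free of sudo and lets you inspect the JSON before Docker reads it.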
Problem 2 (Dockerfile)
Since version 19.03, Docker can provide GPU support without the separate nvidia-docker package.
However, using a GPU device directly in Docker can still fail with:
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
Installing nvidia-container-toolkit fixes this.
1. Add the nvidia-docker package repository
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \
sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
Without the repository added above, the install step may fail with:
E: Unable to locate package nvidia-container-toolkit
2. Install nvidia-container-toolkit
sudo apt-get install -y nvidia-container-toolkit
3. Restart Docker
sudo systemctl restart docker
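To confirm that the fix worked, one option is to run nvidia-smi inside a throwaway GPU container. This is a sketch only: check_docker_gpu is a hypothetical helper name, and the image tag in the usage comment is an assumption that may not exist on your registry, so substitute any CUDA-enabled image you already have.

```shell
# Run nvidia-smi in a temporary container with all GPUs attached.
# --gpus is the built-in Docker flag available since 19.03.
check_docker_gpu() {
    image="$1"
    if [ -z "$image" ]; then
        echo "usage: check_docker_gpu IMAGE" >&2
        return 2
    fi
    docker run --rm --gpus all "$image" nvidia-smi
}

# Example (the image tag is an assumption, not taken from these notes):
# check_docker_gpu nvidia/cuda:11.0-base
```

If the "could not select device driver" error is gone and the usual nvidia-smi table appears, the container runtime can see the GPU.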
Problem 3 (not Dockerfile)
With the official jaxlib, one of the following errors appears:
AttributeError: module 'jaxlib.pocketfft' has no attribute 'pocketfft'
or
RuntimeError: jaxlib version 0.3.25 is newer than and incompatible with jax version 0.3.17. Please update your jax and/or jaxlib packages.
Solution: install a matching set of versions:
pip install --upgrade jax==0.2.24 jaxlib==0.1.76 dm-haiku==0.0.4
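Both errors come from a jax/jaxlib version mismatch, so it can be useful to check what is actually installed before and after running pip. A small sketch, assuming pip is available in the active environment; print_jax_versions is a hypothetical helper name.

```shell
# List the installed versions of the three packages pinned above.
# "python -m pip" is used so that the active environment's interpreter
# is queried, not whichever pip happens to be first on PATH.
print_jax_versions() {
    python -m pip list 2>/dev/null | grep -i -E '^(jax|jaxlib|dm-haiku)[[:space:]]'
}

# Example:
# print_jax_versions
```

The function prints nothing and returns a non-zero status when none of the three packages is installed in that environment.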
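Problem 4 (not Dockerfile)
If the output contains "CUDA - Error computing forces with CUDA platform", the installed cudatoolkit version is wrong.
Solution:
Run nvidia-smi and note the CUDA Version it reports, then install the matching toolkit:
conda install -c conda-forge cudatoolkit=<CUDA Version>
The run also works without CUDA; it just takes longer.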
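Reading the version off the nvidia-smi banner by hand is easy to get wrong, so here is a sketch that extracts it from the text. extract_cuda_version is a hypothetical helper; it assumes the banner contains the text "CUDA Version: X.Y", which is how current drivers print it.

```shell
# Read nvidia-smi output on stdin and print the first "CUDA Version" value.
extract_cuda_version() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Example (not run here):
# conda install -c conda-forge cudatoolkit="$(nvidia-smi | extract_cuda_version)"
```

Note that nvidia-smi reports the highest CUDA version the driver supports, and conda-forge may not carry a cudatoolkit build with exactly that number; in that case a nearby version at or below it is the usual choice.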
Problem 5 (not Dockerfile)
ModuleNotFoundError: No module named 'pdbfixer'
Solution:
conda install -c conda-forge pdbfixer
Problem 6 (not Dockerfile)
ModuleNotFoundError: No module named 'mock'
Solution:
pip install mock
Problem 7 (not Dockerfile)
ValueError: Could not find path to the "jackhmmer" binary. Make sure it is installed on your system.
Solution:
conda install -c biocore hmmer
ValueError: Could not find path to the "kalign" binary. Make sure it is installed on your system.
Solution:
conda install -c bioconda kalign3
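These errors surface one binary at a time, so checking all of them up front saves repeated runs. A minimal sketch using only the shell's command -v lookup; check_binaries is a hypothetical helper, and the list of names to pass is up to you (jackhmmer, kalign, and hmmsearch are the ones mentioned in these notes).

```shell
# Report which of the named programs can be found on PATH.
# Returns 0 when all are found and 1 when at least one is missing.
check_binaries() {
    missing=0
    for name in "$@"; do
        if command -v "$name" >/dev/null 2>&1; then
            echo "found:   $name -> $(command -v "$name")"
        else
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}

# Example:
# check_binaries jackhmmer kalign hmmsearch
```

Run it inside the same conda environment that AlphaFold uses, since PATH differs between environments.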
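Problem 8
hmmsearch failed: Parse failed (sequence file /path/pdb_seqres/pdb_seqres.txt): Line 1360234: illegal character 0
Solution, taken from the issue https://github.com/deepmind/alphafold/issues/569:
First list the sequence lines (every second line of the file) that contain the character 0:
awk 'NR%2==0' pdb_seqres.txt | grep 0
Then delete the offending sequence line (here the one starting with CT05) together with its header line, writing the result to a new file. Reversing the file with tac lets sed remove the matching sequence line and the header that precedes it in the original order:
tac pdb_seqres.txt | sed '/^CT05/,+1d' | tac > pdb_seqres_fixed.txt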
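The sed pattern above is tied to one specific bad entry. A more general variant, sketched here with awk, drops every record whose sequence line contains the character 0. It is an alternative to the command from the issue, not the same method, and it relies on the same assumption as the awk check above: the file strictly alternates header line and sequence line. drop_records_with_zero is a hypothetical helper name.

```shell
# Print only the header/sequence pairs whose sequence line has no "0".
# Odd lines are treated as headers and may contain digits freely.
drop_records_with_zero() {
    awk 'NR % 2 == 1 { header = $0; next }
         index($0, "0") == 0 { print header; print $0 }' "$1"
}

# Example:
# drop_records_with_zero pdb_seqres.txt > pdb_seqres_fixed.txt
```

Afterwards, rerunning the awk and grep check on pdb_seqres_fixed.txt should print nothing.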