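Installing Chainer, with cudatoolkit and CuPy installed through conda
1. Installation
Chainer is a deep learning framework. To use GPU acceleration, CuPy must be installed before Chainer. For convenience, installing CuPy with conda is recommended.
Suppose conda is already installed, along with CUDA and its matching driver (run nvidia-smi to check the GPU and the driver version). Let's get started!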
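The prerequisites above can be checked before going further. The following is a minimal sketch, assuming that conda and nvidia-smi are expected to be on PATH; the helper name missing_tools is hypothetical and only illustrates the idea.

```python
import shutil


def missing_tools(names):
    """Return the command names from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # "conda" and "nvidia-smi" are the two commands the article relies on.
    missing = missing_tools(["conda", "nvidia-smi"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("conda and nvidia-smi are both available")
```

An empty result only says the commands exist; it does not confirm that the driver and CUDA versions match.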
Step 1: Create a conda environment named chainer
conda create -n chainer python=3.7
Step 2: Install cupy and cudatoolkit with conda
conda install -c conda-forge cupy cudatoolkit=10.1
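More CuPy installation details are in the official documentation: https://docs.cupy.dev/en/latest/install.html#faq
Verify that CuPy installed correctly. If the import raises no error, the installation succeeded.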
python
import cupy
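The same import check can be scripted so that a failure is reported as a message instead of a traceback. This is a minimal sketch; the helper name can_import is hypothetical, and it only uses the standard importlib module.

```python
import importlib


def can_import(module_name):
    """Try to import a module; return (ok, detail).

    On success, detail is the module's __version__ if it has one.
    On failure, detail is the ImportError message.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        return False, str(exc)
    return True, getattr(module, "__version__", "unknown version")


if __name__ == "__main__":
    ok, detail = can_import("cupy")
    print("cupy import OK:" if ok else "cupy import failed:", detail)
```

Run it with the chainer environment activated, since it checks whichever Python interpreter executes it.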
Step 3: Install Chainer
pip install chainer
More Chainer installation details: https://docs.chainer.org/en/stable/install.html
Check whether Chainer actually supports CUDA/cuDNN:
python
import chainer
# True if Chainer successfully imported CuPy
print(chainer.backends.cuda.available)
# True if cuDNN support is available
print(chainer.backends.cuda.cudnn_enabled)
2. Problems and solutions
2.1 Problem: CuPy installation fails
>>> import cupy
Traceback (most recent call last):
File "/public/home/jd_shb/miniconda3/envs/chainer/lib/python3.7/site-packages/cupy/__init__.py", line 16, in <module>
from cupy import _core # NOQA
File "/public/home/jd_shb/miniconda3/envs/chainer/lib/python3.7/site-packages/cupy/_core/__init__.py", line 1, in <module>
from cupy._core import core # NOQA
File "cupy/_core/core.pyx", line 1, in init cupy._core.core
File "/public/home/jd_shb/miniconda3/envs/chainer/lib/python3.7/site-packages/cupy/cuda/__init__.py", line 8, in <module>
from cupy.cuda import compiler # NOQA
File "/public/home/jd_shb/miniconda3/envs/chainer/lib/python3.7/site-packages/cupy/cuda/compiler.py", line 12, in <module>
from cupy.cuda import function
File "cupy/cuda/function.pyx", line 1, in init cupy.cuda.function
File "cupy/cuda/texture.pyx", line 1, in init cupy.cuda.texture
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/public/home/jd_shb/miniconda3/envs/chainer/lib/python3.7/site-packages/cupy/__init__.py", line 37, in <module>
raise ImportError(_msg) from e
ImportError: CuPy is not correctly installed.
If you are using wheel distribution (cupy-cudaXX), make sure that the version of CuPy you installed matches with the version of CUDA on your host.
Also, confirm that only one CuPy package is installed:
$ pip freeze
If you are building CuPy from source, please check your environment, uninstall CuPy and reinstall it with:
$ pip install cupy --no-cache-dir -vvvv
Check the Installation Guide for details:
https://docs.cupy.dev/en/latest/install.html
original error: libcuda.so.1: cannot open shared object file: No such file or directory
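Cause: the file libcuda.so.1 cannot be found. This shared library belongs to the GPU driver. We therefore need to locate the CUDA installation path, check whether the file is there, and then set the CUDA_PATH and LD_LIBRARY_PATH environment variables.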
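Whether the dynamic loader can find libcuda.so.1 can be tested directly, without importing CuPy. The following is a minimal sketch using the standard ctypes module; the helper name can_load_shared_library is hypothetical.

```python
import ctypes


def can_load_shared_library(name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # Raised when the library is missing from the loader's search path.
        return False
    return True


if __name__ == "__main__":
    print("libcuda.so.1 loadable:", can_load_shared_library("libcuda.so.1"))
```

The loader reads LD_LIBRARY_PATH when the process starts, so run this in a fresh shell after changing that variable.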
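2.2 Solution
I was installing on the GXU supercomputing cluster, where I found a libcuda.so file under /public/software/cuda-10.0/lib64/stubs/. The search command is shown below.
Step 1: Locate libcuda.so (the file that will stand in for libcuda.so.1)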
find /public/software/cuda-10.0/lib64/ -name libcuda.so
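CuPy is looking for libcuda.so.1 specifically, so I used this libcuda.so in its place by creating a symbolic link to it (similar to a Windows shortcut) named libcuda.so.1, which I put in the lib directory of the chainer virtual environment. Strictly speaking, the copy under stubs/ is a link-time stub shipped with the CUDA toolkit, while the real libcuda.so.1 is installed with the NVIDIA driver, so if the driver's library is present on the node, it is better to link to that one.
Step 2: Create the symbolic link: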
# ln -s old-file-path new-file-path
ln -s /public/software/cuda-10.0/lib64/stubs/libcuda.so /public/home/jd_yangfeng/anaconda3/envs/chainer/lib/libcuda.so.1
Once the link is created, libcuda.so.1 appears in the lib directory of the chainer virtual environment.
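To confirm that a link like this exists and is not dangling, a short check can be run. The following is a minimal sketch: the helper name link_target is hypothetical, and the demonstration uses empty placeholder files in a temporary directory instead of the real cluster paths.

```python
import os
import tempfile


def link_target(path):
    """Return the target of `path` if it is a symlink to an existing file, else None."""
    if os.path.islink(path) and os.path.exists(path):
        return os.readlink(path)
    return None


# Demonstration in a throwaway directory; the file names mirror the article,
# but the files themselves are empty placeholders.
with tempfile.TemporaryDirectory() as tmp:
    real = os.path.join(tmp, "libcuda.so")
    open(real, "w").close()
    link = os.path.join(tmp, "libcuda.so.1")
    os.symlink(real, link)  # equivalent to: ln -s <real> <link>
    demo_ok = link_target(link) == real
    print("link resolves to the expected file:", demo_ok)
```

On the real system, call link_target with the path of the libcuda.so.1 link created in Step 2.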
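Step 3: Add the environment variables
Open ~/.bashrc and add the following lines. CUDA_PATH is the CUDA installation path, and LD_LIBRARY_PATH is the directory that contains libcuda.so.1.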
export CUDA_PATH="/public/software/cuda-10.0"
export LD_LIBRARY_PATH="/public/home/jd_yangfeng/anaconda3/envs/chainer/lib:$LD_LIBRARY_PATH"
Run source on the file you just edited so the changes take effect immediately, without logging out and back in:
source ~/.bashrc
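After sourcing the file, it is worth confirming that the directory really is an entry in LD_LIBRARY_PATH. Any stray whitespace inside the quotes of an export line becomes part of a path entry, which then no longer matches the intended directory. The following is a minimal sketch; the helper names library_path_entries and has_directory are hypothetical.

```python
import os


def library_path_entries(value):
    """Split an LD_LIBRARY_PATH-style string on ':' and drop empty entries."""
    return [entry for entry in value.split(":") if entry]


def has_directory(value, directory):
    """Return True if `directory` is exactly one of the entries in `value`."""
    return directory in library_path_entries(value)


if __name__ == "__main__":
    current = os.environ.get("LD_LIBRARY_PATH", "")
    for entry in library_path_entries(current):
        # repr() makes accidental leading or trailing spaces visible.
        print(repr(entry))
```

Entries are compared as exact strings, so has_directory("/a/lib:/b/lib ", "/b/lib") is False because of the trailing space.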
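The environment variable setup in Step 3 follows this post: https://blog.csdn.net/u011440558/article/details/84291988
With that, the problem is solved: import cupy no longer raises an error, and Chainer imports CuPy successfully.
References
- Chainer official documentation: https://docs.chainer.org/en/stable/install.html
- CuPy community (conda-forge feedstock): https://github.com/conda-forge/cupy-feedstock
- CuPy official documentation: https://docs.cupy.dev/en/latest/install.html#faq
- Setting the LD_LIBRARY_PATH variable on Linux: https://blog.csdn.net/u011440558/article/details/84291988
Many thanks to the CuPy community for their helpful answers when my CuPy installation failed. If you run into problems, you can open an issue there too!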