[DeepLearning Notes] Frequently Used conda Commands and CUDA Setup Advice for Non-root Users

Latest recommended article published on 2024-07-17 10:08:50

six_gods

Article link: https://blog.csdn.net/six_gods/article/details/112389107


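Frequently used conda commands and CUDA setup advice for non-root users; worth bookmarking.

While debugging models recently I kept running into CUDA version problems. Online tutorials recommend building a multi-CUDA setup with symlinks and similar tricks, but that did not work well for me, so I wrote up this post. There is an Easter egg at the end~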

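Common conda commands

The first part of this post covers the basic conda commands.

Basic conda commands

Create an environment named env_name with Python version #.#: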

conda create -n env_name python=#.#

Remove the env_name environment:

conda remove -n env_name --all

Clone the old_env_name environment as new_env_name:

conda create -n new_env_name --clone old_env_name

Activate the env_name environment:

conda activate env_name

Leave the environment:

conda deactivate

List all environments:

conda env list
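When a script needs the list of environments, `conda env list --json` prints them in a machine-readable form. The sketch below parses that JSON and returns the environment names. The sample string and its paths are hypothetical; real prefixes depend on the machine.

```python
import json

def env_names(conda_json):
    """Return environment names from the output of `conda env list --json`."""
    prefixes = json.loads(conda_json)["envs"]
    # The last path component of each prefix is the environment name;
    # the base install appears under its own directory name.
    return [p.replace("\\", "/").rstrip("/").split("/")[-1] for p in prefixes]

# Hypothetical sample output, not taken from a real machine.
sample = '{"envs": ["/home/user/miniconda3", "/home/user/miniconda3/envs/env_name"]}'
print(env_names(sample))
```

On a real machine the input string could come from `subprocess.run(["conda", "env", "list", "--json"], capture_output=True, text=True).stdout`.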

Search for installable versions of the tool package:

conda search ${tool}

Install the tool package:

conda install ${tool}

List the packages installed in the current environment:

conda list

Practical conda commands

Clone an environment

conda create -n BBB --clone AAA

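Sharing an environment with a remote machine may require exporting it first.
Create a shareable environment file by exporting the environment to environment.yml: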

conda env export > environment.yml

Create the environment from the shared file:

conda env create -f environment.yml

If the environment cannot be created, the install path may be the cause.
Try deleting .condarc.
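Another possible reason a shared environment.yml fails on a different machine is that `conda env export` pins platform-specific build strings. As a minimal sketch of the idea, the helper below drops the build part of a `name=version=build` entry. The package lines are illustrative, not taken from a real export.

```python
def strip_build(spec):
    """Turn 'name=version=build' into 'name=version'; leave other specs untouched."""
    parts = spec.split("=")
    return "=".join(parts[:2]) if len(parts) == 3 else spec

# Illustrative dependency lines in the style of an exported environment.yml.
deps = ["numpy=1.18.1=py37h4f9e942_0", "python=3.7.6=h0371630_2", "pip"]
print([strip_build(d) for d in deps])
```

conda can also do this at export time with `conda env export --no-builds > environment.yml`.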

Remove unused packages

conda clean -p

Remove cached tarballs

conda clean -t

Remove all downloaded packages and caches

conda clean -y --all

Migrate an environment to a machine without conda

If the host has conda:

conda install -c conda-forge conda-pack

Without conda, conda-pack can also be installed from PyPI:

pip install conda-pack

If neither option is available, install it from source:

pip install git+https://github.com/conda/conda-pack.git

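Now suppose the environment is set up on the development machine, the development work is done, and the Python environment has to be moved to the production machine. The steps are as follows.

On the development machine (pick one of the three as needed):

# Pack the environment my_env into my_env.tar.gz
$ conda pack -n my_env

# Pack the environment my_env into out_name.tar.gz
$ conda pack -n my_env -o out_name.tar.gz

# Pack the environment at an explicit path into my_env.tar.gz
$ conda pack -p /explicit/path/to/my_env

On the production machine: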

# Unpack environment into directory `my_env`
$ mkdir -p my_env
$ tar -xzf my_env.tar.gz -C my_env

# Use python without activating or fixing the prefixes. Most python
# libraries will work fine, but things that require prefix cleanups
# will fail.
$ ./my_env/bin/python

# Activate the environment. This adds `my_env/bin` to your path
$ source my_env/bin/activate

# Run python from in the environment
(my_env) $ python

# Cleanup prefixes from in the active environment.
# Note that this command can also be run without activating the environment
# as long as some version of python is already installed on the machine.
(my_env) $ conda-unpack

# At this point the environment is exactly as if you installed it here
# using conda directly. All scripts should work fine.
(my_env) $ ipython --version

# Deactivate the environment to remove it from your path
(my_env) $ source my_env/bin/deactivate
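If the production machine is driven from a Python deployment script rather than a shell, the unpack step can be sketched with the standard library. This is only an equivalent of the `mkdir -p` and `tar -xzf` lines above, not part of conda-pack itself; `unpack_env` is a hypothetical helper name.

```python
import tarfile
from pathlib import Path

def unpack_env(archive, target):
    """Extract a packed environment archive into `target`, creating it first."""
    dest = Path(target)
    dest.mkdir(parents=True, exist_ok=True)   # same effect as `mkdir -p`
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)                  # same effect as `tar -xzf ... -C`
    return dest
```

A call such as `unpack_env("my_env.tar.gz", "my_env")` would then be followed by running `conda-unpack` from the extracted environment, as in the shell steps.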

Install a CUDA version that matches torch

Check the CUDA version

nvcc -V
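To compare the toolkit version against what a framework expects, the release number can be pulled out of the `nvcc -V` text. The sketch below assumes the output contains a line of the form "release X.Y", as in the sample string; the sample is illustrative.

```python
import re

def nvcc_release(nvcc_output):
    """Return the 'major.minor' release from `nvcc -V` output, or None if absent."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None

# Illustrative line in the style of `nvcc -V` output.
sample_nvcc = "Cuda compilation tools, release 9.2, V9.2.148"
print(nvcc_release(sample_nvcc))
```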

Install CUDA 8.0

conda install cudatoolkit=8.0 -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/linux-64/

Installing cudatoolkit usually installs cudnn 7.0.5 automatically as well; if it does not, install it manually:

conda install cudnn=7.0.5 -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/linux-64/

[Figure: CUDA and cuDNN version compatibility table]

Install torch with conda

Install with the conda command (the network connection may fail; installing with pip is recommended)

CUDA=cuda92 # cpu, cuda92, cuda100, or cuda101

conda install pytorch=1.4.0 torchvision ${CUDA} -c pytorch

Install torch with pip

CUDA=cu92 # cpu, cu92, cu100, or cu101

pip install torch==1.4.0+${CUDA} torchvision==0.5.0+${CUDA} -f https://download.pytorch.org/whl/torch_stable.html
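The version suffix in the pip command follows a simple pattern: the CUDA version with the dot removed, prefixed by `cu`. A small sketch of that mapping, using the torch 1.4.0 and torchvision 0.5.0 pair from the command above as defaults; the helper names are hypothetical.

```python
def cuda_tag(cuda_version):
    """Map '9.2' to 'cu92'; None means a CPU-only build."""
    return "cpu" if cuda_version is None else "cu" + cuda_version.replace(".", "")

def torch_requirements(cuda_version, torch="1.4.0", torchvision="0.5.0"):
    """Build the two requirement strings passed to `pip install`."""
    tag = cuda_tag(cuda_version)
    return [f"torch=={torch}+{tag}", f"torchvision=={torchvision}+{tag}"]

print(torch_requirements("9.2"))
```

Whether a wheel exists for a given combination still has to be checked against the index at https://download.pytorch.org/whl/torch_stable.html.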

Test whether CUDA is usable

import torch
print(torch.__version__)          # torch version, e.g. 1.4.0
print(torch.version.cuda)         # CUDA version torch was built against, e.g. 9.2
# In a shell: cat /usr/local/cuda/version.txt   # system CUDA version, e.g. CUDA Version 9.2
print(torch.cuda.is_available())  # whether CUDA is usable: True or False

Install tensorflow-gpu

Version compatibility between tensorflow and CUDA
[Figure: tensorflow and CUDA version compatibility table]

Install tensorflow-gpu with pip

pip install tensorflow-gpu==1.8.0

Install tensorflow-gpu with conda

conda install tensorflow-gpu==1.8.0

Because of version restrictions, the specified version may not be found; in that case try 1.15.0:

pip install tensorflow-gpu==1.15.0

Test whether tensorflow-gpu works

import tensorflow as tf
hello=tf.constant('hello,world')
sess=tf.Session()
print(sess.run(hello))

import tensorflow as tf
print(tf.test.is_gpu_available())

Select a GPU for training

import os
# Set this before the framework initializes CUDA
os.environ['CUDA_VISIBLE_DEVICES']='2'
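The same variable accepts a comma-separated list when several GPUs should be visible. A minimal sketch with a hypothetical helper `select_gpus`; it only sets the environment variable, so it has to run before torch or tensorflow initializes CUDA.

```python
import os

def select_gpus(indices):
    """Expose only the given physical GPU indices to this process."""
    value = ",".join(str(i) for i in indices)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value

# Expose physical GPU 2 only, as in the snippet above.
select_gpus([2])
```

Inside the process the visible GPUs are renumbered from 0, so physical GPU 2 is addressed as device 0 after this call.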

Specify download mirrors

conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/

conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/

conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch/

conda config --set show_channel_urls yes
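The order of these commands matters because `conda config --add channels` puts the new channel at the front of the list, so the channel added last is searched first. The sketch below only mimics that behaviour on a plain Python list to show the resulting order; it does not read or write .condarc, and the starting list `["defaults"]` is an assumption.

```python
def add_channel(channels, url):
    """Mimic `conda config --add channels`: the new channel moves to the front."""
    return [url] + [c for c in channels if c != url]

MIRROR = "https://mirrors.tuna.tsinghua.edu.cn/anaconda"

channels = ["defaults"]  # assumed starting configuration
for suffix in ("pkgs/free/", "pkgs/main/", "cloud/pytorch/"):
    channels = add_channel(channels, f"{MIRROR}/{suffix}")

for position, channel in enumerate(channels):
    print(position, channel)
```

After the three commands above, the pytorch mirror therefore comes first and `defaults` last.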

conda install cudatoolkit=8.0 -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/linux-64/

Install cudnn:

conda install cudnn=7.0.5 -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/linux-64/

conda install pytorch=0.3.0 torchvision=0.2.0 -c soumith

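Installing common torch companion packages

torch_geometric, torch_sparse, and torch_scatter are helper packages that PyTorch models frequently depend on. Their versions often conflict with one another, so they get their own section here.

CUDA=cu92 # cpu, cu92, cu100, or cu101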

Install torch-scatter

pip install torch-scatter==latest+${CUDA} -f https://pytorch-geometric.com/whl/torch-1.4.0.html

Install torch-sparse

pip install torch-sparse==latest+${CUDA} -f https://pytorch-geometric.com/whl/torch-1.4.0.html

Install torch-cluster

pip install torch-cluster==latest+${CUDA} -f https://pytorch-geometric.com/whl/torch-1.4.0.html

Install torch-spline-conv

pip install torch-spline-conv==latest+${CUDA} -f https://pytorch-geometric.com/whl/torch-1.4.0.html

Install torch-geometric

pip install torch-geometric
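Since the four companion packages share the same CUDA tag and wheel index, the commands above can be generated in install order. A minimal sketch with a hypothetical helper `companion_commands`; the URL pattern is taken from the commands above, and whether a matching page exists for other torch versions is an assumption.

```python
WHEEL_INDEX = "https://pytorch-geometric.com/whl/torch-{torch}.html"
PACKAGES = ["torch-scatter", "torch-sparse", "torch-cluster", "torch-spline-conv"]

def companion_commands(cuda_tag, torch="1.4.0"):
    """Return the pip commands in install order; torch-geometric goes last."""
    index = WHEEL_INDEX.format(torch=torch)
    commands = [f"pip install {name}==latest+{cuda_tag} -f {index}" for name in PACKAGES]
    commands.append("pip install torch-geometric")
    return commands

for command in companion_commands("cu92"):
    print(command)
```

Keeping one tag for all four packages avoids mixing builds for different CUDA versions, which is one way the conflicts mentioned above can arise.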

If torch-geometric still raises errors, a manual upgrade or downgrade may be needed:

pip install torch_geometric==1.4.1

Installing torch_geometric may run into a scikit-learn version problem with the error "It seems that scikit-learn has not been built correctly." In that case, install a specific version:

pip install scikit-learn==0.20.3

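torch_geometric 1.4.1 and later require the scikit-image package, which tends to raise errors. torch_geometric 1.3.9 can be used instead, but it is prone to scatter_add argument mismatch problems.
It is best to install scikit-learn and scikit-image first, and torch_geometric afterwards.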

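CUDA configuration in a shared environment

In a shared environment, users usually have no root access and cannot create symlinks under /usr/local/, and there is really no way around that. After a day of debugging I gave up for good: the web offers plenty of methods, but none of them worked :)))). It is better to just modify the code. Happy bug hunting~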

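A practical tip: as long as you have sufficient permissions and can modify your own environment variables, installing cudatoolkit and cudnn directly with conda install should succeed. If the conda install has no effect, the permissions are insufficient, so give up on changing it :)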
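As an illustration of what modifying the environment variables can look like, the sketch below builds an environment for a child process that points at a toolkit in a user-writable directory. The helper `cuda_env`, the `bin` and `lib64` layout, and the example path are assumptions; a conda-installed cudatoolkit may lay out its files differently.

```python
import os

def _prepend(new, existing):
    """Put `new` in front of an existing path list, if there is one."""
    return new if not existing else new + os.pathsep + existing

def cuda_env(cuda_home, base=None):
    """Return a copy of the environment that prefers the toolkit at `cuda_home`."""
    env = dict(os.environ if base is None else base)
    env["CUDA_HOME"] = cuda_home
    env["PATH"] = _prepend(os.path.join(cuda_home, "bin"), env.get("PATH", ""))
    env["LD_LIBRARY_PATH"] = _prepend(
        os.path.join(cuda_home, "lib64"), env.get("LD_LIBRARY_PATH", "")
    )
    return env

# Hypothetical toolkit location inside the user's home directory.
env = cuda_env(os.path.expanduser("~/cuda-10.1"))
print(env["PATH"].split(os.pathsep)[0])
```

The returned dictionary can be passed as `env=` to `subprocess.run`, so no root access and no change under /usr/local/ is involved.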

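Outside a shared environment, search for the CUDA toolkit by file name:

find / -name 'cuda-10*'

Or search file contents for cuda:

find . -type f | xargs grep -l 'cuda'

Once the CUDA directory is found, create a symlink to it: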

sudo rm -rf /usr/local/cuda   # remove the previously created symlink
sudo ln -s /usr/local/cuda-10.1 /usr/local/cuda

Finally, check that the change took effect. Finish~

nvcc --version    # show the current CUDA version
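The search and the final check can also be sketched in Python: list the versioned toolkit directories and see which one the symlink currently resolves to. The function names are hypothetical, and the default paths follow the /usr/local layout used above.

```python
import glob
import os

def cuda_installs(root="/usr/local"):
    """List versioned toolkit directories such as /usr/local/cuda-10.1."""
    return sorted(glob.glob(os.path.join(root, "cuda-*")))

def active_cuda(link="/usr/local/cuda"):
    """Return the directory the symlink resolves to, or None if it is missing."""
    return os.path.realpath(link) if os.path.exists(link) else None
```

After the `ln -s` step above, `active_cuda()` would be expected to return the cuda-10.1 directory.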

If you have questions, feel free to leave a comment; I will reply as soon as I see it :)

