[Deep Learning] Building a Docker environment to package a detectron2 segmentation network service (RTX 3080 Ti, CUDA 11.1)


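Getting a CUDA Docker image

List of Docker images: https://hub.docker.com/r/nvidia/cuda
Find the image that fits your setup (here, CUDA 11.1.1 with cuDNN 8 on CentOS 7).
Pull the image: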

sudo docker pull nvidia/cuda:11.1.1-cudnn8-devel-centos7

Start a container to verify that the environment works:

 sudo docker run -it --name mydet2 --rm --gpus all  nvidia/cuda:11.1.1-cudnn8-devel-centos7 nvidia-smi -l
Wed Oct 26 09:47:25 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.63.01    Driver Version: 470.63.01    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:04:00.0 Off |                  N/A |
| 33%   35C    P0    88W / 350W |      0MiB / 12053MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  Off  | 00000000:0B:00.0 Off |                  N/A |
| 33%   35C    P0     1W / 350W |      0MiB / 12053MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

The NVIDIA GPUs are usable inside the Docker container.
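
The run above uses nvidia-smi -l, which keeps reprinting the table until interrupted, and --rm, which deletes the container on exit. As a minimal sketch (not the author's exact commands, which are not shown), a one-shot GPU query and a persistent interactive container might be wrapped as follows. The function names are hypothetical, and /bin/bash as the entry command is an assumption; the docker and nvidia-smi flags are standard ones.

```shell
#!/usr/bin/env bash
# Image tag taken from the pull command above.
IMAGE="nvidia/cuda:11.1.1-cudnn8-devel-centos7"

# One-shot check: print one CSV line per GPU and exit (no -l loop).
gpu_check() {
  docker run --rm --gpus all "${1:-$IMAGE}" \
    nvidia-smi --query-gpu=name,memory.total,driver_version --format=csv,noheader
}

# Start a container that survives exit (no --rm), so that later
# docker cp / docker commit calls can refer to it by name.
# The name defaults to mydet2; /bin/bash is an assumed entry command.
start_dev_container() {
  docker run -it --name "${1:-mydet2}" --gpus all "${2:-$IMAGE}" /bin/bash
}
```

Both functions call docker directly, so they assume the current user is allowed to run it; if sudo is required, as in the commands above, put sudo in front of docker inside the functions.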

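Install PyTorch and detectron2 inside the mydet2 container

First install Anaconda (the log further down shows a Miniconda install under /root/miniconda3) and switch its package source to a mirror.

For choosing versions, see: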
https://blog.csdn.net/weixin_40293999/article/details/127377288

A small problem

At this point the container's clock reports UTC instead of local time:

(base) [root@fc52f21b15e6 /]# date
Wed Oct 26 10:03:03 UTC 2022

While fixing that, install vim and wget and switch the yum mirror as well.
Switching the yum mirror requires wget,
so install wget first:

yum install wget
Loaded plugins: fastestmirror, ovl
Loading mirror speeds from cached hostfile



 * base: mirrors.huaweicloud.com
 * extras: mirrors.tuna.tsinghua.edu.cn
 * updates: mirrors.tuna.tsinghua.edu.cn
base                                                                                                                                                   | 3.6 kB  00:00:00
https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/repodata/repomd.xml: [Errno 14] HTTPS Error 301 - Moved Permanently
Trying other mirror.


 One of the configured repositories failed (cuda),
 and yum doesn't have enough cached data to continue. At this point the only
 safe thing yum can do is fail. There are a few ways to work "fix" this:

     1. Contact the upstream for the repository and get them to fix the problem.

     2. Reconfigure the baseurl/etc. for the repository, to point to a working
        upstream. This is most often useful if you are using a newer
        distribution release than is supported by the repository (and the
        packages for the previous distribution release still work).

     3. Run the command with the repository temporarily disabled
            yum --disablerepo=cuda ...

     4. Disable the repository permanently, so yum won't use it by default. Yum
        will then just ignore the repository until you permanently enable it
        again or use --enablerepo for temporary usage:

            yum-config-manager --disable cuda
        or
            subscription-manager repos --disable=cuda

     5. Configure the failing repository to be skipped, if it is unavailable.
        Note that yum will try to contact the repo. when it runs most commands,
        so will have to try and fail each time (and thus. yum will be be much
        slower). If it is a very temporary problem though, this is often a nice
        compromise:

            yum-config-manager --save --setopt=cuda.skip_if_unavailable=true

failure: repodata/repomd.xml from cuda: [Errno 256] No more mirrors to try.
https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/repodata/repomd.xml: [Errno 14] HTTPS Error 301 - Moved Permanently


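Fix: rename the failing repo file so that yum skips it (run in /etc/yum.repos.d): mv cuda.repo cuda.repo.bk
Then run yum install wget again, and it succeeds.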
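
The rename can also be scripted so that it is safe to run more than once. The sketch below assumes the repo file lives in /etc/yum.repos.d (the CentOS default) and uses a hypothetical helper name, disable_repo_file. Option 3 in yum's own message (yum --disablerepo=cuda ...) is an alternative that leaves the file in place.

```shell
#!/usr/bin/env bash
# disable_repo_file NAME [DIR]
# Renames DIR/NAME.repo to DIR/NAME.repo.bk; yum only reads files ending
# in .repo, so the renamed file is ignored.
# Does nothing if the file is already renamed or absent.
disable_repo_file() {
  local name="$1"
  local dir="${2:-/etc/yum.repos.d}"
  if [ -f "$dir/$name.repo" ]; then
    mv "$dir/$name.repo" "$dir/$name.repo.bk"
    echo "disabled $dir/$name.repo"
  else
    echo "nothing to do: $dir/$name.repo not found"
  fi
}
```

Inside the container this would be used as disable_repo_file cuda, followed by yum install wget.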

Guide to switching the yum mirror: https://blog.csdn.net/MateSnake/article/details/124088310

Fixing the time zone:
On the host (outside the container):
sudo docker cp /usr/share/zoneinfo/ mydet2:/usr/share/
Inside the container:
ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
echo "Asia/Shanghai" > /etc/timezone

Verify the time:

date

(base) [root@fc52f21b15e6 yum.repos.d]# ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
(base) [root@fc52f21b15e6 yum.repos.d]# echo "Asia/Shanghai" > /etc/timezone
(base) [root@fc52f21b15e6 yum.repos.d]# date
Wed Oct 26 18:22:38 CST 2022
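
The two in-container commands can be collected into a small function that first checks that the zone file exists (it will not until the zoneinfo directory has been copied in). This is a sketch with a hypothetical name, set_timezone; the optional second argument, a root prefix, is only there so the function can be tried against a scratch directory.

```shell
#!/usr/bin/env bash
# set_timezone ZONE [ROOT]
# Points ROOT/etc/localtime at ROOT/usr/share/zoneinfo/ZONE and records the
# zone name in ROOT/etc/timezone. ROOT defaults to "" (the real filesystem).
set_timezone() {
  local zone="$1"
  local root="${2:-}"
  local zone_file="$root/usr/share/zoneinfo/$zone"
  if [ ! -f "$zone_file" ]; then
    echo "zone file not found: $zone_file" >&2
    return 1
  fi
  ln -sf "$zone_file" "$root/etc/localtime"
  # Plain ASCII quotes only: typographic quotes would be written into the file.
  echo "$zone" > "$root/etc/timezone"
}
```

Inside the container, set_timezone Asia/Shanghai followed by date should then report CST, as in the session above.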

Install vim:
yum -y install vim*

Guide to switching the conda and pip mirrors:
https://blog.csdn.net/weixin_40293999/article/details/126776913

2. Install PyTorch and detectron2

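The download failed several times and only succeeded after a few retries; the cause is unknown.
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge

Note that the first attempt in the log below left out the == after pytorch and torchvision, so conda treated pytorch1.8.0 and torchvision0.9.0 as package names and reported PackagesNotFoundError. The second attempt, with == in place, solved successfully.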

(base) [root@fc52f21b15e6 yum.repos.d]# conda install pytorch1.8.0 torchvision0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
PackagesNotFoundError: The following packages are not available from current channels:
  - torchvision0.9.0
  - pytorch1.8.0


and use the search bar at the top of the page.


(base) [root@fc52f21b15e6 yum.repos.d]# conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): done
Solving environment: done
## Package Plan ##
  environment location: /root/miniconda3
  added / updated specs:
    - cudatoolkit=11.1
    - pytorch==1.8.0
    - torchaudio==0.8.0
    - torchvision==0.9.0
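
Once the install finishes, it is worth confirming that the installed build actually sees the GPUs. The following is a minimal sketch, assuming that python on PATH is the interpreter of the conda environment shown above; verify_torch is a hypothetical name, while torch.__version__, torch.version.cuda, torch.cuda.is_available(), torch.cuda.device_count() and torch.cuda.get_device_name() are standard PyTorch attributes.

```shell
#!/usr/bin/env bash
# verify_torch: print the PyTorch and torchvision versions, the CUDA version
# the build was compiled against, and the name of every visible GPU.
verify_torch() {
  python - <<'EOF'
import torch
import torchvision

print("torch", torch.__version__)
print("torchvision", torchvision.__version__)
print("built for CUDA", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())
for i in range(torch.cuda.device_count()):
    print(i, torch.cuda.get_device_name(i))
EOF
}
```

If CUDA is reported as unavailable, the usual suspects are a container started without --gpus all or a CPU-only PyTorch build picked by the solver.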

Commit the container to an image and push it to Docker Hub

### Commit the container as an image
sudo docker commit mydet2 justin0114/cuda11.1_torch1.8_det2
### Push it to Docker Hub
sudo docker push justin0114/cuda11.1_torch1.8_det2
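
Without an explicit tag, both commands operate on :latest. A sketch that adds a tag and keeps the two steps together is given here; snapshot_and_push is a hypothetical helper, and pushing assumes that docker login has already been run for the target account.

```shell
#!/usr/bin/env bash
# snapshot_and_push CONTAINER REPOSITORY [TAG]
# Commits CONTAINER as REPOSITORY:TAG and pushes it.
# TAG defaults to latest; the push is skipped if the commit fails.
snapshot_and_push() {
  local container="$1"
  local repo="$2"
  local tag="${3:-latest}"
  docker commit "$container" "$repo:$tag" || return 1
  docker push "$repo:$tag"
}
```

For the image above this could be called as snapshot_and_push mydet2 justin0114/cuda11.1_torch1.8_det2 v1, where v1 is only an example tag. The container must still exist at that point, which is not the case for one started with --rm after it has exited.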

Summary

At this point the environment for detectron2 (det2) is in place.
