Setting Up an AI Training Environment (NVIDIA GPU Acceleration) on Windows 10

1. Environment

Edition: Windows 10 Enterprise
Version: 21H2
Installed on: 2022/3/13
OS build: 19044.1826
Experience: Windows Feature Experience Pack 120.2212.4180.0

2. Setting up the GPU-accelerated environment

a. Determine the host's GPU driver version: open "NVIDIA Control Panel" → "System Information" and read the driver version there.

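b. Determine which CUDA version the GPU driver supports (the CUDA version reported by the driver is reportedly the highest CUDA version that this driver release can support).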
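Steps a and b can also be done from a script. The sketch below assumes the driver's `nvidia-smi` tool is on PATH and that its header line has the usual "Driver Version: ... CUDA Version: ..." layout. The sample line and the helper name `parse_smi_header` are illustrative only and not part of the original setup.

```python
import re

# Illustrative header line in the layout printed by `nvidia-smi`.
# The version numbers are sample values, not measurements from this machine.
SAMPLE_HEADER = "| NVIDIA-SMI 516.01       Driver Version: 516.01       CUDA Version: 11.7     |"


def parse_smi_header(text):
    """Return (driver_version, cuda_version) found in nvidia-smi output, or None."""
    driver = re.search(r"Driver Version:\s*([\d.]+)", text)
    cuda = re.search(r"CUDA Version:\s*([\d.]+)", text)
    if driver is None or cuda is None:
        return None
    return driver.group(1), cuda.group(1)


print(parse_smi_header(SAMPLE_HEADER))
```

On a real machine the text would come from something like `subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout`.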

 

 

c. Use the NVIDIA website (Release Notes :: CUDA Toolkit Documentation) to confirm the highest CUDA version supported by the installed GPU driver.

Table 2. CUDA Toolkit and Minimum Required Driver Version for CUDA Minor Version Compatibility

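CUDA Toolkit | Minimum Required Driver Version*, Linux x86_64 | Minimum Required Driver Version*, Linux AArch64 | Minimum Required Driver Version*, Windows x86_64
CUDA 11.7.x | >=450.80.02 | | >=452.39
CUDA 11.6.x | >=450.80.02 | | >=452.39
CUDA 11.5.x | >=450.80.02 | | >=452.39
CUDA 11.4.x | >=450.80.02 | | >=452.39
CUDA 11.3.x | >=450.80.02 | | >=452.39
CUDA 11.2.x | >=450.80.02 | | >=452.39
CUDA 11.1 (11.1.0) | >=450.80.02 | | >=452.39
CUDA 11.0 (11.0.3) | >=450.36.06** | >=450.28.01** | >=451.22**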

* Using a Minimum Required Version that is different from Toolkit Driver Version could be allowed in compatibility mode -- please read the CUDA Compatibility Guide for details.

** CUDA 11.0 was released with an earlier driver version, but by upgrading to Tesla Recommended Drivers 450.80.02 (Linux) / 452.39 (Windows), minor version compatibility is possible across the CUDA 11.x family of toolkits.

The version of the development NVIDIA GPU Driver packaged in each CUDA Toolkit release is shown below.

Table 3. CUDA Toolkit and Corresponding Driver Versions

CUDA Toolkit | Toolkit Driver Version, Linux x86_64 | Toolkit Driver Version, Windows x86_64
CUDA 11.7 GA | >=515.43.04 | >=516.01
CUDA 11.6 Update 2 | >=510.47.03 | >=511.65
CUDA 11.6 Update 1 | >=510.47.03 | >=511.65
CUDA 11.6 GA | >=510.39.01 | >=511.23
CUDA 11.5 Update 2 | >=495.29.05 | >=496.13
CUDA 11.5 Update 1 | >=495.29.05 | >=496.13
CUDA 11.5 GA | >=495.29.05 | >=496.04
CUDA 11.4 Update 4 | >=470.82.01 | >=472.50
CUDA 11.4 Update 3 | >=470.82.01 | >=472.50
CUDA 11.4 Update 2 | >=470.57.02 | >=471.41
CUDA 11.4 Update 1 | >=470.57.02 | >=471.41
CUDA 11.4.0 GA | >=470.42.01 | >=471.11
CUDA 11.3.1 Update 1 | >=465.19.01 | >=465.89
CUDA 11.3.0 GA | >=465.19.01 | >=465.89
CUDA 11.2.2 Update 2 | >=460.32.03 | >=461.33
CUDA 11.2.1 Update 1 | >=460.32.03 | >=461.09
CUDA 11.2.0 GA | >=460.27.03 | >=460.82
CUDA 11.1.1 Update 1 | >=455.32 | >=456.81
CUDA 11.1 GA | >=455.23 | >=456.38
CUDA 11.0.3 Update 1 | >=450.51.06 | >=451.82
CUDA 11.0.2 GA | >=450.51.05 | >=451.48
CUDA 11.0.1 RC | >=450.36.06 | >=451.22
CUDA 10.2.89 | >=440.33 | >=441.22
CUDA 10.1 (10.1.105 general release, and updates) | >=418.39 | >=418.96
CUDA 10.0.130 | >=410.48 | >=411.31
CUDA 9.2 (9.2.148 Update 1) | >=396.37 | >=398.26
CUDA 9.2 (9.2.88) | >=396.26 | >=397.44
CUDA 9.1 (9.1.85) | >=390.46 | >=391.29
CUDA 9.0 (9.0.76) | >=384.81 | >=385.54
CUDA 8.0 (8.0.61 GA2) | >=375.26 | >=376.51
CUDA 8.0 (8.0.44) | >=367.48 | >=369.30
CUDA 7.5 (7.5.16) | >=352.31 | >=353.66
CUDA 7.0 (7.0.28) | >=346.46 | >=347.62
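The Table 3 lookup can be sketched in a few lines of Python. This uses only a subset of the Windows x86_64 column, and the helper names `version_key` and `newest_toolkit_for` are hypothetical. It is a conservative check, since Table 2 suggests older drivers (>=452.39 on Windows) may still run CUDA 11.x toolkits through minor version compatibility.

```python
# Subset of the Windows x86_64 column of Table 3, newest toolkit first.
TOOLKIT_DRIVERS = [
    ("CUDA 11.7 GA", "516.01"),
    ("CUDA 11.6 Update 2", "511.65"),
    ("CUDA 11.6 GA", "511.23"),
    ("CUDA 11.5 GA", "496.04"),
    ("CUDA 11.4.0 GA", "471.11"),
    ("CUDA 11.3.0 GA", "465.89"),
    ("CUDA 11.2.0 GA", "460.82"),
    ("CUDA 11.1 GA", "456.38"),
    ("CUDA 11.0.1 RC", "451.22"),
    ("CUDA 10.2.89", "441.22"),
]


def version_key(version):
    """Turn '511.65' into (511, 65) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def newest_toolkit_for(driver_version):
    """Return the newest listed toolkit whose Toolkit Driver Version is met, or None."""
    for toolkit, minimum in TOOLKIT_DRIVERS:
        if version_key(driver_version) >= version_key(minimum):
            return toolkit
    return None


print(newest_toolkit_for("512.15"))
```

A driver older than every listed row yields `None`, which here only means the subset above does not cover it.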

For convenience, the NVIDIA driver is installed as part of the CUDA Toolkit installation. Note that this driver is for development purposes and is not recommended for use in production with Tesla GPUs.

For running CUDA applications in production with Tesla GPUs, it is recommended to download the latest driver for Tesla GPUs from the NVIDIA driver downloads site at Official Drivers | NVIDIA.

During the installation of the CUDA Toolkit, the installation of the NVIDIA driver may be skipped on Windows (when using the interactive or silent installation) or on Linux (by using meta packages).

For more information on customizing the install process on Windows, see Installation Guide Windows :: CUDA Toolkit Documentation.

For meta packages on Linux, see Installation Guide Linux :: CUDA Toolkit Documentation

d. Download the matching CUDA version from the NVIDIA website (CUDA Toolkit Archive | NVIDIA Developer). A registered NVIDIA account may be required for the download.

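e. Download the cuDNN release that matches the CUDA version from the NVIDIA website (cuDNN Archive | NVIDIA Developer). Extract the cuDNN archive, which yields three folders and one txt file, and copy all of them into the CUDA installation directory (C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\V11.6).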
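The copy in step e can be scripted. The following is a minimal sketch that merges an extracted cuDNN folder into the CUDA directory. The function name `merge_tree` is hypothetical, and writing under C:\Program Files normally needs an administrator prompt.

```python
import shutil
from pathlib import Path


def merge_tree(src_dir, dst_dir):
    """Copy every file and folder from src_dir into dst_dir, merging existing folders."""
    src, dst = Path(src_dir), Path(dst_dir)
    copied = []
    for item in src.iterdir():
        target = dst / item.name
        if item.is_dir():
            # dirs_exist_ok requires Python 3.8+; existing files with the same name are overwritten
            shutil.copytree(item, target, dirs_exist_ok=True)
        else:
            shutil.copy2(item, target)
        copied.append(item.name)
    return sorted(copied)
```

Usage would look like `merge_tree(r"D:\Downloads\cudnn", r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\V11.6")`, where the first path is a placeholder for wherever the archive was extracted.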

f. Install Anaconda3-2022.05-Windows-x86_64

g. Install PyTorch following the official site (Start Locally | PyTorch)

 

cd D:\SF\03.Win\01.Env\02.Python\Anaconda3-2022.05-Windows-x86_64-ai\Scripts

pip3.exe install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116

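When the pip3.exe install command above is run in cmd, the message "Defaulting to user installation because normal site-packages is not writeable" may appear. This happens because Anaconda3 was installed for all users (the installer asked for administrator privileges), while the cmd window that calls pip3 was opened under the current non-administrator account. Opening cmd as administrator resolves the problem.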
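To confirm this diagnosis before reinstalling anything, one can check whether the interpreter's global site-packages directories are writable by the current account. This is only a rough approximation of the check pip performs, and `site_packages_status` is a hypothetical helper name.

```python
import os
import site
import sys


def site_packages_status():
    """Return {path: writable?} for this interpreter's global site-packages folders."""
    return {path: os.access(path, os.W_OK) for path in site.getsitepackages()}


print(sys.executable)
for path, writable in site_packages_status().items():
    print(path, "writable" if writable else "NOT writable")
```

If every entry is reported as not writable in a normal cmd window but writable in an administrator one, the all-users installation is the likely cause.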

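When the pip3.exe install command above is run in cmd, the error "Can't connect to HTTPS URL because the SSL module is not available" may appear. This happens because Anaconda3's Library\bin directory has not been added to the system PATH. Run the following in cmd:

set PATH=%PATH%;D:\SF\03.Win\01.Env\02.Python\Anaconda3-2022.05-Windows-x86_64-ai\Library\bin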
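Both halves of this problem can be checked from Python: whether the `ssl` module imports at all, and whether a given directory is already on PATH. A small sketch follows, with hypothetical helper names. The `;` separator is hard-coded because the check targets Windows-style PATH values.

```python
import os


def ssl_available():
    """Return True if the ssl module can be imported by this interpreter."""
    try:
        import ssl  # noqa: F401  (fails when the OpenSSL DLLs cannot be loaded)
    except ImportError:
        return False
    return True


def path_contains(directory, path_value=None):
    """Case-insensitive check whether directory is an entry of a Windows PATH string."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    wanted = directory.rstrip("\\/").lower()
    entries = [entry.rstrip("\\/").lower() for entry in path_value.split(";") if entry]
    return wanted in entries


print("ssl importable:", ssl_available())
```

Calling `path_contains` with the Library\bin directory from the `set PATH` command shows whether that command is still needed in the current cmd session.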

If the pip3.exe install command above downloads slowly, a proxy can be set for the cmd session:

set http_proxy=http://127.0.0.1:1189
set https_proxy=http://127.0.0.1:1189
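When pip is launched from a Python script rather than typed into cmd, the same two variables can be handed to the child process only. A minimal sketch, assuming the local proxy address used above; `with_proxy` is a hypothetical helper.

```python
import os


def with_proxy(proxy_url, base_env=None):
    """Return a copy of the environment with http_proxy and https_proxy set."""
    env = dict(os.environ if base_env is None else base_env)
    env["http_proxy"] = proxy_url
    env["https_proxy"] = proxy_url
    return env


# Demonstrate on an empty base environment so the result is easy to read.
print(with_proxy("http://127.0.0.1:1189", base_env={}))
```

The returned dictionary can be passed as `env=` to `subprocess.run`. pip also accepts a `--proxy` option, which avoids environment variables altogether.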

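To test whether the installation succeeded, start python and run:

import torch
print(torch.__version__)

If this prints a version number without raising an error, PyTorch is installed correctly.

torch.cuda.is_available()

This returns True when CUDA is usable.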
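A slightly fuller check gathers the same information in one place, including the CUDA version the wheel was built for and the detected GPU names. This is a sketch only. `torch_report` is a hypothetical helper, and it returns `None` when PyTorch is not installed in the interpreter that runs it.

```python
def torch_report():
    """Collect PyTorch version and CUDA status, or return None if torch is missing."""
    try:
        import torch
    except ImportError:
        return None
    report = {
        "torch": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
        "devices": [],
    }
    if report["cuda_available"]:
        report["devices"] = [
            torch.cuda.get_device_name(index)
            for index in range(torch.cuda.device_count())
        ]
    return report


print(torch_report())
```

If `built_for_cuda` is `None`, the CPU-only wheel was installed, which usually means the `--extra-index-url` part of the install command did not take effect.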

 

h. Install TensorFlow

pip3.exe install tensorflow-gpu==2.9.0

 

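Test whether the installation succeeded. tf.test.is_gpu_available() is deprecated in TensorFlow 2.x, so list the GPU devices instead; a non-empty list means the GPU is visible:

>>> import tensorflow as tf
>>> print(tf.config.list_physical_devices('GPU'))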

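The TensorFlow check can be wrapped the same way as the PyTorch one. The sketch below uses `tf.test.is_built_with_cuda()` and `tf.config.list_physical_devices("GPU")`. The helper name `tensorflow_report` is hypothetical, and it returns `None` when TensorFlow is not installed.

```python
def tensorflow_report():
    """Collect TensorFlow version and GPU visibility, or return None if TF is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return {
        "tensorflow": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "gpus": [device.name for device in tf.config.list_physical_devices("GPU")],
    }


print(tensorflow_report())
```

An empty `gpus` list with `built_with_cuda` set to True usually points at missing CUDA or cuDNN DLLs rather than at the TensorFlow package itself.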
 

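3. FAQ

3.1. pip reports that a file name is too long when installing packages in a conda environment

Enable Long Paths in Windows 10 by importing the following registry (.reg) file: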

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem]
"LongPathsEnabled"=dword:00000001

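Whether the registry value is already set can be read back with the standard `winreg` module. This is a sketch with a hypothetical function name. It returns `None` on non-Windows systems or when the value does not exist.

```python
import sys


def long_paths_enabled():
    """Return True/False on Windows, or None elsewhere or if the value is absent."""
    if sys.platform != "win32":
        return None
    import winreg  # only available on Windows

    key_path = r"SYSTEM\CurrentControlSet\Control\FileSystem"
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
            value, _value_type = winreg.QueryValueEx(key, "LongPathsEnabled")
    except OSError:
        return None
    return value == 1


print(long_paths_enabled())
```

Reading the key does not need administrator rights. Changing it, as the .reg file does, normally does.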
3.2. Installing a package with conda fails in a conda environment

Try installing it with pip instead.

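3.3. Installing a package with pip in a conda environment reports "ClobberError: This transaction has incompatible packages due to a shared path", or "cudart64_101.dll not found" appears after installing tensorflow-gpu with pip

Edit the file C:\Users\yu\.condarc in the user's home directory. The original content is:

channels:
- defaults

Change channels to the following: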

channels:
- conda-forge

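3.4. When installing Anaconda3, the "current user only" option can be chosen; otherwise, software that uses Anaconda3 must be run as administrator

3.5. Setting or modifying environment variables in cmd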

set PATH=%PATH%;D:\SF\03.Win\01.Env\02.Python\Anaconda3-2022.05-Windows-x86_64-3.9.12\envs\conda-py-3.7\Library\bin

set PATH=%PATH%;D:\SF\03.Win\01.Env\02.Python\Anaconda3-2022.05-Windows-x86_64-3.9.12\Library\bin
set PATH=%PATH%;D:\SF\03.Win\01.Env\02.Python\Anaconda3-2022.05-Windows-x86_64-3.9.12-jm\Library\bin

3.6. Setting a proxy in cmd

set https_proxy=http://127.0.0.1:8002
set http_proxy=http://127.0.0.1:8002
