[2025 Deep Learning Environment Setup, Part 2] Building a Local Deep Learning Environment with PyTorch + Docker + VS Code + Dev Containers

tao355667

Last modified: 2025-03-02 17:26:56


Category: Deep Learning Environment Setup. Tags: deep learning, pytorch, docker

First published: 2025-02-24 22:31:40

Article link: https://blog.csdn.net/m0_63070489/article/details/145813739


Previous article: [2025 Deep Learning Environment Setup, Part 1] Unlocking GPU Acceleration on Win11 with WSL2 and Docker

Start Docker first!
If anything in the files below is unclear, ask an AI assistant.

1. Pull the PyTorch image with Docker, start a container, and test the GPU

docker pull pytorch/pytorch:2.5.0-cuda12.4-cudnn9-devel

docker run -it --rm --gpus all pytorch/pytorch:2.5.0-cuda12.4-cudnn9-devel nvidia-smi

Don't forget --gpus all; without it the container cannot access the GPU.

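If the GPU information is printed, containers based on this image can use the GPU. Next, we will use this image in our development environment through the VS Code Dev Containers extension.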
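The same check can be scripted. The following is a minimal sketch, assuming Docker is on the PATH and the image has already been pulled; the function names `gpu_check_command` and `container_sees_gpu` are made up for illustration. It runs the same command as above without `-it`, since no interactive terminal is needed, and treats a zero exit status from `nvidia-smi` as success.

```python
import subprocess

IMAGE = "pytorch/pytorch:2.5.0-cuda12.4-cudnn9-devel"


def gpu_check_command(image: str = IMAGE) -> list[str]:
    """Build the `docker run ... nvidia-smi` command used above, minus -it."""
    return ["docker", "run", "--rm", "--gpus", "all", image, "nvidia-smi"]


def container_sees_gpu(image: str = IMAGE, run=subprocess.run) -> bool:
    """Return True if nvidia-smi exits with status 0 inside the container.

    `run` is injectable so the function can be exercised without Docker.
    """
    try:
        result = run(gpu_check_command(image), capture_output=True, text=True)
    except FileNotFoundError:
        # The docker executable itself was not found
        return False
    return result.returncode == 0
```

Calling `container_sees_gpu()` returns `False` when Docker is missing, not running, or unable to expose the GPU, so a failure still needs to be diagnosed by running the command by hand.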

2. Install the VS Code extension

Install the Dev Containers extension from the Extensions view in VS Code.

3. Create the project files (a Python program that tests PyTorch and the GPU)

The project files are also available on GitHub.

Create a folder named pytorch-test and add the files and folders listed below to it. Only app.py and .devcontainer are essential; the rest are optional.

The files to create and their contents are as follows:

requirements.txt

This file is empty for now.

app.py

import torch


def print_gpu_info():
    # Check whether CUDA is available
    cuda_available = torch.cuda.is_available()
    print(f"CUDA available: {cuda_available}")

    if not cuda_available:
        return

    # Number of visible GPUs
    device_count = torch.cuda.device_count()
    print(f"\nNumber of available GPUs: {device_count}")

    # Print details for each GPU
    for i in range(device_count):
        print(f"\n=== GPU {i} ===")
        print(f"Name: {torch.cuda.get_device_name(i)}")
        prop = torch.cuda.get_device_properties(i)
        print(f"Total memory: {prop.total_memory / 1024**3:.2f} GiB")
        print(f"Multiprocessor count: {prop.multi_processor_count}")
        print(f"Compute capability: {prop.major}.{prop.minor}")

def test_gpu_operation():
    # Try a small computation on the GPU
    if torch.cuda.is_available():
        try:
            # Create the test tensors directly on the GPU
            x = torch.randn(3, 3, device="cuda")
            y = torch.randn(3, 3, device="cuda")
            z = x + y  # Runs on the GPU

            # Confirm which device holds the result
            print("\n=== GPU operation test ===")
            print(f"Tensor device: {z.device}")
            print("GPU computation succeeded!")
            return True
        except Exception as e:
            print(f"\nGPU operation failed: {e}")
            return False
    else:
        print("No GPU available for testing")
        return False

if __name__ == "__main__":
    print("===== PyTorch GPU info =====")
    print_gpu_info()

    print("\n===== GPU functional test =====")
    test_result = test_gpu_operation()

    print("\n===== Final status =====")
    print(f"GPU available: {torch.cuda.is_available()}")
    print(f"GPU operation test passed: {test_result}")
    print(f"PyTorch version: {torch.__version__}")

.devcontainer/devcontainer.json

// For format details, see https://aka.ms/devcontainer.json. For config options, see the
// README at: https://github.com/devcontainers/templates/tree/main/src/docker-existing-dockerfile
{
	"name": "GPU Development,torch2.5+cu124+cudnn9,Py3.11.10",
	"runArgs": [
		"--gpus=all"  // Enable GPU support
	],
	"build": {
		// Sets the run context to one level up instead of the .devcontainer folder.
		"context": "..",
		// Update the 'dockerFile' property if you aren't using the standard 'Dockerfile' filename.
		"dockerfile": "Dockerfile"
	},
	"customizations": {
		"vscode": {
			"extensions": [
				"ms-python.python",
				"ms-toolsai.jupyter",
				"ms-python.autopep8",
				"ms-python.vscode-pylance",
				"mechatroner.rainbow-csv",
				"ms-azuretools.vscode-docker",
				"ms-toolsai.datawrangler"
			]
		}
	}

	// Features to add to the dev container. More info: https://containers.dev/features.
	// "features": {},

	// Use 'forwardPorts' to make a list of ports inside the container available locally.
	// "forwardPorts": [],

	// Uncomment the next line to run commands after the container is created.
	// "postCreateCommand": "cat /etc/os-release",

	// Configure tool-specific properties.
	// "customizations": {},

	// Uncomment to connect as an existing user other than the container default. More info: https://aka.ms/dev-containers-non-root.
	// "remoteUser": "devcontainer"
}
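Because devcontainer.json allows `//` comments, Python's `json` module cannot load it directly. As a quick sanity check on the file above, here is a minimal sketch that strips line comments and then confirms that `runArgs` requests the GPU. The helper names `strip_line_comments` and `requests_gpu` are hypothetical, and the sketch handles neither `/* */` block comments nor trailing commas.

```python
import json


def strip_line_comments(text: str) -> str:
    """Remove // comments that appear outside JSON string literals."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip to the end of the line; the newline itself is kept
            while i < len(text) and text[i] != "\n":
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def requests_gpu(devcontainer_text: str) -> bool:
    """Return True if runArgs contains "--gpus=all" or "--gpus", "all"."""
    config = json.loads(strip_line_comments(devcontainer_text))
    run_args = config.get("runArgs", [])
    split_form = any(
        a == "--gpus" and b == "all" for a, b in zip(run_args, run_args[1:])
    )
    return "--gpus=all" in run_args or split_form
```

For example, `requests_gpu(open(".devcontainer/devcontainer.json", encoding="utf-8").read())` should return `True` for the configuration shown above.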

.devcontainer/Dockerfile

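# Use the official PyTorch image as the base image
FROM pytorch/pytorch:2.5.0-cuda12.4-cudnn9-devel

# Set the working directory inside the container
WORKDIR /workspace

# Copy the local project files into the container
COPY . /workspace

# Install extra dependencies, if any
RUN pip install --no-cache-dir -r requirements.txt

# Expose a port if needed
# EXPOSE 8000

# Command to run when the container starts
# CMD ["python", "app.py"]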

README.md

## Importing and exporting the pip environment
Install the environment from requirements.txt:
`pip install --no-cache-dir -r requirements.txt`
Export the environment to requirements.txt:
`pip freeze | grep -v '@ file://' > requirements.txt`
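A note on the export command in this README: `pip freeze` prints entries of the form `name @ file:///...` for packages installed from a local path, and those paths are meaningless on another machine, so `grep -v '@ file://'` drops them. The sketch below shows the same filter in Python for shells without `grep`. The function names are hypothetical, and it assumes `pip` is importable by the current interpreter.

```python
import subprocess
import sys


def filter_freeze_output(freeze_output: str) -> str:
    """Drop lines containing '@ file://', like `grep -v '@ file://'`."""
    kept = [line for line in freeze_output.splitlines() if "@ file://" not in line]
    return "\n".join(kept) + ("\n" if kept else "")


def export_requirements(path: str = "requirements.txt") -> None:
    """Write the filtered `pip freeze` output of this interpreter to `path`."""
    frozen = subprocess.run(
        [sys.executable, "-m", "pip", "freeze"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    with open(path, "w", encoding="utf-8") as f:
        f.write(filter_freeze_output(frozen))
```

Calling `export_requirements()` from the project root inside the container overwrites requirements.txt, just as the shell redirection does.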

4. Open the project and use the container environment

Open the project folder in VS Code.
Press F1 and choose "Dev Containers: Reopen in Container" from the command palette at the top.

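Now look at the bottom-left corner of VS Code. If the blue indicator with white text reads Dev Container: GPU Development,torch2.5+.., the pytorch-test project is running inside a container built from the PyTorch image we pulled earlier.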

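Find app.py in the Explorer on the left and run it. If the number of available GPUs is greater than 0, Python programs in the pytorch-test project can use the GPU. From now on, follow these same steps whenever you need to run deep learning code; no separate Python environment has to be installed. If you need other packages, edit requirements.txt.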


5. What if I need other Python packages?

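If you need other Python packages, install them directly in the container's terminal. Once you have confirmed they work, run pip freeze | grep -v '@ file://' > requirements.txt to export the packages in the current Python environment to requirements.txt.
The next time the dev container image is built (for example with Dev Containers: Rebuild Container), the pip install step in the Dockerfile installs everything listed in requirements.txt. Simply reopening an existing container does not rerun that step.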
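To confirm from inside the container that the environment matches requirements.txt, a check along these lines may help. It is a minimal sketch that only understands `name==version` lines, which is the form `pip freeze` writes, and the function names `parse_pins` and `find_mismatches` are made up for illustration.

```python
from importlib import metadata


def parse_pins(requirements_text: str) -> dict[str, str]:
    """Collect `name==version` pins, ignoring blank lines and # comments."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins, installed_version=metadata.version):
    """Return {name: (wanted, found)} for pins that are missing or differ.

    `found` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in pins.items():
        try:
            found = installed_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

Running `find_mismatches(parse_pins(open("requirements.txt", encoding="utf-8").read()))` returns an empty dict when every pinned package is installed at the pinned version.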