PyTorch GPU Environment Setup: Blog Navigation

Installation

Install Visual Studio (required by CUDA)

Visual Studio 2017, 2019, or 2022 all work.

Install CUDA

CUDA and cuDNN installation tutorial (very detailed)
Check the installed CUDA version
Why the CUDA versions differ: nvidia-smi vs. nvcc -V
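
The two commands report different things: nvidia-smi shows the highest CUDA version the installed driver supports, while nvcc -V shows the version of the CUDA toolkit that is actually installed. A minimal Python sketch for pulling both numbers out of the command output is given here. The helper names are illustrative, and the regular expressions assume the usual wording of the output, which may differ between driver or toolkit releases.

```python
import re
import subprocess
from typing import List, Optional


def parse_nvcc_version(text: str) -> Optional[str]:
    """Return the toolkit version from `nvcc -V` output, e.g. '11.8'."""
    match = re.search(r"release\s+(\d+\.\d+)", text)
    return match.group(1) if match else None


def parse_smi_version(text: str) -> Optional[str]:
    """Return the driver-supported CUDA version from `nvidia-smi` output."""
    match = re.search(r"CUDA Version:\s*(\d+\.\d+)", text)
    return match.group(1) if match else None


def run(cmd: List[str]) -> str:
    """Run a command and return its stdout, or '' if it cannot be started."""
    try:
        return subprocess.run(cmd, capture_output=True, text=True, check=False).stdout
    except OSError:
        # The executable is missing or not on PATH.
        return ""


if __name__ == "__main__":
    print("toolkit (nvcc -V):", parse_nvcc_version(run(["nvcc", "-V"])))
    print("driver (nvidia-smi):", parse_smi_version(run(["nvidia-smi"])))
```

If the two numbers differ, that is normal as long as the driver version is at least as new as the toolkit version.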

Install cuDNN

CUDA and cuDNN installation tutorial (very detailed)

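Create a PyTorch GPU virtual environment

1. Create the virtual environment
A dedicated virtual environment is recommended. You can share one environment across projects, but that is discouraged: different projects often need different library versions, and putting everything in one place makes the environment very large, whereas Python itself is a small, lightweight scripting language.

The name is arbitrary; here it is cls_py38_gpu.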

conda create -n cls_py38_gpu python=3.8

Activate the environment before installing anything into it:

conda activate cls_py38_gpu

2. Install PyTorch
When choosing the install options, select CUDA and use version 11.8.
Install with conda:

conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia

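Note: if you cannot reach the conda channels reliably (for example, you have no proxy), installing with pip or pip3 is recommended. Both executables are located inside your virtual environment.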

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

Downloads through either conda or pip may fail; if that happens, just run the command again.

Test
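
A quick way to confirm the installation is to ask PyTorch whether it can see the GPU. The sketch below is one possible check, not the author's original test, and the function name cuda_report is made up for illustration. It returns a small dictionary so that a missing or CPU-only install is reported instead of raising an error.

```python
def cuda_report() -> dict:
    """Collect basic facts about the PyTorch install and its CUDA support."""
    try:
        import torch
    except ImportError:
        # PyTorch is missing from the active environment.
        return {"torch_installed": False, "cuda_available": False}

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version PyTorch was built against; None for CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        report["gpu_count"] = torch.cuda.device_count()
        report["gpu_name"] = torch.cuda.get_device_name(0)
        # Run a tiny computation on the GPU to confirm it really works.
        x = torch.ones(2, 2, device="cuda")
        report["gpu_sum"] = float((x + x).sum().item())
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print("{}: {}".format(key, value))
```

Run it inside the activated cls_py38_gpu environment. If cuda_available is False while built_with_cuda shows a version, the driver or CUDA installation is the likely culprit; if built_with_cuda is None, a CPU-only build of PyTorch was installed.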

Troubleshooting links

Error when calling CUDA
Could not locate zlibwapi.dll. Please make sure it is in your library path
Fix: Could not locate zlibwapi.dll. Please make sure it is in your library path!
Error when exporting to ONNX
Exporting AdaptiveAvgPool2d to ONNX with ATen fallback produces an error #17377
Unsupported: ONNX export of operator adaptive_avg_pool2d
Training loss does not decrease, or decreases only slightly
SGD & Adam optimizers
Why doesn’t the accuracy when training VGG-16 change much?
ONNX GPU inference
How do you run a ONNX model on a GPU?
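
For the zlibwapi.dll error listed above, it can help to first check whether the file is visible on the search path at all. The sketch below scans the directories in the PATH environment variable for a given file name. It is only a diagnostic aid, assuming the library is located through PATH; the function name find_on_path is illustrative, and Windows also searches other locations, so a miss here is not conclusive.

```python
import os
from typing import List, Optional


def find_on_path(filename: str, search_path: Optional[str] = None) -> List[str]:
    """Return every directory on the search path that contains `filename`."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    hits = []
    for directory in search_path.split(os.pathsep):
        # Skip empty entries, which can appear when PATH has stray separators.
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    locations = find_on_path("zlibwapi.dll")
    if locations:
        print("zlibwapi.dll found in:")
        for directory in locations:
            print("  " + directory)
    else:
        print("zlibwapi.dll was not found in any PATH directory")
```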

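Build a VGG classification network and train it with CUDA

The complete code is in the GitHub repository linked at the end of the article.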


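from datetime import datetime

import torch
from numpy import array
from torch import from_numpy, optim
from torch.nn import CrossEntropyLoss
from torch.utils.data import DataLoader

# VGG, MyDataset, get_files, collate_fn and the hyperparameters (image_channels,
# num_classes, lr, weight_decay, dataset_folder, test_data_ratio, batch_size,
# epochs, train_step_interval) are defined elsewhere in the full project.

if __name__ == '__main__':
    # Use the first GPU when CUDA is available, otherwise fall back to the CPU.
    if torch.cuda.is_available():
        device = torch.device("cuda:0")
        print("Running on the GPU")
        num_gpu = torch.cuda.device_count()
        print("There are {} GPU(s) on your computer".format(num_gpu))
    else:
        device = torch.device("cpu")
        print("Running on the CPU")

    # The model must live on the same device as the data it is fed.
    model = VGG(image_channels, num_classes).to(device)
    optimizer = optim.Adam(model.parameters(), lr=lr, weight_decay=weight_decay)

    criterion = CrossEntropyLoss()

    test_list, train_list = get_files(dataset_folder, test_data_ratio)

    train_loader = DataLoader(MyDataset(train_list, transform=None, test=False), batch_size=batch_size, shuffle=True,
                              collate_fn=collate_fn)
    test_loader = DataLoader(MyDataset(test_list, transform=None, test=True), batch_size=batch_size, shuffle=True,
                             collate_fn=collate_fn)
    print("Training samples: {}".format(len(train_list)))
    print("Test samples: {}".format(len(test_list)))
    accuracies = []
    test_loss = []
    train_loss = []
    current_accuracy = 0
    model.train()
    for epoch in range(epochs):
        start_time = datetime.now()
        loss_epoch = 0
        for index, (inputs, target) in enumerate(train_loader):
            # Move the batch to the same device as the model.
            inputs = inputs.to(device)
            # Convert the labels to a LongTensor, as CrossEntropyLoss expects.
            target = from_numpy(array(target)).long().to(device)
            output = model(inputs)
            loss = criterion(output, target)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            loss_epoch += loss.item()

        end_time = datetime.now()
        print("epoch: {}, elapsed: {:.1f} s".format(epoch, (end_time - start_time).total_seconds()))
        if (epoch + 1) % train_step_interval == 0:
            # loss_epoch is the sum of the per-batch losses in this epoch.
            print("Epoch: {} \t Loss: {:.6f} ".format(epoch + 1, loss_epoch))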

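Accelerate inference of the classification network with CUDA

The complete code is in the GitHub repository linked at the end of the article.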

import json

import numpy
import torch
from torch.utils.data import DataLoader

# VGG, MyDataset and the utils module come from the full project.

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
if __name__ == '__main__':
    with open('config.json') as f:
        param_dict = json.load(f)
    # Map class index -> class label.
    class_dict = dict(enumerate(param_dict["class_labels"]))

    print("class dict: {}".format(class_dict))
    num_classes = len(param_dict["class_labels"])
    image_channels = param_dict["image_channels"]
    model = VGG(image_channels, num_classes)
    utils.load_model("checkpoints/mnist.pth", model)
    model = model.to(device)
    model.eval()  # put dropout and batch-norm layers into inference mode
    print(model)

    test_list = utils.get_allfiles(
        r"I:\test_images_full")

    test_loader = DataLoader(MyDataset(test_list, transform=None, test=True), batch_size=1, shuffle=True,
                             collate_fn=utils.collate_fn)
    correct_num = 0
    step = 0
    total_num = len(test_list)
    # No gradients are needed for inference, which saves memory and time.
    with torch.no_grad():
        for image, label in test_loader:
            image = image.to(device)
            output = model(image)
            # print(class_dict[torch.argmax(output).item()])
            # label is a list and must be converted to a tensor; output holds the
            # scores for the n classes, so take the index of the largest one.
            res = torch.eq(torch.from_numpy(numpy.array(label)).long().to(device), torch.argmax(output))
            step = step + 1
            if res:
                correct_num = correct_num + 1
            if step % 100 == 0:
                print("{}/{}, current accuracy: {:.4f}".format(step, total_num, correct_num / step))
    print("[{}/{}], accuracy: {}".format(correct_num, len(test_list), correct_num / len(test_list)))

Inference in C# with ONNX Runtime (GPU)

// Requires "using Microsoft.ML.OnnxRuntime;" and the Microsoft.ML.OnnxRuntime.Gpu package.
var useCuda = true;
// Use the CUDA execution provider when requested; otherwise use the default (CPU) options.
SessionOptions opts = useCuda
    ? SessionOptions.MakeSessionOptionWithCudaProvider()
    : new SessionOptions();
var session = new InferenceSession(modelPath, opts);
return session;
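
For comparison, the same CUDA-or-CPU choice can be made from Python with the onnxruntime package. This is a sketch rather than part of the original project: pick_providers and make_session are illustrative names, and it assumes the onnxruntime-gpu package is installed when CUDA is requested. Listing CPUExecutionProvider after CUDAExecutionProvider keeps a CPU fallback available.

```python
from typing import List


def pick_providers(available: List[str], use_cuda: bool = True) -> List[str]:
    """Order execution providers: CUDA first when requested and available."""
    providers = []
    if use_cuda and "CUDAExecutionProvider" in available:
        providers.append("CUDAExecutionProvider")
    providers.append("CPUExecutionProvider")  # always keep a CPU fallback
    return providers


def make_session(model_path: str, use_cuda: bool = True):
    """Create an ONNX Runtime session for model_path (a path you supply)."""
    # Imported lazily so that pick_providers has no third-party dependency.
    import onnxruntime as ort

    providers = pick_providers(ort.get_available_providers(), use_cuda)
    return ort.InferenceSession(model_path, providers=providers)
```

After creating a session, session.get_providers() shows which providers were actually enabled, which is a simple way to confirm that inference is running on the GPU.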
