Three Ways to Save a PyTorch Model: A Complete Routine for Saving and Resuming Training


References — PyTorch data loading, model saving, and loading:

https://blog.csdn.net/FPGATOM/article/details/85337469
https://www.jb51.net/article/167892.htm

Saving a checkpoint so training can be resumed:
http://www.mamicode.com/info-detail-3011246.html

net.train() / net.eval() (these are nn.Module methods; there is no torch.train() or torch.eval()):
https://blog.csdn.net/weixin_43593330/article/details/103365671
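
As a quick aside on what those two modes actually change (a minimal demonstration of my own, not from the linked posts): layers such as Dropout and BatchNorm behave differently in training and evaluation mode, which is why you call net.eval() before inference and net.train() before resuming training.

import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(1, 8)

drop.train()      # training mode: ~half the entries are zeroed, the rest scaled by 2
print(drop(x))

drop.eval()       # evaluation mode: dropout is the identity
print(drop(x))    # tensor([[1., 1., 1., 1., 1., 1., 1., 1.]])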

Here are the three methods; the author has tested each of them personally.

# Method 1: save only the parameters (recommended)
# torch.save(net.state_dict(), 'net_parameter.pth')

# To load, rebuild a model with the same architecture first:
# net = Net(*args, **kwargs)
# net.load_state_dict(torch.load('net_parameter.pth'))
# net.eval()

# Method 2: save the entire model (structure + parameters; loading
# still requires the Net class definition to be importable)
# torch.save(net, 'net_model.pth')
# net = torch.load('net_model.pth')

# Method 3: save a full training checkpoint for resuming
state = {
    'epoch': epoch,
    'model_state_dict': net.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    'loss2000': loss2000,   # running loss over the last 2000 mini-batches
}
torch.save(state, 'net_train.pth')
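
After saving, a quick way to confirm what the checkpoint file contains:

checkpoint = torch.load('net_train.pth')
print(checkpoint.keys())
# dict_keys(['epoch', 'model_state_dict', 'optimizer_state_dict', 'loss2000'])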

import os

# Resume from the checkpoint if it exists; otherwise start from scratch.
PATH = 'net_train.pth'
if os.path.exists(PATH):
    checkpoint = torch.load(PATH)
    net.load_state_dict(checkpoint['model_state_dict'])
    optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
    epoch = checkpoint['epoch']
    initial_epoch = epoch + 1   # resume from the epoch after the saved one
    loss2000 = checkpoint['loss2000']
    print("Model's state_dict:")
    for param_tensor in net.state_dict():
        print(param_tensor, "\t", net.state_dict()[param_tensor].size())
    print("running_loss:")
    print(loss2000)
else:
    initial_epoch = 0

Complete routine: saving a model and resuming training

# -*- coding: utf-8 -*-
"""
Training a Classifier
=====================

This is it. You have seen how to define neural networks, compute loss and make
updates to the weights of the network.

Now you might be thinking,

What about data?
----------------

Generally, when you have to deal with image, text, audio or video data,
you can use standard python packages that load data into a numpy array.
Then you can convert this array into a ``torch.*Tensor``.

-  For images, packages such as Pillow, OpenCV are useful
-  For audio, packages such as scipy and librosa
-  For text, either raw Python or Cython based loading, or NLTK and
   SpaCy are useful

Specifically for vision, we have created a package called
``torchvision``, that has data loaders for common datasets such as
Imagenet, CIFAR10, MNIST, etc. and data transformers for images, viz.,
``torchvision.datasets`` and ``torch.utils.data.DataLoader``.

This provides a huge convenience and avoids writing boilerplate code.

For this tutorial, we will use the CIFAR10 dataset.
It has the classes: ‘airplane’, ‘automobile’, ‘bird’, ‘cat’, ‘deer’,
‘dog’, ‘frog’, ‘horse’, ‘ship’, ‘truck’. The images in CIFAR-10 are of
size 3x32x32, i.e. 3-channel color images of 32x32 pixels in size.

.. figure:: /_static/img/cifar10.png
   :alt: cifar10

   cifar10


Training an image classifier
----------------------------

We will do the following steps in order:

1. Load and normalize the CIFAR10 training and test datasets using
   ``torchvision``
2. Define a Convolutional Neural Network
3. Define a loss function
4. Train the network on the training data
5. Test the network on the test data

1. Loading and normalizing CIFAR10
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Using ``torchvision``, it’s extremely easy to load CIFAR10.
"""
import torch
import torchvision
import torchvision.transforms as transforms

########################################################################
# The outputs of torchvision datasets are PILImage images in the range [0, 1].
# We transform them to Tensors of normalized range [-1, 1].

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='../data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='../data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)
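
########################################################################
# A quick sanity check (an addition to the original tutorial): CIFAR10 has
# 50,000 training and 10,000 test images, so with batch_size=4 the loaders
# yield 12,500 and 2,500 batches per epoch.

print(len(trainset), len(testset))        # 50000 10000
print(len(trainloader), len(testloader))  # 12500 2500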

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

########################################################################
# Let us show some of the training images, for fun.

import matplotlib.pyplot as plt
import numpy as np

# functions to show an image


def imshow(img):
    img = img / 2 + 0.5     # unnormalize
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.show()


# get some random training images
dataiter = iter(trainloader)
images, labels = next(dataiter)   # dataiter.next() was removed in newer PyTorch

# show images
# imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))


########################################################################
# 2. Define a Convolutional Neural Network
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# Copy the neural network from the Neural Networks section before and modify it to
# take 3-channel images (instead of 1-channel images as it was defined).

import os
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
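        # Input 32x32 -> conv1 (5x5) -> 28x28 -> pool -> 14x14
        # -> conv2 (5x5) -> 10x10 -> pool -> 5x5, hence 16 * 5 * 5 below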
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


net = Net()

########################################################################
# 3. Define a Loss function and optimizer
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# Let's use a Classification Cross-Entropy loss and SGD with momentum.

import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
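
########################################################################
# Why the checkpoint below also saves the optimizer (a note added here):
# SGD with momentum keeps a per-parameter momentum buffer as internal
# state, and restoring it via load_state_dict() lets a resumed run
# continue with the same momentum instead of restarting from zero.

print("Optimizer state_dict keys:", list(optimizer.state_dict().keys()))
# -> ['state', 'param_groups']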

########################################################################
# 4. Train the network
# ^^^^^^^^^^^^^^^^^^^^
#
# This is when things start to get interesting.
# We simply have to loop over our data iterator, and feed the inputs to the
# network and optimize.

# Resume from the checkpoint if it exists; otherwise start from scratch.
PATH = 'net_train.pth'
if os.path.exists(PATH):
    checkpoint = torch.load(PATH)
    net.load_state_dict(checkpoint['model_state_dict'])
    optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
    epoch = checkpoint['epoch']
    initial_epoch = epoch + 1   # resume from the epoch after the saved one
    loss2000 = checkpoint['loss2000']
    print("Model's state_dict:")
    for param_tensor in net.state_dict():
        print(param_tensor, "\t", net.state_dict()[param_tensor].size())
    print("running_loss:")
    print(loss2000)
else:
    initial_epoch = 0
    loss2000 = 0.0   # defined up front so the final checkpoint save never fails

print("initial_epoch:")
print(initial_epoch)
end_epoch = 3  
for epoch in range(initial_epoch, end_epoch):  # loop over the dataset multiple times
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            loss2000 = running_loss   # keep the last 2000-batch loss sum for the checkpoint
            running_loss = 0.0

print('Finished Training')

# For reference, Method 1 (parameters only):
# torch.save(net.state_dict(), 'net_parameter.pth')
# net = Net(*args, **kwargs)
# net.load_state_dict(torch.load('net_parameter.pth'))
# net.eval()

# For reference, Method 2 (entire model):
# torch.save(net, 'net_model.pth')
# net = torch.load('net_model.pth')

# Method 3: save the training checkpoint so a later run can resume
state = {
    'epoch': epoch,
    'model_state_dict': net.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    'loss2000': loss2000,
}
torch.save(state, 'net_train.pth')
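
########################################################################
# In practice you would usually save this checkpoint at the end of every
# epoch (or every N batches) rather than once after training, so a crash
# loses at most one epoch of work. A minimal sketch, to be placed at the
# bottom of the epoch loop above:
#
# for epoch in range(initial_epoch, end_epoch):
#     ...  # training as above
#     torch.save({
#         'epoch': epoch,
#         'model_state_dict': net.state_dict(),
#         'optimizer_state_dict': optimizer.state_dict(),
#         'loss2000': loss2000,
#     }, 'net_train.pth')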

# When loading a checkpoint elsewhere, rebuild the model and optimizer first:
# model = Net(*args, **kwargs)
# optimizer = TheOptimizerClass(*args, **kwargs)

net.eval()    # switch to evaluation mode for inference
# - or -
# net.train() # switch back to training mode to keep training

# print("Optimizer's state_dict:")
# for var_name in optimizer.state_dict():
#     print(var_name, "\t", optimizer.state_dict()[var_name])

# Loading GPU-trained parameters on a CPU-only machine:
# model.load_state_dict(torch.load('model.pth', map_location='cpu'))
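
########################################################################
# More generally, map_location can retarget a checkpoint to whatever
# device is available (a sketch along the same lines):
#
# device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# checkpoint = torch.load('net_train.pth', map_location=device)
# net.load_state_dict(checkpoint['model_state_dict'])
# net.to(device)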





########################################################################
# 5. Test the network on the test data
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#
# We have trained the network for 2 passes over the training dataset.
# But we need to check if the network has learnt anything at all.
#
# We will check this by predicting the class label that the neural network
# outputs, and checking it against the ground-truth. If the prediction is
# correct, we add the sample to the list of correct predictions.
#
# Okay, first step. Let us display an image from the test set to get familiar.

dataiter = iter(testloader)
images, labels = next(dataiter)
print("labels = {}".format(labels[0]))
# print images
# imshow(torchvision.utils.make_grid(images))
print('GroundTruth: ', ' '.join('%5s' % classes[labels[j]] for j in range(4)))

########################################################################
# Okay, now let us see what the neural network thinks these examples above are:

outputs = net(images)

########################################################################
# The outputs are energies for the 10 classes.
# The higher the energy for a class, the more the network
# thinks that the image is of the particular class.
# So, let's get the index of the highest energy:
_, predicted = torch.max(outputs, 1)

print('Predicted: ', ' '.join('%5s' % classes[predicted[j]]
                              for j in range(4)))

########################################################################
# The results seem pretty good.
#
# Let us look at how the network performs on the whole dataset.

correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))

########################################################################
# That looks waaay better than chance, which is 10% accuracy (randomly picking
# a class out of 10 classes).
# Seems like the network learnt something.
#
# Hmmm, what are the classes that performed well, and the classes that did
# not perform well:

class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()
        for i in range(4):
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1


for i in range(10):
    print('Accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]))

########################################################################
# Okay, so what next?
#
# How do we run these neural networks on the GPU?
#
# Training on GPU
# ----------------
# Just like how you transfer a Tensor on to the GPU, you transfer the neural
# net onto the GPU.
#
# Let's first define our device as the first visible cuda device if we have
# CUDA available:

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Assume that we are on a CUDA machine, then this should print a CUDA device:

print(device)

########################################################################
# The rest of this section assumes that `device` is a CUDA device.
#
# Then these methods will recursively go over all modules and convert their
# parameters and buffers to CUDA tensors:
#
# .. code:: python
#
#     net.to(device)
#
#
# Remember that you will have to send the inputs and targets at every step
# to the GPU too:
#
# .. code:: python
#
#         inputs, labels = inputs.to(device), labels.to(device)
#
# Why don't I notice MASSIVE speedup compared to CPU? Because your network
# is really small.
#
# **Exercise:** Try increasing the width of your network (argument 2 of
# the first ``nn.Conv2d``, and argument 1 of the second ``nn.Conv2d`` –
# they need to be the same number), see what kind of speedup you get.
#
# **Goals achieved**:
#
# - Understanding PyTorch's Tensor library and neural networks at a high level.
# - Train a small neural network to classify images
#
# Training on multiple GPUs
# -------------------------
# If you want to see even more MASSIVE speedup using all of your GPUs,
# please check out :doc:`data_parallel_tutorial`.
#
# Where do I go next?
# -------------------
#
# -  :doc:`Train neural nets to play video games </intermediate/reinforcement_q_learning>`
# -  `Train a state-of-the-art ResNet network on imagenet`_
# -  `Train a face generator using Generative Adversarial Networks`_
# -  `Train a word-level language model using Recurrent LSTM networks`_
# -  `More examples`_
# -  `More tutorials`_
# -  `Discuss PyTorch on the Forums`_
# -  `Chat with other users on Slack`_
#
# .. _Train a state-of-the-art ResNet network on imagenet: https://github.com/pytorch/examples/tree/master/imagenet
# .. _Train a face generator using Generative Adversarial Networks: https://github.com/pytorch/examples/tree/master/dcgan
# .. _Train a word-level language model using Recurrent LSTM networks: https://github.com/pytorch/examples/tree/master/word_language_model
# .. _More examples: https://github.com/pytorch/examples
# .. _More tutorials: https://github.com/pytorch/tutorials
# .. _Discuss PyTorch on the Forums: https://discuss.pytorch.org/
# .. _Chat with other users on Slack: https://pytorch.slack.com/messages/beginner/