PyTorch Learning Notes 1 (working through the official documentation)

Contents

LEARN THE BASICS

1、TENSORS

①、Ways to create a tensor

②、Tensor attributes

③、Operations on tensors

④、Bridge with NumPy

2、DATASETS & DATALOADERS

①、Loading a Dataset

②、Iterating and Visualizing the Dataset

3、Creating a Custom Dataset for your files

①、Collect the required images and organize them into a folder

②、Defining the custom dataset

4、Preparing your data for training with DataLoaders

5、TRANSFORMS

6、BUILD THE NEURAL NETWORK

7、Autograd (automatic differentiation)

AUTOMATIC DIFFERENTIATION WITH TORCH.AUTOGRAD

More on Computational Graphs

8、OPTIMIZING MODEL PARAMETERS

9、SAVE AND LOAD THE MODEL

①、Saving and loading model weights

②、Saving and Loading Models with Shapes


LEARN THE BASICS

Most machine learning workflows involve working with data, creating models, optimizing model parameters, and saving the trained models. This tutorial introduces you to a complete ML workflow implemented in PyTorch, with links to learn more about each of these concepts.

We’ll use the FashionMNIST dataset to train a neural network that predicts if an input image belongs to one of the following classes: T-shirt/top, Trouser, Pullover, Dress, Coat, Sandal, Shirt, Sneaker, Bag, or Ankle boot.

This tutorial assumes a basic familiarity with Python and Deep Learning concepts.

1、TENSORS

Tensors are similar to NumPy’s ndarrays, except that tensors can run on GPUs or other hardware accelerators.

①、Ways to create a tensor

import numpy as np
import torch


#Initializing a Tensor
#Directly from data
data = [[1,2],[3,4]]
x_data = torch.tensor(data)
print(f"first way: \n {x_data} \n")

#From a NumPy array
np_array = np.array(data)
x_np = torch.from_numpy(np_array)
print(f"second way: \n {x_np} \n")

#From another tensor
'''
The new tensor retains the properties (shape, datatype) of the argument tensor, 
unless explicitly overridden.
'''
x_ones = torch.ones_like(x_data) # retains the properties of x_data
print(f"Ones Tensor: \n {x_ones} \n")

x_rand = torch.rand_like(x_data, dtype=torch.float) # overrides the datatype of x_data
print(f"Random Tensor: \n {x_rand} \n")

#With random or constant values

shape = (2,3,)  # shape is a tuple of tensor dimensions
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)

print(f"Random Tensor: \n {rand_tensor} \n")
print(f"Ones Tensor: \n {ones_tensor} \n")
print(f"Zeros Tensor: \n {zeros_tensor} \n")

②、Tensor attributes

import numpy as np
import torch


'''
Tensor attributes describe their shape, datatype,
and the device on which they are stored.
'''
tensor = torch.rand(3,4)

print(f"Shape of tensor: {tensor.shape}")
print(f"Datatype of tensor: {tensor.dtype}")
print(f"Device tensor is stored on: {tensor.device}")

③、Operations on tensors

Each of these operations can be run on the GPU (at typically higher speeds than on a CPU). If you’re using Colab, allocate a GPU by going to Runtime > Change runtime type > GPU.

By default, tensors are created on the CPU. We need to explicitly move tensors to the GPU using .to method (after checking for GPU availability). Keep in mind that copying large tensors across devices can be expensive in terms of time and memory!

Using the GPU

import numpy as np
import torch


'''
Move the tensor to the GPU if one is available
'''
tensor = torch.rand(3,4)

# We move our tensor to the GPU if available
if torch.cuda.is_available():
  tensor = tensor.to('cuda')

print(f"Shape of tensor: {tensor.shape}")
print(f"Datatype of tensor: {tensor.dtype}")
print(f"Device tensor is stored on: {tensor.device}")

Indexing and slicing rows and columns of a tensor

import numpy as np
import torch

'''
Try out some of the operations from the list. 
If you’re familiar with the NumPy API, you’ll find 
the Tensor API a breeze to use.
'''

tensor = torch.ones(4, 4)
print('First row: ',tensor[0])
print('First column: ', tensor[:, 0])
print('Last column:', tensor[..., -1])
tensor[:,1] = 0
print(tensor)

'''
Joining tensors: you can use torch.cat to concatenate
a sequence of tensors along a given dimension.
See also torch.stack, another tensor joining op
that is subtly different from torch.cat.
dim=0 concatenates along rows (the result gets taller),
dim=1 concatenates along columns (the result gets wider).
'''
t1 = torch.cat([tensor,tensor,tensor],dim=1)
print(t1)

Arithmetic operations

Matrix multiplication vs. element-wise multiplication

import numpy as np
import torch

tensor = torch.rand(3,4)

# This computes the matrix multiplication between two tensors. y1, y2, y3 will have the same value
y1 = tensor @ tensor.T # matrix product of tensor and its transpose
y2 = tensor.matmul(tensor.T)

y3 = torch.rand_like(tensor)
torch.matmul(tensor, tensor.T, out=y3)

y_list = [y1,y2,y3]
for y in y_list:
    print(y)


# This computes the element-wise product. z1, z2, z3 will have the same value
z1 = tensor * tensor # element-wise product
z2 = tensor.mul(tensor)

z3 = torch.rand_like(tensor)
torch.mul(tensor, tensor, out=z3)

z_list = [z1,z2,z3]
for z in z_list:
    print(z)

Converting a one-element tensor to a Python value

import numpy as np
import torch

tensor = torch.ones(3,4)

'''
Single-element tensors If you have a one-element tensor, 
for example by aggregating all values of a tensor into one value, 
you can convert it to a Python numerical value using item():
'''
agg = tensor.sum()
agg_item = agg.item()
print(agg_item, type(agg_item))

In-place operations

import numpy as np
import torch

tensor = torch.ones(3,4)

'''
In-place operations
Operations that store the result into the operand are called in-place.
They are denoted by a _ suffix. For example: x.copy_(y) and x.t_() will change x.

In-place operations save some memory, but can be problematic
when computing derivatives because of an immediate loss of history. Hence, their use is discouraged.
'''
print(tensor, "\n")
tensor.add_(5)
print(tensor)

④、Bridge with NumPy

Tensors on the CPU and NumPy arrays can share their underlying memory locations, and changing one will change the other.

import numpy as np
import torch

print(1)
t = torch.ones(5)
print(f"t: {t}")
n = t.numpy() # tensor to numpy
print(f"n: {n}")

#A change in the tensor reflects in the NumPy array.
print(2)
t.add_(1) # changing the tensor also changes the NumPy array
print(f"t: {t}")
print(f"n: {n}")


print(3)
n = np.ones(5)
t = torch.from_numpy(n)  # numpy to tensor
np.add(n, 1, out=n) # changing the NumPy array also changes the tensor
print(f"t: {t}")
print(f"n: {n}")

2、DATASETS & DATALOADERS

Code for processing data samples can get messy and hard to maintain; ideally we want our dataset code to be decoupled from our model training code for better readability and modularity.

①、Loading a Dataset
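
The FashionMNIST dataset can be loaded from torchvision.datasets: root is the folder where the data is stored, train selects the training or test split, download fetches the data if it is missing, and transform / target_transform specify the feature and label transformations. A minimal sketch (the same call is used again in the listing under ② below):

from torchvision import datasets
from torchvision.transforms import ToTensor

# Download the FashionMNIST training split into ./data and convert images to tensors
training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor()
)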

 

②、Iterating and Visualizing the Dataset

import torch
from torch.utils.data import Dataset
from torchvision import datasets
from torchvision.transforms import ToTensor
import matplotlib.pyplot as plt


training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor()
)

test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor()
)

'''
We can index Datasets manually like a list: training_data[index].
We use matplotlib to visualize some samples in our training data.
'''
labels_map = {
    0: "T-Shirt",
    1: "Trouser",
    2: "Pullover",
    3: "Dress",
    4: "Coat",
    5: "Sandal",
    6: "Shirt",
    7: "Sneaker",
    8: "Bag",
    9: "Ankle Boot",
}
figure = plt.figure(figsize=(8, 8))
cols, rows = 3, 3
# randomly sample a few images from the training set and display them
for i in range(1, cols * rows + 1):
    sample_idx = torch.randint(len(training_data), size=(1,)).item() 
    img, label = training_data[sample_idx]
    figure.add_subplot(rows, cols, i)
    plt.title(labels_map[label])
    plt.axis("off")
    plt.imshow(img.squeeze(), cmap="gray")
plt.show()

3、Creating a Custom Dataset for your files

A custom Dataset class must implement three functions: __init__, __len__, and __getitem__.

Take a look at this implementation; the FashionMNIST images are stored in a directory img_dir, and their labels are stored separately in a CSV file annotations_file.

The __init__ function is run once when instantiating the Dataset object. We initialize the directory containing the images, the annotations file, and both transforms.

The __getitem__ function loads and returns a sample from the dataset at the given index idx. Based on the index, it identifies the image’s location on disk, converts that to a tensor using read_image, retrieves the corresponding label from the csv data in self.img_labels, calls the transform functions on them (if applicable), and returns the tensor image and corresponding label in a tuple.
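
A minimal sketch of such a Dataset, consistent with the description above (it assumes annotations_file is a CSV whose first column is the image filename and whose second column is the label):

import os
import pandas as pd
from torch.utils.data import Dataset
from torchvision.io import read_image

class CustomImageDataset(Dataset):
    def __init__(self, annotations_file, img_dir, transform=None, target_transform=None):
        self.img_labels = pd.read_csv(annotations_file)
        self.img_dir = img_dir
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self):
        return len(self.img_labels)

    def __getitem__(self, idx):
        # locate the image on disk, read it as a tensor, and look up its label
        img_path = os.path.join(self.img_dir, self.img_labels.iloc[idx, 0])
        image = read_image(img_path)
        label = self.img_labels.iloc[idx, 1]
        if self.transform:
            image = self.transform(image)
        if self.target_transform:
            label = self.target_transform(label)
        return image, label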

①、First, collect the required images and organize them into a folder

Use the code below to generate a txt file that records the mapping between each image file and its label.

# -*- coding:utf-8 -*-

import os


outer_path = 'E:/研究生/数字图像处理/python/pytorch2/data1/test/'  # root directory of the images; each subfolder below it holds the images of one class

def generate(category,dir, label):
    files = os.listdir(dir)
    files.sort()
    print( '****************')
    print('input :', dir)
    print('start...')
    listText = open(outer_path + 'all_list.txt', 'a') # path of the output list file
    for file in files:
        if os.path.splitext(file)[1] == '.txt':  # skip any existing txt list files
            continue
        print(file)
        name = '/'+category +'/'+ file + ' ' + str(int(label)) + '\n'
        listText.write(name)
    listText.close()
    print('done!')
    print('****************')





if __name__ == '__main__':
    i = 0
    folderlist = os.listdir(outer_path)  # list the class subfolders
    for folder in folderlist:
        print(folder)
        generate(folder,os.path.join(outer_path, folder), i)
        i += 1


②、Defining the custom dataset

import torch
from torch.utils.data import Dataset
from torchvision import datasets
from torchvision.transforms import ToTensor
import matplotlib.pyplot as plt
import os
import pandas as pd
#from torchvision.io import read_image
import cv2 as cv
import numpy as np


'''
Build a custom Dataset
'''


# Step 1: define MyDataset, subclass Dataset, and override __len__ and __getitem__
class MyDataset(Dataset):

    def __init__(self, root_dir, names_file, transform=None):
        self.root_dir = root_dir

        self.names_file = names_file
        self.transform = transform
        self.size = 0
        self.names_list = []

        if not os.path.isfile(self.names_file):
            print(self.names_file + ' does not exist!')
        file = open(self.names_file)
        for f in file:
            self.names_list.append(f)
            self.size += 1

    def __len__(self):
        return self.size

    def __getitem__(self, idx):
        image_path = self.root_dir + self.names_list[idx].split(' ')[0]

        if not os.path.isfile(image_path):
            print(image_path + ' does not exist!')
            return None
        image = cv.imread(image_path)  # load with OpenCV
        # note: cv2 loads color images in BGR order, while matplotlib expects RGB

        label = int(self.names_list[idx].split(' ')[1])

        sample = {'image': image, 'label': label} # pack the sample into a dict
        if self.transform:
            sample = self.transform(sample)

        return sample


train_dataset = MyDataset(root_dir='./data1/test',
                          names_file='./data1/test/all_list.txt',
                          transform=None)


print(train_dataset.__len__())

plt.figure()
for (cnt, i) in enumerate(train_dataset):
    image = i['image']
    label = i['label']

    ax = plt.subplot(2, 2, cnt + 1)
    ax.axis('off')
    ax.imshow(image)
    ax.set_title('label {}'.format(label))
    plt.pause(2)

    if cnt == 3:  # show only the first four samples (the 2x2 grid has four slots)
        break

4、Preparing your data for training with DataLoaders

The Dataset retrieves our dataset's features and labels one sample at a time. While training a model, we typically want to pass samples in "minibatches", reshuffle the data at every epoch to reduce model overfitting, and use Python's multiprocessing to speed up data retrieval.

DataLoader is an iterable that abstracts this complexity for us in an easy API.

In short:

in deep learning, samples are fed to the model as minibatches;

the data usually needs to be reshuffled at every epoch (shuffle);

and loading can be parallelized across multiple worker processes.

import torch
from torch.utils.data import Dataset
import matplotlib.pyplot as plt
import os
import cv2 as cv
from torch.utils.data import Dataset, DataLoader
from torchvision.utils import make_grid


'''
Build a custom Dataset
'''


# Step 1: define MyDataset, subclass Dataset, and override __len__ and __getitem__
class MyDataset(Dataset):

    def __init__(self, root_dir, names_file, transform=None):
        self.root_dir = root_dir

        self.names_file = names_file
        self.transform = transform
        self.size = 0
        self.names_list = []

        if not os.path.isfile(self.names_file):
            print(self.names_file + ' does not exist!')
        file = open(self.names_file)
        for f in file:
            self.names_list.append(f)
            self.size += 1

    def __len__(self):
        return self.size

    def __getitem__(self, idx):
        image_path = self.root_dir + self.names_list[idx].split(' ')[0]

        if not os.path.isfile(image_path):
            print(image_path + ' does not exist!')
            return None
        image = cv.imread(image_path)  # load with OpenCV
        # note: cv2 loads color images in BGR order, while matplotlib expects RGB

        label = int(self.names_list[idx].split(' ')[1])

        sample = {'image': image, 'label': label} # pack the sample into a dict
        if self.transform:
            sample = self.transform(sample)

        return sample


train_dataset = MyDataset(root_dir='./data1/test',
                          names_file='./data1/test/all_list.txt',
                          transform=None)


print(train_dataset.__len__())


'''
num_workers --- load data with multiple worker processes
'''
train_dataloader = DataLoader(train_dataset,batch_size=1,
                              shuffle=True)
'''
Visualize batches from the shuffled DataLoader.
make_grid arranges several images into a single grid image for display.
'''


def show_images_batch(sample_batched):
    images_batch, labels_batch = \
        sample_batched['image'], sample_batched['label']
    grid = make_grid(images_batch)

    plt.imshow(grid.numpy().transpose(1, 2, 0)) # reorder from CxHxW to HxWxC for imshow


# sample_batch: Tensor , NxCxHxW
plt.figure()
i = 1
for i_batch, sample_batch in enumerate(train_dataloader):
    plt.subplot(2, 2, i)
    show_images_batch(sample_batch)

    plt.axis('off')
    plt.ioff()
    i += 1
    if i > 4:  # the 2x2 grid only has four slots
        break

plt.show()
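
As a quick sanity check (assuming the train_dataloader defined above), a single shuffled batch can also be pulled out directly; since each sample here is a dict, the batch is a dict of batched tensors:

# Fetch one batch from the DataLoader; shapes depend on the image sizes in your folder
batch = next(iter(train_dataloader))
print(batch['image'].shape)  # e.g. torch.Size([1, H, W, 3]) for images loaded with cv2
print(batch['label'])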

5、TRANSFORMS

Data does not always come in its final processed form that is required for training machine learning algorithms. We use transforms to perform some manipulation of the data and make it suitable for training.

All TorchVision datasets have two parameters -transform to modify the features and target_transform to modify the labels - that accept callables containing the transformation logic. The torchvision.transforms module offers several commonly-used transforms out of the box.

The FashionMNIST features are in PIL Image format, and the labels are integers. For training, we need the features as normalized tensors, and the labels as one-hot encoded tensors. To make these transformations, we use ToTensor and Lambda.

What is one-hot encoding? A label y is represented as a length-10 vector of zeros with a 1 at position y; the Lambda transform below builds exactly this with scatter_.

import torch
from torchvision import datasets
from torchvision.transforms import ToTensor, Lambda



ds = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor(), #ToTensor converts a PIL image or NumPy ndarray into a FloatTensor. and scales the image’s pixel intensity values in the range [0., 1.]
    target_transform=Lambda(lambda y: torch.zeros(10, dtype=torch.float).scatter_(0, torch.tensor(y), value=1))#Lambda transforms apply any user-defined lambda function. 
)
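
To see what the target_transform produces, the same scatter_ call can be run on a single (illustrative) label value:

import torch

y = 3  # an example class index, chosen arbitrarily
one_hot = torch.zeros(10, dtype=torch.float).scatter_(0, torch.tensor(y), value=1)
print(one_hot)  # tensor([0., 0., 0., 1., 0., 0., 0., 0., 0., 0.])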

6、BUILD THE NEURAL NETWORK

Neural networks comprise layers/modules that perform operations on data. The torch.nn namespace provides all the building blocks you need to build your own neural network. Every module in PyTorch subclasses nn.Module. A neural network is a module itself that consists of other modules (layers). This nested structure allows for building and managing complex architectures easily.

In other words, torch.nn.Module is the base class for all neural network modules, and a neural network is itself a module composed of other modules (layers).

Sample code: we'll build a neural network to classify images in the FashionMNIST dataset.

import os
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

'''
GPU
'''
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f'Using {device} device')

'''
We define our neural network by subclassing nn.Module, and initialize the neural network layers in __init__.
Every nn.Module subclass implements the operations on input data in the forward method.

In short: subclass nn.Module, then override __init__ and forward.
'''
class NeuralNetwork(nn.Module):
    def __init__(self): # initialize the layers
        super(NeuralNetwork, self).__init__()
        self.flatten = nn.Flatten() #convert each 2D 28x28 image into a contiguous array of 784 pixel values
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
        )

    def forward(self, x): # define the forward pass on the input data
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

'''
We create an instance of NeuralNetwork, and move it to the device,
and print its structure.
'''
model = NeuralNetwork().to(device)
print(model) # print the structure of the network

'''
To use the model, we pass it the input data. This executes the model’s forward,
along with some background operations. Do not call model.forward() directly!

Calling the model on the input returns a 10-dimensional tensor with raw predicted values for each class.
We get the prediction probabilities by passing it through an instance of the nn.Softmax module.

Getting a prediction from the model
'''
X = torch.rand(1, 28, 28, device=device)
logits = model(X)
pred_probab = nn.Softmax(dim=1)(logits)
y_pred = pred_probab.argmax(1)
print(f"Predicted class: {y_pred}")

A layer-by-layer walkthrough

# layer-by-layer walkthrough
'''
Let’s break down the layers in the FashionMNIST model. 
To illustrate it, we will take a sample minibatch of 3 images of size 28x28 
and see what happens to it as we pass it through the network.
'''

input_image = torch.rand(3,28,28)#a sample minibatch of 3 images of size 28x28
print(input_image.size())

'''
We initialize the nn.Flatten layer to convert each 2D 28x28 image 
into a contiguous array of 784 pixel values ( the minibatch dimension (at dim=0) is maintained).
'''
flatten = nn.Flatten()
flat_image = flatten(input_image) # flatten each 28x28 image into a 784-element vector
print(flat_image.size())

'''
The linear layer is a module that applies a linear transformation
on the input using its stored weights and biases.
'''
layer1 = nn.Linear(in_features=28*28, out_features=20) # input: the flattened 28x28 image (784 features), output: 20 features
hidden1 = layer1(flat_image)
print(hidden1.size())

'''
Non-linear activations are what create the complex mappings between the model’s inputs and outputs. 
helping neural networks learn a wide variety of phenomena.
'''
print(f"Before ReLU: {hidden1}\n\n")
hidden1 = nn.ReLU()(hidden1)
print(f"After ReLU: {hidden1}")

'''
nn.Sequential is an ordered container of modules.
The data is passed through all the modules in the same order as defined.
You can use sequential containers to put together a quick network like seq_modules.
'''
seq_modules = nn.Sequential( # a quick example network
    flatten,
    layer1,
    nn.ReLU(),
    nn.Linear(20, 10)
)
input_image = torch.rand(3,28,28) # random input data
logits = seq_modules(input_image) # can use sequential containers to put together a quick network like seq_modules.

'''
The last linear layer of the neural network returns logits
- raw values in [-infty, infty] - which are passed to the nn.Softmax module.
The logits are scaled to values in [0, 1] representing the model's predicted probability for each
class. The dim parameter indicates the dimension along which the values must sum to 1.
'''
softmax = nn.Softmax(dim=1) # dim indicates the dimension along which the values must sum to 1
pred_probab = softmax(logits) # pass the logits to the nn.Softmax module

'''
 We iterate over each parameter, and print its size and a preview of its values.
'''
print("Model structure: ", model, "\n\n")

for name, param in model.named_parameters():
    print(f"Layer: {name} | Size: {param.size()} | Values : {param[:2]} \n")

7、Autograd (automatic differentiation)

AUTOMATIC DIFFERENTIATION WITH TORCH.AUTOGRAD

When training neural networks, the most frequently used algorithm is back propagation. In this algorithm, parameters (model weights) are adjusted according to the gradient of the loss function with respect to the given parameter.

To compute those gradients, PyTorch has a built-in differentiation engine called torch.autograd. It supports automatic computation of gradients for any computational graph.

Example:

import torch

'''
Consider the simplest one-layer neural network, with input x, parameters w and b, 
and some loss function. It can be defined in PyTorch in the following manner:
'''
x = torch.ones(5)  # input tensor
y = torch.zeros(3)  # expected output
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)
z = torch.matmul(x, w)+b
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y) # loss function

'''
A reference to the backward propagation function is stored in the grad_fn property of a tensor.
'''
print('Gradient function for z =', z.grad_fn) # print which backward function will be used
print('Gradient function for loss =', loss.grad_fn)

'''
To optimize weights of parameters in the neural network, 
we need to compute the derivatives of our loss function with respect to parameters, 
'''
loss.backward()
print(w.grad) # partial derivative of loss with respect to w
print(b.grad)

'''
Disabling gradient tracking
You may want to disable it when:
- some parameters in your neural network should be frozen, or
- you only need forward computations and want to speed them up.

We can stop tracking computations by surrounding our computation code with torch.no_grad():
'''
z = torch.matmul(x, w)+b
print(z.requires_grad)

with torch.no_grad(): # gradient tracking is disabled inside this block
    z = torch.matmul(x, w)+b
print(z.requires_grad)

# Another way to achieve the same result is to use detach()
z = torch.matmul(x, w)+b
z_det = z.detach() # detached from the computational graph
print(z_det.requires_grad)

More on Computational Graphs

Conceptually, autograd keeps a record of data (tensors) and all executed operations (along with the resulting new tensors) in a directed acyclic graph (DAG) consisting of Function objects. In this DAG, leaves are the input tensors, roots are the output tensors. By tracing this graph from roots to leaves, you can automatically compute the gradients using the chain rule.

Note:

The DAG is dynamic: after each .backward() call, autograd starts populating a new graph. This is exactly what allows you to use control-flow statements in your model; you can change the shape, size, and operations at every iteration if needed.
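
A small illustrative sketch (not from the tutorial) of why the dynamic graph matters: the forward computation can contain ordinary Python control flow, and a fresh graph is traced on every iteration:

import torch

w = torch.randn(3, requires_grad=True)
x = torch.ones(3)

for step in range(2):
    # the graph is rebuilt from scratch on every iteration,
    # so the branch taken here can differ between iterations
    if step % 2 == 0:
        y = (w * x).sum()
    else:
        y = (2 * w * x * x).sum()
    y.backward()       # populate w.grad for this iteration's graph
    print(step, w.grad)
    w.grad.zero_()     # reset before the next iteration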

8、OPTIMIZING MODEL PARAMETERS

Training a model is an iterative process; in each iteration (called an epoch) the model makes a guess about the output, calculates the error in its guess (loss), collects the derivatives of the error with respect to its parameters (as we saw in the previous section), and optimizes these parameters using gradient descent.

import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor, Lambda


'''
Prepare the datasets and build the neural network model
'''
training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor()
)

test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor()
)

train_dataloader = DataLoader(training_data, batch_size=64)
test_dataloader = DataLoader(test_data, batch_size=64)

class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork()

'''
Hyperparameters
adjustable parameters that let you control the model optimization process;
hyperparameter values can impact model training and convergence rates.

Number of Epochs - the number of times to iterate over the dataset
Batch Size - the number of data samples propagated through the network before the parameters are updated
Learning Rate - how much to update the model's parameters at each batch/epoch.
'''

learning_rate = 1e-3
batch_size = 64
epochs = 5

'''
The training process
1. Optimization loop
epoch --- one iteration of the optimization loop
The Train Loop -- iterate over the training dataset and try to converge to optimal parameters.
The Test Loop -- iterate over the test dataset to check if model performance is improving.

2. Loss Function
measures the degree of dissimilarity of the obtained result to the target value,
which we want to minimize during training.

3. Optimizer
the process of adjusting model parameters to reduce model error in each training step.
Optimization algorithms define how this process is performed.
Inside the training loop, optimization happens in three steps:
- Call optimizer.zero_grad() to reset the gradients of the model parameters.
- Backpropagate the prediction loss with a call to loss.backward().
- Once we have our gradients, call optimizer.step() to adjust the parameters.
'''

def train_loop(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    for batch, (X, y) in enumerate(dataloader):
        # Compute prediction and loss
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if batch % 100 == 0:
            loss, current = loss.item(), batch * len(X)
            print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")


def test_loop(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    test_loss, correct = 0, 0

    with torch.no_grad():
        for X, y in dataloader:
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()

    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")




loss_fn = nn.CrossEntropyLoss() # define the loss function
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate) # initialize the optimizer

epochs = 10 # number of training epochs

for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train_loop(train_dataloader, model, loss_fn, optimizer)
    test_loop(test_dataloader, model, loss_fn)
print("Done!")

9、SAVE AND LOAD THE MODEL

①、Saving and loading model weights

import torch
import torchvision.models as models


'''
PyTorch models store the learned parameters in an internal state dictionary, called state_dict.
These can be persisted via the torch.save method:
'''
model = models.vgg16(pretrained=True)
torch.save(model.state_dict(), 'model_weights.pth')

'''
To load model weights, you need to create an instance of the same model first, 
and then load the parameters using load_state_dict() method.
'''

model = models.vgg16() # we do not specify pretrained=True, i.e. do not load default weights
model.load_state_dict(torch.load('model_weights.pth'))
model.eval()
'''
Be sure to call model.eval() before inferencing
to set the dropout and batch normalization layers to evaluation mode.
Failing to do this will yield inconsistent inference results.
'''

②、Saving and Loading Models with Shapes

import torch
import torchvision.models as models


model = models.vgg16(pretrained=True)
torch.save(model.state_dict(), 'model_weights.pth')

'''
We might want to save the structure of the model class together with the model,
i.e. save the whole model object rather than just its weights.
'''
torch.save(model, 'model.pth')  #we can pass model (and not model.state_dict()) to the saving function
model = torch.load('model.pth') #load the model
