PyTorch Series Notes 1: Getting Started with PyTorch
Installing PyTorch
Installing PyTorch under Anaconda via the official channel is painfully slow and often fails.
The USTC mirror is dramatically faster.
Steps:
conda config --add channels https://mirrors.ustc.edu.cn/anaconda/pkgs/free/
conda config --set show_channel_urls yes
Then grab the build matching your CUDA version from the official site, for example:
pip install torch==1.5.1+cu101 torchvision==0.6.1+cu101 -f https://download.pytorch.org/whl/torch_stable.html -i https://pypi.mirrors.ustc.edu.cn/simple
If you still run into assorted errors, install offline instead; the procedure is covered in this blog post.
After installation, create a throwaway test file (.py) and check the torch module:
import torch
print(torch.__version__)
print(torch.version.cuda)
print(torch.cuda.is_available())
ngpu = 1
device = torch.device("cuda:0" if(torch.cuda.is_available() and ngpu>0) else "cpu")
print(device)
print(torch.cuda.get_device_name(0))
print(torch.rand(3,3).cuda())
If PyTorch installed successfully with the correct CUDA version, it should print something like:
1.5.1+cu101
10.1
True
cuda:0
GeForce GTX 1050 Ti
tensor([[0.4271, 0.1155, 0.2633],
[0.9090, 0.1135, 0.8025],
[0.4208, 0.9740, 0.2207]], device='cuda:0')
torch.Tensor vs numpy.ndarray
numpy is one of the three pillars of Python data processing, and its ndarray represents n-dimensional data in a way that is very similar to the tensor in TensorFlow or PyTorch. PyTorch therefore provides conversion APIs for ndarray, so torch and numpy interoperate well.
Some commonly used conversion APIs (a short example follows the list):
- tensor <- torch.from_numpy(ndarray): convert an ndarray to a tensor, typically used when building models
- ndarray <- tensor.numpy(): convert a tensor back to an ndarray, often used for visualization
- Tensor <- torch.as_tensor(data, dtype=None, device=None): convert any array_like data (list, tuple, NumPy ndarray, scalar, and other types) to a tensor
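A minimal sketch of the three conversions (note that from_numpy and .numpy() share memory with the underlying ndarray):
import numpy as np
import torch
a = np.arange(6).reshape(2, 3)        # an ndarray
t = torch.from_numpy(a)               # ndarray -> tensor (shares memory with a)
b = t.numpy()                         # tensor -> ndarray (also shares memory)
c = torch.as_tensor([1.0, 2.0, 3.0])  # any array_like (list, tuple, ndarray, scalar) -> tensor
print(t, b, c, sep='\n')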
Math operations
# Math operations in Torch
import torch
import numpy as np
data = [-1, -2, 1, 2]
tensor = torch.FloatTensor(data)
print(
'\nabs',
'\nnumpy: ', np.abs(data), # numpy: [1 2 1 2]
'\ntorch: ', torch.abs(tensor) # torch: tensor([1., 2., 1., 2.])
)
print(
'\nsin',
'\nnumpy: ', np.sin(data), # [-0.84147098 -0.90929743 0.84147098 0.90929743]
'\ntorch: ', torch.sin(tensor) # tensor([-0.8415, -0.9093, 0.8415, 0.9093])
)
print(
'\nmean',
'\nnumpy: ', np.mean(data), # 0.0
'\ntorch: ', torch.mean(tensor) # tensor(0.)
)
Matrix operations
## Matrix operations
np_data = np.array([[1, 2], [3, 4]])
tensor = torch.FloatTensor(np_data)
print(
'\nmatrix multiplication(matmul)',
'\nnumpy matmul: ', np.matmul(np_data, np_data), # [[ 7 10][15 22]]
'\ntorch matmul: ', torch.matmul(tensor, tensor), # tensor[[ 7 10][15 22]]
'\ntorch mm: ', torch.mm(tensor, tensor), # tensor[[ 7 10][15 22]]
'\nnumpy a.dot(b): ', np_data.dot(np_data), # [[ 7 10][15 22]]
# '\ntorch a.dot(b): ', tensor.dot(tensor), # RuntimeError: 1D tensors expected, got 2D,
)
Creating Tensors
A Tensor is similar to NumPy's ndarray, but it can also be placed on a GPU to accelerate computation.
torch.tensor is created in much the same way as an ndarray; below are some commonly used creation operations.
Creating a tensor from data
torch.tensor(data, dtype=None, device=None, requires_grad=False, pin_memory=False) → Tensor
- data (array_like) – Initial data for the tensor. Can be a list, tuple, NumPy ndarray, scalar, and other types. Any array_like data can be used to create a tensor: Python lists and tuples, NumPy ndarrays, scalars, and so on.
- dtype (torch.dtype, optional) – the desired data type of the returned tensor. Default: if None, infers the data type from data. Very much like np.int32, np.float64 and friends.
- device (torch.device, optional) – the desired device of the returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types. In other words, you choose at creation time whether the tensor runs on the CPU or the GPU.
- requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False. Whether to record the information needed for gradient computation.
- pin_memory (bool, optional) – If set, the returned tensor is allocated in pinned memory. Works only for CPU tensors. Default: False.
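A short sketch exercising these parameters (the commented-out line assumes a CUDA device is available):
import numpy as np
import torch
t1 = torch.tensor([1, 2, 3])                        # from a Python list, dtype inferred (int64)
t2 = torch.tensor((1.0, 2.0), dtype=torch.float64)  # from a tuple, with an explicit dtype
t3 = torch.tensor(np.array([[1, 2], [3, 4]]))       # from a NumPy ndarray
t4 = torch.tensor(3.14, requires_grad=True)         # a float scalar that records gradients
# t5 = torch.tensor([1.0, 2.0], device='cuda:0')    # placed on the GPU, if CUDA is available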
Creating random samples:
Random sampling creation ops are listed under Random sampling and include:
- torch.rand()
- torch.rand_like()
- torch.randn()
- torch.randn_like()
- torch.randint()
- torch.randint_like()
- torch.randperm()
You may also use torch.empty() with the in-place random sampling methods to create torch.Tensors with values sampled from a broader range of distributions.
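A quick sketch of a few of these ops:
import torch
print(torch.rand(2, 3))                    # uniform samples on [0, 1)
print(torch.randint(0, 10, (2, 3)))        # integers drawn from [0, 10)
print(torch.randperm(5))                   # a random permutation of 0..4
print(torch.empty(2, 3).uniform_(-1, 1))   # in-place sampling into an uninitialized tensor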
Taking torch.randn() as an example, its signature is:
torch.randn(*size, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor
Returns a tensor filled with random numbers drawn from a normal distribution with mean 0 and variance 1 (the standard normal distribution).
Parameters:
- size (int...) – a sequence of integers defining the shape of the output tensor. Can be a variable number of arguments or a collection like a list or tuple.
- out (Tensor, optional) – the output tensor.
- dtype (torch.dtype, optional) – the desired data type of the returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).
- layout (torch.layout, optional) – the desired layout of the returned Tensor. Default: torch.strided.
- device (torch.device, optional) – the desired device of the returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
- requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
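For instance, a minimal sketch of these parameters in use:
x = torch.randn(2, 3)                       # size given as separate ints
y = torch.randn((4,), dtype=torch.float64)  # size given as a tuple, explicit dtype
z = torch.randn(3, requires_grad=True)      # gradients will be recorded for z
print(x.shape, y.dtype, z.requires_grad)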
Creating initialized tensors
- torch.zeros(*size, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor: creates a zero tensor of the given size
- torch.zeros_like(input, dtype=None, layout=None, device=None, requires_grad=False, memory_format=torch.preserve_format) → Tensor: creates a zero tensor with the same size as the given input tensor
>>> torch.zeros(2, 3)
tensor([[ 0., 0., 0.],
[ 0., 0., 0.]])
>>> input = torch.empty(2, 3)
>>> torch.zeros_like(input)
tensor([[ 0., 0., 0.],
[ 0., 0., 0.]])
- torch.ones(*size, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor
- torch.ones_like(input, dtype=None, layout=None, device=None, requires_grad=False, memory_format=torch.preserve_format) → Tensor
>>> torch.ones(2, 3)
tensor([[ 1., 1., 1.],
[ 1., 1., 1.]])
>>> input = torch.empty(2, 3)
>>> torch.ones_like(input)
tensor([[ 1., 1., 1.],
[ 1., 1., 1.]])
- torch.arange(start=0, end, step=1, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor: creates a 1-D tensor, similar to numpy.arange
- torch.linspace(start, end, steps=100, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor: creates a 1-D tensor of evenly spaced values over the interval; start and end must be given, and by default 100 points are generated (steps=100).
print(torch.arange(5,11))
print(torch.arange(5,11).shape)
print(torch.linspace(5,11).shape)
print(torch.linspace(5,11,6))
'''
tensor([ 5, 6, 7, 8, 9, 10])
torch.Size([6])
torch.Size([100])
tensor([ 5.0000, 6.2000, 7.4000, 8.6000, 9.8000, 11.0000])
'''
Creating special matrices
- torch.eye(n, m=None, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor: returns a 2-D tensor with ones on the diagonal and zeros elsewhere
- torch.empty(*size, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False, pin_memory=False) → Tensor: returns a tensor filled with uninitialized data; the shape is defined by size
- torch.full(size, fill_value, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor: returns a tensor of size size filled with fill_value.
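A quick sketch:
print(torch.eye(3))             # 3x3 identity matrix
print(torch.empty(2, 3))        # uninitialized, contents are whatever was in memory
print(torch.full((2, 3), 7.0))  # 2x3 tensor filled with 7.0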
Tensor attributes
Naturally, torch also provides plenty of APIs for inspecting and modifying a tensor's attributes:
>>> torch.tensor([1.2, 3]).dtype # initial default for floating point is torch.float32
torch.float32
>>> torch.set_default_tensor_type(torch.DoubleTensor)
>>> torch.tensor([1.2, 3]).dtype # a new floating point tensor
torch.float64
>>> a = torch.randn(1, 2, 3, 4, 5)
>>> torch.numel(a)
120
>>> a = torch.zeros(4,4)
>>> torch.numel(a)
16
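A few more common attribute queries, as a quick sketch:
t = torch.randn(2, 3, 4)
print(t.shape, t.size())   # torch.Size([2, 3, 4]); shape and size() are equivalent
print(t.dtype)             # element type (here the current default floating type)
print(t.device)            # cpu, or cuda:0 for a GPU tensor
print(t.dim(), t.numel())  # number of dimensions, total number of elements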
Working with tensors in torch feels just like working with ndarrays in numpy, except that a tensor has many properties an ndarray lacks; for example, a tensor can be moved to any device with the .to method:
# let us run this cell only if CUDA is available
# we use `torch.device` objects to move tensors into and out of the GPU
x = torch.randn(1)  # a sample tensor to move between devices
if torch.cuda.is_available():
device = torch.device("cuda") # a CUDA device object
    y = torch.ones_like(x, device=device)  # create the tensor directly on the GPU
    x = x.to(device)                       # or just use the `.to("cuda")` method
    z = x + y
    print(z)
    print(z.to("cpu", torch.double))       # `.to` can also change the dtype while moving
Output:
tensor([1.0445], device='cuda:0')
tensor([1.0445], dtype=torch.float64)
Autograd: automatic differentiation
To understand automatic differentiation, or rather why and when to set a tensor's requires_grad attribute to True, let's first look at how a neural network operates:
Central to all neural networks in PyTorch is the autograd package. We'll introduce it briefly and then train our first neural network.
The autograd package provides automatic differentiation for all operations on tensors. PyTorch is a define-by-run framework, which means that backpropagation is determined by how your code runs, and can be different at every iteration.
In the creation methods above we kept seeing the attribute requires_grad=False. If you set a tensor's .requires_grad to True, it starts tracking all operations on it. When the computation is finished you can call .backward() to compute all the gradients automatically; the gradients for this tensor accumulate into its .grad attribute.
x = torch.ones(2,2,requires_grad=True)
print("x:",x)
y = x + 2
print("y.grad_fn:",y.grad_fn) # y是计算的结果,所以它有grad_fn属性。
z = y * y * 3
out = z.mean()
print(out) # scalar
out.backward() # backward pass: compute the gradients
print("after backward x's grad become to:",x.grad)
Output:
x: tensor([[1., 1.],
[1., 1.]], requires_grad=True)
y.grad_fn: <AddBackward0 object at 0x000002B8D1346FC8>
tensor(27., grad_fn=<MeanBackward0>)
after backward x's grad become to: tensor([[4.5000, 4.5000],
[4.5000, 4.5000]])
Conversely, to stop a tensor from tracking history, you can call .detach() to separate it from the computation history and prevent future computation on it from being tracked.
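A minimal sketch of .detach():
x = torch.ones(2, 2, requires_grad=True)
y = (x * 3).detach()    # y shares data with x*3 but is cut off from the graph
print(y.requires_grad)  # False: further operations on y are not tracked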
To prevent tracking history (and using memory) you can also wrap the code block in with torch.no_grad():. This is particularly useful when evaluating a model, because the model may have trainable parameters with requires_grad=True, yet we don't need their gradients during evaluation.
x = torch.ones(2,2,requires_grad=True)
print("x.requires_grad:",x.requires_grad)
print("(x ** 2).requires_grad:",(x ** 2).requires_grad)
with torch.no_grad():
    print("with torch.no_grad, (x ** 2).requires_grad:", (x ** 2).requires_grad)
There is one more class that is very important for autograd's implementation: Function.
Tensor and Function are interconnected and build up an acyclic graph that encodes the complete history of computation. Each tensor has a .grad_fn attribute that references the Function that created the Tensor (except for tensors created directly by the user, whose grad_fn is None).
If you need to compute derivatives, call .backward() on a Tensor. If the Tensor is a scalar (i.e. it holds a single element), backward() needs no arguments; if it has more elements, you must pass a gradient argument, a tensor of matching shape (a non-scalar example follows the code below).
# build a two-variable function z = f(x, y) = x^2 + y^2, where x requires grad and y does not
x = torch.tensor(3.0, requires_grad=True)
y = torch.tensor(4.0, requires_grad=False)
# y = torch.tensor(4.0, requires_grad=True)
z = torch.pow(x, 2) + torch.pow(y, 2)
# check whether x, y and z require gradients
print(x.requires_grad)
print(y.requires_grad)
print(z.requires_grad)
# compute the derivatives via backward()
z.backward()
# inspect the derivatives, i.e. the gradients
print(x.grad)
print(y.grad)
'''The output:
True # x requires grad
False # y does not
True # z requires grad, because one of its leaf variables (x) does
tensor(6.) # the derivative with respect to x
None # y does not require grad, so its grad is None
'''
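For the non-scalar case mentioned above, here is a minimal sketch of passing a gradient argument to backward():
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x * 2                                             # y is not a scalar
y.backward(gradient=torch.tensor([1.0, 0.1, 0.01]))  # gradient must match y's shape
print(x.grad)                                         # tensor([2.0000, 0.2000, 0.0200])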
Summary:
- For variables that need gradients, there is no longer any need to wrap them in Variable; just set requires_grad=True. The default is False, which saves unnecessary bookkeeping.
- Variables that don't need gradients can also be declared explicitly; the official recommendation is to wrap the relevant computation in with torch.no_grad():.
- A tensor with requires_grad=True quietly builds a large structure behind the scenes while it computes: the computational graph. The graph links all computation steps (nodes) together, so that when the error is backpropagated, the gradients of every tensor with requires_grad=True are computed in one pass; tensors that never declared requires_grad=True simply don't have this ability.
Activation functions in PyTorch
import torch
import matplotlib.pyplot as plt
import torch.nn.functional as F  # activation functions live here
# generate some fake data to look at the curves
X = torch.linspace(-5, 5, 200, requires_grad=True)  # x data (tensor), shape=(200,)
X_np = X.data.numpy()  # convert the tensor to an ndarray for plotting
# several commonly used activation functions
y_relu = F.relu(input=X).data.numpy()
y_sigmoid = F.sigmoid(input=X).data.numpy()
y_tanh = F.tanh(input=X).data.numpy()
y_softplus = F.softplus(input=X).data.numpy()
y_softmax = F.softmax(input=X).data.numpy()
# visualize the activation functions
plt.figure(1,figsize=(8,6))
plt.subplot(221)
plt.plot(X_np,y_relu,c='red',label='relu')
plt.ylim((-1,5))
plt.legend(loc ="best")
plt.subplot(222)
plt.plot(X_np,y_sigmoid,c='red',label='sigmoid')
plt.ylim((-0.2,1.2))
plt.legend(loc ="best")
plt.subplot(223)
plt.plot(X_np,y_tanh,c='red',label='tanh')
plt.ylim((-1.2,1.2))
plt.legend(loc ="best")
plt.subplot(224)
plt.plot(X_np,y_softplus,c='red',label='softplus')
plt.ylim((-0.1,6))
plt.legend(loc ="best")
plt.savefig("activateFunctions")
plt.show()
Building a regression model
To build a neural network we can use torch's own building blocks directly: first define all the layer attributes (__init__()), then wire the layers together (forward(x)). Activation functions come in when connecting the layers.
import torch
import matplotlib.pyplot as plt
import torch.nn.functional as F
import warnings
warnings.filterwarnings("ignore")
# build a toy dataset
X = torch.unsqueeze(torch.linspace(-1, 1, 100), dim=1)
# unsqueeze adds a new axis, turning the 1-D data into a 2-D column
y = X.pow(2) + 0.2 * torch.rand(X.size()) # noisy y data (tensor), shape=(100, 1)
# visualize the data
# plt.scatter(X.data.numpy(), y.data.numpy())
# plt.show()
# define the neural network
class LRNN(torch.nn.Module):
def __init__(self, n_feature, n_hidden, n_output):
        '''
        Define the network's layers and how they connect:
        n_feature --> n_hidden --> n_output
        :param n_feature: number of input features
        :param n_hidden: number of hidden-layer units
        :param n_output: number of outputs
        '''
super(LRNN, self).__init__()
        self.hidden = torch.nn.Linear(in_features=n_feature, out_features=n_hidden) # the hidden layer is a linear model
self.predict = torch.nn.Linear(n_hidden, n_output)
def forward(self, X):
        y_hidden = F.relu(self.hidden(X)) # pass through the hidden layer and the ReLU activation
y_predict = self.predict(y_hidden)
return y_predict
# train the network
linerNet = LRNN(1, 10, 1)
print(linerNet)
'''
LRNN(
(hidden): Linear(in_features=1, out_features=10, bias=True)
(predict): Linear(in_features=10, out_features=1, bias=True)
)
'''
optimizer = torch.optim.SGD(linerNet.parameters(), lr=0.5) # optimize the parameters with stochastic gradient descent
loss_func = torch.nn.MSELoss() # mean squared error loss
### visualization ###
plt.ion()
plt.show()
## start training
for t in range(200):
y_predict = linerNet(X)
    loss = loss_func(y_predict, y) # prediction first, target second
    optimizer.zero_grad() # zero out the gradients from the previous step
    loss.backward() # compute gradients for all trainable parameters
    optimizer.step() # update the parameters using the gradients just computed
if t % 5 == 0:
plt.cla()
plt.scatter(X.data.numpy(), y.data.numpy())
plt.plot(X.data.numpy(), y_predict.data.numpy(), 'r-', lw=5)
plt.text(0.5, 0, 'Loss=%.4f' % loss.data.numpy(), fontdict={'size': 20, 'color': 'red'})
plt.pause(0.1)
plt.ioff()
plt.show()
The result:
LRNN(
(hidden): Linear(in_features=1, out_features=10, bias=True)
(predict): Linear(in_features=10, out_features=1, bias=True)
)
Building a classification model
import torch
import matplotlib.pyplot as plt
import torch.nn.functional as F
import warnings
import numpy as np
warnings.filterwarnings("ignore")
# build a toy dataset
n_data = torch.ones(100, 2)
x0 = torch.normal(2 * n_data, 1)
y0 = torch.zeros(100)
x1 = torch.normal(-2 * n_data, 1)
y1 = torch.ones(100)
X = torch.cat((x0, x1), 0).type(torch.FloatTensor) # torch.Size([200, 2])
y = torch.cat((y0, y1), ).type(torch.LongTensor) # torch.Size([200])
## visualize the data
# plt.scatter(X.data.numpy()[:,0],X.data.numpy()[:,1],c=y.data.numpy(),s=100, lw=0, cmap='RdYlGn')
# plt.show()
# define the neural network
class LCNN(torch.nn.Module):
def __init__(self, n_feature, n_hidden, n_output):
        '''
        Define the network's layers and how they connect:
        n_feature --> n_hidden --> n_output
        :param n_feature: number of input features
        :param n_hidden: number of hidden-layer units
        :param n_output: number of outputs
        '''
super(LCNN, self).__init__()
        self.hidden = torch.nn.Linear(in_features=n_feature, out_features=n_hidden) # the hidden layer is a linear model
self.predict = torch.nn.Linear(n_hidden, n_output)
def forward(self, X):
        y_hidden = F.relu(self.hidden(X)) # pass through the hidden layer and the ReLU activation
y_predict = self.predict(y_hidden)
return y_predict
net = LCNN(2, 10, 2)
'''
An output close to [0, 1] is classified as class 1;
an output close to [1, 0] is classified as class 0.
'''
plt.ion()
plt.show()
optimizer = torch.optim.SGD(net.parameters(), lr=0.02)
loss_func = torch.nn.CrossEntropyLoss() # cross-entropy loss
for t in range(100):
y_predict = net(X)
loss = loss_func(y_predict, y)
    optimizer.zero_grad() # clear the residual parameter updates from the previous step
    loss.backward() # backpropagate the error and compute the parameter updates
    optimizer.step() # apply the updates to the net's parameters
if t % 2 == 0:
plt.cla()
        tmp = torch.max(F.softmax(y_predict), 1)[1] # softmax turns the scores into probabilities; take the argmax as the predicted class
y_predict = tmp.data.numpy().squeeze()
y_true = y.data.numpy()
plt.scatter(X.data.numpy()[:, 0], X.data.numpy()[:, 1], c=y.data.numpy(), s=100, lw=0, cmap='RdYlGn')
accuracy = sum(y_predict == y_true) / 200
X_np = X.data.numpy()
x_min, x_max = X_np[0, :].min() - 1, X_np[0, :].max() + 0.1
y_min, y_max = X_np[1, :].min() - 1, X_np[1, :].max() + 0.1
plt.text(1.5, -4, 'Accuracy=%.2f' % accuracy, fontdict={'size': 20, 'color': 'red'})
plt.pause(0.1)
plt.ioff()
plt.show()
Output:
Quick model building
import torch
import torch.nn.functional as F
class LCNN(torch.nn.Module):
def __init__(self, n_feature, n_hidden, n_output):
super(LCNN, self).__init__()
        self.hidden = torch.nn.Linear(in_features=n_feature, out_features=n_hidden) # the hidden layer is a linear model
self.predict = torch.nn.Linear(n_hidden, n_output)
def forward(self, X):
        y_hidden = F.relu(self.hidden(X)) # pass through the hidden layer and the ReLU activation
y_predict = self.predict(y_hidden)
return y_predict
net1 = LCNN(2, 10, 2)
## quickly build the same network ##
## with a Sequential container    ##
net2 = torch.nn.Sequential(
torch.nn.Linear(2, 10), # Hidden Layer's input_num and output_num
torch.nn.ReLU(), # after Hidden layer output,the data flow to ReLU
torch.nn.Linear(10, 2) # OutputLayer
)
print(net1)
print(net2)
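Both nets implement the same 2 -> 10 -> 2 structure. One difference to expect in the printouts: net1 does not list the ReLU, because it is applied functionally inside forward() rather than registered as a submodule, whereas net2's Sequential printout should look roughly like:
Sequential(
  (0): Linear(in_features=2, out_features=10, bias=True)
  (1): ReLU()
  (2): Linear(in_features=10, out_features=2, bias=True)
)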
Saving and loading models
import torch
import matplotlib.pyplot as plt
torch.manual_seed(1) # reproducible
# fake data
x = torch.unsqueeze(torch.linspace(-1, 1, 100), dim=1) # x data (tensor), shape=(100, 1)
y = x.pow(2) + 0.2*torch.rand(x.size()) # noisy y data (tensor), shape=(100, 1)
def save():
    # build the network
net1 = torch.nn.Sequential(
torch.nn.Linear(1, 10),
torch.nn.ReLU(),
torch.nn.Linear(10, 1)
)
optimizer = torch.optim.SGD(net1.parameters(), lr=0.2)
loss_func = torch.nn.MSELoss()
    # train
for t in range(200):
prediction = net1(x)
loss = loss_func(prediction, y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
# plot result
prediction = net1(x)
plt.figure(1, figsize=(10, 3))
plt.subplot(131)
plt.title('Net1')
plt.scatter(x.data.numpy(), y.data.numpy())
plt.plot(x.data.numpy(), prediction.data.numpy(), 'r-', lw=5)
    # saving approach 1: the entire net
torch.save(net1, "net.pkl")
    # saving approach 2: only the net's parameters
torch.save(net1.state_dict(), "net_params.pkl")
# loading: the entire net
def restore_net():
net2 = torch.load("net.pkl")
prediction = net2(x)
# plot result
plt.figure(1, figsize=(10, 3))
plt.subplot(132)
plt.title("Net 2")
plt.scatter(x.data.numpy(), y.data.numpy())
plt.plot(x.data.numpy(), prediction.data.numpy(), 'r-', lw=5)
# loading: only the parameters
def restore_params():
    # first build a model with the same structure
net3 = torch.nn.Sequential(
torch.nn.Linear(1, 10),
torch.nn.ReLU(),
torch.nn.Linear(10, 1)
)
    # then load the saved parameters into it
net3.load_state_dict(torch.load("net_params.pkl"))
prediction = net3(x)
# plot result
plt.figure(1, figsize=(10, 3))
plt.subplot(133)
plt.title("Net 3")
plt.scatter(x.data.numpy(), y.data.numpy())
plt.plot(x.data.numpy(), prediction.data.numpy(), 'r-', lw=5)
plt.show()
save()
restore_net()
restore_params()
Visualization output:
pytorch-tutorial
Adapted from https://github.com/yunjey/pytorch-tutorial
import torch
import torchvision
import torch.nn as nn
import numpy as np
import torchvision.transforms as transforms
# ================================================================== #
# Table of Contents #
# ================================================================== #
# 1. Basic autograd example 1 (Line 25 to 39)
# 2. Basic autograd example 2 (Line 46 to 83)
# 3. Loading data from numpy (Line 90 to 97)
# 4. Input pipline (Line 104 to 129)
# 5. Input pipline for custom dataset (Line 136 to 156)
# 6. Pretrained model (Line 163 to 176)
# 7. Save and load model (Line 183 to 189)
# ================================================================== #
# 1. Basic autograd example 1 #
# ================================================================== #
# Create tensors.
x = torch.tensor(1., requires_grad=True)
w = torch.tensor(2., requires_grad=True)
b = torch.tensor(3., requires_grad=True)
# Build a computational graph.
y = w * x + b # y = 2 * x + 3
# Compute gradients.
y.backward()
# Print out the gradients.
print(x.grad) # x.grad = 2
print(w.grad) # w.grad = 1
print(b.grad) # b.grad = 1
# ================================================================== #
# 2. Basic autograd example 2 #
# ================================================================== #
# Create tensors of shape (10, 3) and (10, 2).
x = torch.randn(10, 3)
y = torch.randn(10, 2)
# Build a fully connected layer.
linear = nn.Linear(3, 2)
print ('w: ', linear.weight)
print ('b: ', linear.bias)
# Build loss function and optimizer.
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(linear.parameters(), lr=0.01)
# Forward pass.
pred = linear(x)
# Compute loss.
loss = criterion(pred, y)
print('loss: ', loss.item())
# Backward pass.
loss.backward()
# Print out the gradients.
print ('dL/dw: ', linear.weight.grad)
print ('dL/db: ', linear.bias.grad)
# 1-step gradient descent.
optimizer.step()
# You can also perform gradient descent at the low level.
# linear.weight.data.sub_(0.01 * linear.weight.grad.data)
# linear.bias.data.sub_(0.01 * linear.bias.grad.data)
# Print out the loss after 1-step gradient descent.
pred = linear(x)
loss = criterion(pred, y)
print('loss after 1 step optimization: ', loss.item())
# ================================================================== #
# 3. Loading data from numpy #
# ================================================================== #
# Create a numpy array.
x = np.array([[1, 2], [3, 4]])
# Convert the numpy array to a torch tensor.
y = torch.from_numpy(x)
# Convert the torch tensor to a numpy array.
z = y.numpy()
# ================================================================== #
# 4. Input pipeline #
# ================================================================== #
# Download and construct CIFAR-10 dataset.
train_dataset = torchvision.datasets.CIFAR10(root='../../data/',
train=True,
transform=transforms.ToTensor(),
download=True)
# Fetch one data pair (read data from disk).
image, label = train_dataset[0]
print (image.size())
print (label)
# Data loader (this provides queues and threads in a very simple way).
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
batch_size=64,
shuffle=True)
# When iteration starts, queue and thread start to load data from files.
data_iter = iter(train_loader)
# Mini-batch images and labels.
images, labels = next(data_iter)
# Actual usage of the data loader is as below.
for images, labels in train_loader:
# Training code should be written here.
pass
# ================================================================== #
# 5. Input pipeline for custom dataset #
# ================================================================== #
# You should build your custom dataset as below.
class CustomDataset(torch.utils.data.Dataset):
def __init__(self):
# TODO
# 1. Initialize file paths or a list of file names.
pass
def __getitem__(self, index):
# TODO
# 1. Read one data from file (e.g. using numpy.fromfile, PIL.Image.open).
# 2. Preprocess the data (e.g. torchvision.Transform).
# 3. Return a data pair (e.g. image and label).
pass
def __len__(self):
# You should change 0 to the total size of your dataset.
return 0
# You can then use the prebuilt data loader.
custom_dataset = CustomDataset()
train_loader = torch.utils.data.DataLoader(dataset=custom_dataset,
batch_size=64,
shuffle=True)
# ================================================================== #
# 6. Pretrained model #
# ================================================================== #
# Download and load the pretrained ResNet-18.
resnet = torchvision.models.resnet18(pretrained=True)
# If you want to finetune only the top layer of the model, set as below.
for param in resnet.parameters():
param.requires_grad = False
# Replace the top layer for finetuning.
resnet.fc = nn.Linear(resnet.fc.in_features, 100) # 100 is an example.
# Forward pass.
images = torch.randn(64, 3, 224, 224)
outputs = resnet(images)
print (outputs.size()) # (64, 100)
# ================================================================== #
# 7. Save and load the model #
# ================================================================== #
# Save and load the entire model.
torch.save(resnet, 'model.ckpt')
model = torch.load('model.ckpt')
# Save and load only the model parameters (recommended).
torch.save(resnet.state_dict(), 'params.ckpt')
resnet.load_state_dict(torch.load('params.ckpt'))
References and acknowledgements
莫烦Python (Morvan): https://morvanzhou.github.io/tutorials/machine-learning/torch/3-04-save-reload/
github:https://github.com/yunjey/pytorch-tutorial
pytorch doc:https://pytorch.org/docs/stable/torch.html?highlight=unsqueeze#torch.unsqueeze