Eat PyTorch in 20 Days

Contents

• Eat PyTorch in 20 Days

I. PyTorch's Modeling Workflow
• 1-1, A modeling workflow example for structured data
• 1-2, A modeling workflow example for image data
• 1-3, A modeling workflow example for text data
• 1-4, A modeling workflow example for time-series data

II. PyTorch's Core Concepts
• 2-1, The tensor data structure
• 2-2, The automatic differentiation mechanism
• 2-3, Dynamic computation graphs

III. PyTorch's Hierarchical Structure
• 3-1, A low-level API demonstration
• 3-2, A mid-level API demonstration
• 3-3, A high-level API demonstration

IV. PyTorch's Low-Level API
• 4-1, Structural operations on tensors
• 4-2, Mathematical operations on tensors
• 4-3, nn.functional and nn.Module

V. PyTorch's Mid-Level API
• 5-1, Dataset and DataLoader
• 5-2, Model layers
• 5-3, Loss functions
• 5-4, Visualization with TensorBoard

VI. PyTorch's High-Level API
• 6-1, Three ways to build models
• 6-2, Three ways to train models
• 6-3, Training models on a GPU

Author
lyhue1991: https://github.com/lyhue1991/eat_pytorch_in_20_days

Compiled by
http://eat.woshinlper.com/

I. PyTorch's Modeling Workflow

The general workflow for implementing a neural network model with PyTorch consists of:

1. Prepare the data
2. Define the model
3. Train the model
4. Evaluate the model
5. Use the model
6. Save the model
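Before diving into the examples, the six steps map onto PyTorch code roughly as follows (a minimal sketch with made-up toy data; nothing here belongs to the examples that follow):

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# 1. Prepare the data: wrap tensors into an iterable pipeline (toy data here)
dl_train = DataLoader(TensorDataset(torch.randn(100, 15),
                                    torch.rand(100, 1).round()),
                      batch_size=8, shuffle=True)

# 2. Define the model
net = nn.Sequential(nn.Linear(15, 1), nn.Sigmoid())

# 3. Train the model with a hand-written loop
loss_func = nn.BCELoss()
optimizer = torch.optim.Adam(net.parameters(), lr=0.01)
for features, labels in dl_train:
    optimizer.zero_grad()
    loss = loss_func(net(features), labels)
    loss.backward()
    optimizer.step()

# 4/5. Evaluate and use the model
net.eval()
with torch.no_grad():
    predictions = net(torch.randn(5, 15))

# 6. Save the model (parameters only)
torch.save(net.state_dict(), "./net_parameter.pkl")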
For beginners, the hardest part is actually the data-preparation step.

The kinds of data we usually meet in practice are structured data, image data, text data, and time-series data.

We demonstrate how to model these four kinds of data with PyTorch using four examples: the titanic survival prediction problem, the cifar2 image classification problem, the imdb movie-review classification problem, and the problem of predicting when China's COVID-19 epidemic would end.

1-1, A Modeling Workflow Example for Structured Data

import os
import datetime

# print a time bar
def printbar():
    nowtime = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    print("\n" + "=========="*8 + "%s" % nowtime)

# On macOS, running pytorch and matplotlib in jupyter at the same time
# requires changing this environment variable
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"
1. Prepare the data

The goal of the titanic dataset is to predict, from passenger information, whether a passenger could survive after the Titanic hit an iceberg and sank.

Structured data is usually preprocessed with a Pandas DataFrame.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader, TensorDataset

dftrain_raw = pd.read_csv('./data/titanic/train.csv')
dftest_raw = pd.read_csv('./data/titanic/test.csv')
dftrain_raw.head(10)
Field descriptions:

• Survived: 0 = died, 1 = survived [the y label]
• Pclass: ticket class, three values (1, 2, 3) [convert to one-hot encoding]
• Name: passenger name [dropped]
• Sex: passenger sex [converted to a bool feature]
• Age: passenger age (has missing values) [numeric feature; add "whether age is missing" as an auxiliary feature]
• SibSp: number of siblings/spouses aboard (integer) [numeric feature]
• Parch: number of parents/children aboard (integer) [numeric feature]
• Ticket: ticket number (string) [dropped]
• Fare: ticket fare (float, roughly 0-500) [numeric feature]
• Cabin: passenger cabin (has missing values) [add "whether cabin is missing" as an auxiliary feature]
• Embarked: port of embarkation, S/C/Q (has missing values) [convert to one-hot encoding with four dimensions: S, C, Q, nan]
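As a quick illustration of the one-hot encodings called for above (a small sketch, separate from the preprocessing function shown later in this section), pd.get_dummies with dummy_na=True yields exactly the four Embarked dimensions:

import pandas as pd

s = pd.Series(['S', 'C', 'Q', None])
print(pd.get_dummies(s, dummy_na=True))
#    C  Q  S  NaN
# 0  0  0  1    0
# 1  1  0  0    0
# 2  0  1  0    0
# 3  0  0  0    1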
With Pandas' built-in visualization we can easily do some exploratory data analysis (EDA).

Label distribution:
%matplotlib inline
%config InlineBackend.figure_format = 'png'

ax = dftrain_raw['Survived'].value_counts().plot(kind='bar',
     figsize=(12, 8), fontsize=15, rot=0)
ax.set_ylabel('Counts', fontsize=15)
ax.set_xlabel('Survived', fontsize=15)
plt.show()
Age distribution:
%matplotlib inline
%config InlineBackend.figure_format = 'png'

ax = dftrain_raw['Age'].plot(kind='hist', bins=20, color='purple',
     figsize=(12, 8), fontsize=15)
ax.set_ylabel('Frequency', fontsize=15)
ax.set_xlabel('Age', fontsize=15)
plt.show()
Correlation between age and the label:
%matplotlib inline
%config InlineBackend.figure_format = 'png'

ax = dftrain_raw.query('Survived == 0')['Age'].plot(kind='density',
     figsize=(12, 8), fontsize=15)
dftrain_raw.query('Survived == 1')['Age'].plot(kind='density',
     figsize=(12, 8), fontsize=15)
ax.legend(['Survived==0', 'Survived==1'], fontsize=12)
ax.set_ylabel('Density', fontsize=15)
ax.set_xlabel('Age', fontsize=15)
plt.show()
Below is the formal data preprocessing.
def preprocessing(dfdata):

    dfresult = pd.DataFrame()

    # Pclass
    dfPclass = pd.get_dummies(dfdata['Pclass'])
    dfPclass.columns = ['Pclass_' + str(x) for x in dfPclass.columns]
    dfresult = pd.concat([dfresult, dfPclass], axis=1)

    # Sex
    dfSex = pd.get_dummies(dfdata['Sex'])
    dfresult = pd.concat([dfresult, dfSex], axis=1)

    # Age
    dfresult['Age'] = dfdata['Age'].fillna(0)
    dfresult['Age_null'] = pd.isna(dfdata['Age']).astype('int32')

    # SibSp, Parch, Fare
    dfresult['SibSp'] = dfdata['SibSp']
    dfresult['Parch'] = dfdata['Parch']
    dfresult['Fare'] = dfdata['Fare']

    # Cabin
    dfresult['Cabin_null'] = pd.isna(dfdata['Cabin']).astype('int32')

    # Embarked
    dfEmbarked = pd.get_dummies(dfdata['Embarked'], dummy_na=True)
    dfEmbarked.columns = ['Embarked_' + str(x) for x in dfEmbarked.columns]
    dfresult = pd.concat([dfresult, dfEmbarked], axis=1)

    return dfresult

x_train = preprocessing(dftrain_raw).values
y_train = dftrain_raw[['Survived']].values

x_test = preprocessing(dftest_raw).values
y_test = dftest_raw[['Survived']].values

print("x_train.shape =", x_train.shape)
print("x_test.shape =", x_test.shape)
print("y_train.shape =", y_train.shape)
print("y_test.shape =", y_test.shape)
x_train.shape = (712, 15)
x_test.shape = (179, 15)
y_train.shape = (712, 1)
y_test.shape = (179, 1)
Next we use DataLoader and TensorDataset to wrap the data into an iterable pipeline.
dl_train = DataLoader(TensorDataset(torch.tensor(x_train).float(), torch.tensor(y_train).float()),
                      shuffle=True, batch_size=8)
dl_valid = DataLoader(TensorDataset(torch.tensor(x_test).float(), torch.tensor(y_test).float()),
                      shuffle=False, batch_size=8)

# test the data pipeline
for features, labels in dl_train:
    print(features, labels)
    break
tensor([[ 0.0000,  0.0000,  0.0000,  0.0000,  1.0000,  7.8958,  0.0000,
          1.0000,  1.0000,  0.0000,  0.0000,  0.0000,  1.0000,  1.0000,
          0.0000],
        ...
       ]) tensor([[0.],
        [1.],
        [1.],
        [0.],
        [0.],
        [0.],
        [1.],
        [1.]])
2. Define the model

There are generally three ways to build a model with PyTorch: building it layer by layer with nn.Sequential, subclassing the nn.Module base class to build a custom model, or subclassing nn.Module and organizing the layers with model containers.

Here we choose the simplest: an nn.Sequential model built layer by layer.

def create_net():
    net = nn.Sequential()
    net.add_module("linear1", nn.Linear(15, 20))
    net.add_module("relu1", nn.ReLU())
    net.add_module("linear2", nn.Linear(20, 15))
    net.add_module("relu2", nn.ReLU())
    net.add_module("linear3", nn.Linear(15, 1))
    net.add_module("sigmoid", nn.Sigmoid())
    return net

net = create_net()
print(net)
Sequential(
(linear1): Linear(in_features=15, out_features=20, bias=True)
(relu1): ReLU()
(linear2): Linear(in_features=20, out_features=15, bias=True)
(relu2): ReLU()
(linear3): Linear(in_features=15, out_features=1, bias=True)
(sigmoid): Sigmoid()
)
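For comparison, here is the same network written by subclassing nn.Module, the second construction style mentioned above (a sketch added for illustration; the class name TitanicNet is made up, and the example continues with the nn.Sequential net):

class TitanicNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(15, 20)
        self.linear2 = nn.Linear(20, 15)
        self.linear3 = nn.Linear(15, 1)

    def forward(self, x):
        x = torch.relu(self.linear1(x))
        x = torch.relu(self.linear2(x))
        return torch.sigmoid(self.linear3(x))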
from torchkeras import summary
summary(net, input_shape=(15,))

Layer (type)                Output Shape              Param #
==============================================================
Linear-1                    [-1, 20]                  320
ReLU-2                      [-1, 20]                  0
Linear-3                    [-1, 15]                  315
ReLU-4                      [-1, 15]                  0
Linear-5                    [-1, 1]                   16
Sigmoid-6                   [-1, 1]                   0
==============================================================
Total params: 651
Trainable params: 651
Non-trainable params: 0
Input size (MB): 0.000057
Forward/backward pass size (MB): 0.000549
Params size (MB): 0.002483
Estimated Total Size (MB): 0.003090
3. Train the model

PyTorch usually requires the user to write a custom training loop, and the code style of training loops varies from person to person.

There are three typical styles: a script-style loop, a function-style loop, and a class-style loop.

Here we show a fairly general script style.
from sklearn.metrics import accuracy_score

loss_func = nn.BCELoss()
optimizer = torch.optim.Adam(params=net.parameters(), lr=0.01)
metric_func = lambda y_pred, y_true: accuracy_score(y_true.data.numpy(), y_pred.data.numpy() > 0.5)
metric_name = "accuracy"

epochs = 10
log_step_freq = 30

dfhistory = pd.DataFrame(columns=["epoch", "loss", metric_name, "val_loss", "val_" + metric_name])
print("Start Training...")
nowtime = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
print("==========" * 8 + "%s" % nowtime)

for epoch in range(1, epochs + 1):

    # 1. training loop
    net.train()
    loss_sum = 0.0
    metric_sum = 0.0
    step = 1

    for step, (features, labels) in enumerate(dl_train, 1):

        # zero the gradients
        optimizer.zero_grad()

        # forward pass, compute the loss
        predictions = net(features)
        loss = loss_func(predictions, labels)
        metric = metric_func(predictions, labels)

        # backward pass, compute the gradients
        loss.backward()
        optimizer.step()

        # print batch-level logs
        loss_sum += loss.item()
        metric_sum += metric.item()
        if step % log_step_freq == 0:
            print(("[step = %d] loss: %.3f, " + metric_name + ": %.3f") %
                  (step, loss_sum / step, metric_sum / step))

    # 2. validation loop
    net.eval()
    val_loss_sum = 0.0
    val_metric_sum = 0.0
    val_step = 1

    for val_step, (features, labels) in enumerate(dl_valid, 1):

        predictions = net(features)
        val_loss = loss_func(predictions, labels)
        val_metric = metric_func(predictions, labels)

        val_loss_sum += val_loss.item()
        val_metric_sum += val_metric.item()

    # 3. record the logs
    info = (epoch, loss_sum / step, metric_sum / step,
            val_loss_sum / val_step, val_metric_sum / val_step)
    dfhistory.loc[epoch - 1] = info

    # print epoch-level logs
    print(("\nEPOCH = %d, loss = %.3f, " + metric_name +
           " = %.3f, val_loss = %.3f, " + "val_" + metric_name + " = %.3f") % info)
    nowtime = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    print("\n" + "==========" * 8 + "%s" % nowtime)

print('Finished Training...')
Start Training...
================================================================================20:53:49
[step = 30] loss: 0.703, accuracy: 0.583
[step = 60] loss: 0.629, accuracy: 0.675

EPOCH = 1, loss = 0.643, accuracy = 0.673, val_loss = 0.621, val_accuracy = 0.725
================================================================================20:53:49
[step = 30] loss: 0.653, accuracy: 0.662
[step = 60] loss: 0.624, accuracy: 0.673

EPOCH = 2, loss = 0.621, accuracy = 0.669, val_loss = 0.519, val_accuracy = 0.708
================================================================================20:53:49
[step = 30] loss: 0.582, accuracy: 0.688
[step = 60] loss: 0.555, accuracy: 0.723

EPOCH = 3, loss = 0.543, accuracy = 0.740, val_loss = 0.516, val_accuracy = 0.741
================================================================================20:53:49
[step = 30] loss: 0.563, accuracy: 0.721
[step = 60] loss: 0.528, accuracy: 0.752

EPOCH = 4, loss = 0.515, accuracy = 0.764, val_loss = 0.471, val_accuracy = 0.777
================================================================================20:53:50
[step = 30] loss: 0.433, accuracy: 0.783
[step = 60] loss: 0.477, accuracy: 0.785

EPOCH = 5, loss = 0.489, accuracy = 0.785, val_loss = 0.447, val_accuracy = 0.804
================================================================================20:53:50
[step = 30] loss: 0.460, accuracy: 0.812
[step = 60] loss: 0.477, accuracy: 0.798

EPOCH = 6, loss = 0.474, accuracy = 0.798, val_loss = 0.451, val_accuracy = 0.772
================================================================================20:53:50
[step = 30] loss: 0.516, accuracy: 0.792
[step = 60] loss: 0.496, accuracy: 0.779

EPOCH = 7, loss = 0.473, accuracy = 0.794, val_loss = 0.485, val_accuracy = 0.783
================================================================================20:53:50
[step = 30] loss: 0.472, accuracy: 0.779
[step = 60] loss: 0.487, accuracy: 0.794

EPOCH = 8, loss = 0.474, accuracy = 0.791, val_loss = 0.446, val_accuracy = 0.788
================================================================================20:53:50
[step = 30] loss: 0.492, accuracy: 0.771
[step = 60] loss: 0.445, accuracy: 0.800

EPOCH = 9, loss = 0.464, accuracy = 0.796, val_loss = 0.519, val_accuracy = 0.746
================================================================================20:53:50
[step = 30] loss: 0.436, accuracy: 0.796
[step = 60] loss: 0.460, accuracy: 0.794

EPOCH = 10, loss = 0.462, accuracy = 0.787, val_loss = 0.415, val_accuracy = 0.810
================================================================================20:53:51
Finished Training...
4. Evaluate the model

We first look at how the model performs on the training and validation sets.

dfhistory
%matplotlib inline
%config InlineBackend.figure_format = 'svg'

import matplotlib.pyplot as plt

def plot_metric(dfhistory, metric):
    train_metrics = dfhistory[metric]
    val_metrics = dfhistory['val_' + metric]
    epochs = range(1, len(train_metrics) + 1)
    plt.plot(epochs, train_metrics, 'bo--')
    plt.plot(epochs, val_metrics, 'ro-')
    plt.title('Training and validation ' + metric)
    plt.xlabel("Epochs")
    plt.ylabel(metric)
    plt.legend(["train_" + metric, 'val_' + metric])
    plt.show()

plot_metric(dfhistory, "loss")

plot_metric(dfhistory, "accuracy")

5. Use the model

# predicted probabilities
y_pred_probs = net(torch.tensor(x_test[0:10]).float()).data
y_pred_probs
tensor([[0.0119],
[0.6029],
[0.2970],
[0.5717],
[0.5034],
[0.8655],
[0.0572],
[0.9182],
[0.5038],
[0.1739]])
# predicted classes
y_pred = torch.where(y_pred_probs > 0.5,
                     torch.ones_like(y_pred_probs), torch.zeros_like(y_pred_probs))
y_pred
tensor([[0.],
[1.],
[0.],
[1.],
[1.],
[1.],
[0.],
[1.],
[1.],
[0.]])
6. Save the model

PyTorch has two ways of saving a model, both implemented by calling pickle's serialization methods.

The first saves only the model parameters. The second saves the complete model.

The first is recommended; the second can run into all sorts of problems when switching devices or directories.
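As a hedged illustration of the device-switching pitfall (not part of the original example): a completely pickled model remembers which device its tensors lived on, whereas a state dict can be re-mapped when loading; map_location is a documented argument of torch.load:

# load parameters onto the CPU no matter which device they were saved from
state_dict = torch.load("./data/net_parameter.pkl", map_location="cpu")
net_cpu = create_net()
net_cpu.load_state_dict(state_dict)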
Method 1: save the model parameters (recommended)

print(net.state_dict().keys())

odict_keys(['linear1.weight', 'linear1.bias', 'linear2.weight',
'linear2.bias', 'linear3.weight', 'linear3.bias'])

# save the model parameters
torch.save(net.state_dict(), "./data/net_parameter.pkl")

net_clone = create_net()
net_clone.load_state_dict(torch.load("./data/net_parameter.pkl"))

net_clone.forward(torch.tensor(x_test[0:10]).float()).data
tensor([[0.0119],
[0.6029],
[0.2970],
[0.5717],
[0.5034],
[0.8655],
[0.0572],
[0.9182],
[0.5038],
[0.1739]])
Method 2: save the complete model (not recommended)

torch.save(net, './data/net_model.pkl')
net_loaded = torch.load('./data/net_model.pkl')
net_loaded(torch.tensor(x_test[0:10]).float()).data
tensor([[0.0119],
[0.6029],
[0.2970],
[0.5717],
[0.5034],
[0.8655],
[0.0572],
[0.9182],
[0.5038],
[0.1739]])
1-2, A Modeling Workflow Example for Image Data

import os
import datetime

# print a time bar
def printbar():
    nowtime = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    print("\n" + "=========="*8 + "%s" % nowtime)

# On macOS, running pytorch and matplotlib in jupyter at the same time
# requires changing this environment variable
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"
1. Prepare the data

The cifar2 dataset is a subset of cifar10 containing only the first two classes, airplane and automobile.

The training set has 5000 images of airplanes and 5000 of automobiles; the test set has 1000 of each.

The goal of the cifar2 task is to train a model that classifies images as airplane or automobile.

Our cifar2 dataset files are organized as follows.
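The layout below is inferred from the folder paths and the class_to_idx output that appear in the code (a sketch standing in for the original figure, which did not survive):

cifar2
├── train
│   ├── 0_airplane    (5000 images)
│   └── 1_automobile  (5000 images)
└── test
    ├── 0_airplane    (1000 images)
    └── 1_automobile  (1000 images)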


 
There are two usual ways to build an image data pipeline in PyTorch.

The first is to read the images with datasets.ImageFolder from torchvision, and then load them in parallel with DataLoader.

The second is to subclass torch.utils.data.Dataset and implement your own reading logic, and then load in parallel with DataLoader.

The second method is the general-purpose way to read user-defined datasets; it works for image datasets as well as text datasets.

This section demonstrates the first method; a sketch of the second follows.
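For reference, a minimal sketch of the second method (the class name and the (path, label) list format are assumptions for illustration, not code from the original):

import torch
from PIL import Image
from torch.utils.data import Dataset

class ImageListDataset(Dataset):
    # reads (path, label) pairs, e.g. ("./data/cifar2/train/0_airplane/0.jpg", 0)
    def __init__(self, items, transform=None):
        self.items = items
        self.transform = transform

    def __len__(self):
        return len(self.items)

    def __getitem__(self, index):
        path, label = self.items[index]
        img = Image.open(path).convert("RGB")
        if self.transform is not None:
            img = self.transform(img)
        return img, torch.tensor([label]).float()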
import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms, datasets

transform_train = transforms.Compose([transforms.ToTensor()])
transform_valid = transforms.Compose([transforms.ToTensor()])

ds_train = datasets.ImageFolder("./data/cifar2/train/",
            transform=transform_train, target_transform=lambda t: torch.tensor([t]).float())
ds_valid = datasets.ImageFolder("./data/cifar2/test/",
            transform=transform_valid, target_transform=lambda t: torch.tensor([t]).float())

print(ds_train.class_to_idx)
{'0_airplane': 0, '1_automobile': 1}
dl_train = DataLoader(ds_train, batch_size=50, shuffle=True, num_workers=3)
dl_valid = DataLoader(ds_valid, batch_size=50, shuffle=True, num_workers=3)
%matplotlib inline
%config InlineBackend.figure_format = 'svg'

# view a few samples
from matplotlib import pyplot as plt

plt.figure(figsize=(8, 8))
for i in range(9):
    img, label = ds_train[i]
    img = img.permute(1, 2, 0)
    ax = plt.subplot(3, 3, i + 1)
    ax.imshow(img.numpy())
    ax.set_title("label = %d" % label.item())
    ax.set_xticks([])
    ax.set_yticks([])
plt.show()
# By default PyTorch image tensors are ordered as Batch, Channel, Height, Width
for x, y in dl_train:
    print(x.shape, y.shape)
    break

torch.Size([50, 3, 32, 32]) torch.Size([50, 1])
2. Define the model

There are generally three ways to build a model with PyTorch: building it layer by layer with nn.Sequential, subclassing the nn.Module base class to build a custom model, or subclassing nn.Module and organizing the layers with model containers (nn.Sequential, nn.ModuleList, nn.ModuleDict).

Here we choose to build a custom model by subclassing the nn.Module base class.
# test the effect of AdaptiveMaxPool2d
pool = nn.AdaptiveMaxPool2d((1, 1))
t = torch.randn(10, 8, 32, 32)
pool(t).shape
torch.Size([10, 8, 1, 1])
class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=5)
        self.dropout = nn.Dropout2d(p=0.1)
        self.adaptive_pool = nn.AdaptiveMaxPool2d((1, 1))
        self.flatten = nn.Flatten()
        self.linear1 = nn.Linear(64, 32)
        self.relu = nn.ReLU()
        self.linear2 = nn.Linear(32, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        x = self.conv1(x)
        x = self.pool(x)
        x = self.conv2(x)
        x = self.pool(x)
        x = self.dropout(x)
        x = self.adaptive_pool(x)
        x = self.flatten(x)
        x = self.linear1(x)
        x = self.relu(x)
        x = self.linear2(x)
        y = self.sigmoid(x)
        return y

net = Net()
print(net)
Net(
  (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1))
  (dropout): Dropout2d(p=0.1, inplace=False)
  (adaptive_pool): AdaptiveMaxPool2d(output_size=(1, 1))
  (flatten): Flatten()
  (linear1): Linear(in_features=64, out_features=32, bias=True)
  (relu): ReLU()
  (linear2): Linear(in_features=32, out_features=1, bias=True)
  (sigmoid): Sigmoid()
)
import torchkeras
torchkeras.summary(net, input_shape=(3, 32, 32))

Layer (type)                Output Shape              Param #
==============================================================
Conv2d-1                    [-1, 32, 30, 30]          896
MaxPool2d-2                 [-1, 32, 15, 15]          0
Conv2d-3                    [-1, 64, 11, 11]          51,264
MaxPool2d-4                 [-1, 64, 5, 5]            0
Dropout2d-5                 [-1, 64, 5, 5]            0
AdaptiveMaxPool2d-6         [-1, 64, 1, 1]            0
Flatten-7                   [-1, 64]                  0
Linear-8                    [-1, 32]                  2,080
ReLU-9                      [-1, 32]                  0
Linear-10                   [-1, 1]                   33
Sigmoid-11                  [-1, 1]                   0
==============================================================
Total params: 54,273
Trainable params: 54,273
Non-trainable params: 0
Input size (MB): 0.011719
Forward/backward pass size (MB): 0.359634
Params size (MB): 0.207035
Estimated Total Size (MB): 0.578388
3. Train the model

PyTorch usually requires the user to write a custom training loop, and the code style of training loops varies from person to person.

There are three typical styles: a script-style loop, a function-style loop, and a class-style loop.

Here we show a fairly general function style.
import pandas as pd
from sklearn.metrics import roc_auc_score

model = net
model.optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
model.loss_func = torch.nn.BCELoss()
model.metric_func = lambda y_pred, y_true: roc_auc_score(y_true.data.numpy(), y_pred.data.numpy())
model.metric_name = "auc"
def train_step(model, features, labels):

    # training mode: the dropout layer takes effect
    model.train()

    # zero the gradients
    model.optimizer.zero_grad()

    # forward pass, compute the loss
    predictions = model(features)
    loss = model.loss_func(predictions, labels)
    metric = model.metric_func(predictions, labels)

    # backward pass, compute the gradients
    loss.backward()
    model.optimizer.step()

    return loss.item(), metric.item()

def valid_step(model, features, labels):

    # evaluation mode: the dropout layer does not take effect
    model.eval()

    predictions = model(features)
    loss = model.loss_func(predictions, labels)
    metric = model.metric_func(predictions, labels)

    return loss.item(), metric.item()
# test the effect of train_step
features, labels = next(iter(dl_train))
train_step(model, features, labels)
(0.6922046542167664, 0.5088566827697262)
def train_model(model, epochs, dl_train, dl_valid, log_step_freq):

    metric_name = model.metric_name
    dfhistory = pd.DataFrame(columns=["epoch", "loss", metric_name, "val_loss", "val_" + metric_name])
    print("Start Training...")
    nowtime = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    print("==========" * 8 + "%s" % nowtime)

    for epoch in range(1, epochs + 1):

        # 1. training loop
        loss_sum = 0.0
        metric_sum = 0.0
        step = 1

        for step, (features, labels) in enumerate(dl_train, 1):

            loss, metric = train_step(model, features, labels)

            # print batch-level logs
            loss_sum += loss
            metric_sum += metric
            if step % log_step_freq == 0:
                print(("[step = %d] loss: %.3f, " + metric_name + ": %.3f") %
                      (step, loss_sum / step, metric_sum / step))

        # 2. validation loop
        val_loss_sum = 0.0
        val_metric_sum = 0.0
        val_step = 1

        for val_step, (features, labels) in enumerate(dl_valid, 1):
            val_loss, val_metric = valid_step(model, features, labels)
            val_loss_sum += val_loss
            val_metric_sum += val_metric

        # 3. record the logs
        info = (epoch, loss_sum / step, metric_sum / step,
                val_loss_sum / val_step, val_metric_sum / val_step)
        dfhistory.loc[epoch - 1] = info

        # print epoch-level logs
        print(("\nEPOCH = %d, loss = %.3f, " + metric_name +
               " = %.3f, val_loss = %.3f, " + "val_" + metric_name + " = %.3f") % info)
        nowtime = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
        print("\n" + "==========" * 8 + "%s" % nowtime)

    print('Finished Training...')
    return dfhistory
epochs = 20

dfhistory = train_model(model, epochs, dl_train, dl_valid, log_step_freq=50)
Start Training...
================================================================================20:47:56
[step = 50] loss: 0.691, auc: 0.627
[step = 100] loss: 0.690, auc: 0.673
[step = 150] loss: 0.688, auc: 0.699
[step = 200] loss: 0.686, auc: 0.716

EPOCH = 1, loss = 0.686, auc = 0.716, val_loss = 0.678, val_auc = 0.806
================================================================================20:48:18
[step = 50] loss: 0.677, auc: 0.780
[step = 100] loss: 0.675, auc: 0.775
[step = 150] loss: 0.672, auc: 0.782
[step = 200] loss: 0.669, auc: 0.779

EPOCH = 2, loss = 0.669, auc = 0.779, val_loss = 0.651, val_auc = 0.815
================================================================================20:54:24
[step = 50] loss: 0.386, auc: 0.914
[step = 100] loss: 0.392, auc: 0.913
[step = 150] loss: 0.395, auc: 0.911
[step = 200] loss: 0.398, auc: 0.911

EPOCH = 19, loss = 0.398, auc = 0.911, val_loss = 0.449, val_auc = 0.924
================================================================================20:54:43
[step = 50] loss: 0.416, auc: 0.917
[step = 100] loss: 0.417, auc: 0.916
[step = 150] loss: 0.404, auc: 0.918
[step = 200] loss: 0.402, auc: 0.918

EPOCH = 20, loss = 0.402, auc = 0.918, val_loss = 0.535, val_auc = 0.925
================================================================================20:55:03
Finished Training...
4. Evaluate the model

dfhistory
%matplotlib inline
%config InlineBackend.figure_format = 'svg'

import matplotlib.pyplot as plt

def plot_metric(dfhistory, metric):
    train_metrics = dfhistory[metric]
    val_metrics = dfhistory['val_' + metric]
    epochs = range(1, len(train_metrics) + 1)
    plt.plot(epochs, train_metrics, 'bo--')
    plt.plot(epochs, val_metrics, 'ro-')
    plt.title('Training and validation ' + metric)
    plt.xlabel("Epochs")
    plt.ylabel(metric)
    plt.legend(["train_" + metric, 'val_' + metric])
    plt.show()

plot_metric(dfhistory, "loss")

plot_metric(dfhistory, "auc")
5. Use the model
def predict(model, dl):
    model.eval()
    result = torch.cat([model.forward(t[0]) for t in dl])
    return result.data

# predicted probabilities
y_pred_probs = predict(model, dl_valid)
y_pred_probs
tensor([[8.4032e-01],
[1.0407e-02],
[5.4146e-04],
[1.4471e-02],
[1.7673e-02],
[4.5081e-01]])
# predicted classes
y_pred = torch.where(y_pred_probs > 0.5,
                     torch.ones_like(y_pred_probs), torch.zeros_like(y_pred_probs))
y_pred
tensor([[1.],
[0.],
[0.],
[0.],
[0.],
[0.]])
6. Save the model

Saving the parameters is the recommended way to save a PyTorch model.

print(model.state_dict().keys())
odict_keys(['conv1.weight', 'conv1.bias', 'conv2.weight', 'conv2.bias',
'linear1.weight', 'linear1.bias', 'linear2.weight', 'linear2.bias'])
# save the model parameters
torch.save(model.state_dict(), "./data/model_parameter.pkl")

net_clone = Net()
net_clone.load_state_dict(torch.load("./data/model_parameter.pkl"))

predict(net_clone, dl_valid)
tensor([[0.0204],
[0.7692],
[0.4967],
[0.6078],
[0.7182],
[0.8251]])
1-3, A Modeling Workflow Example for Text Data

1. Prepare the data

The goal of the imdb dataset is to predict the sentiment label of a movie review from its text.

The training set has 20000 movie-review texts and the test set has 5000, with positive and negative reviews each accounting for half.

Preprocessing text data is relatively tedious. It involves word segmentation for Chinese text (not needed in this example), building a vocabulary, encoding tokens, padding sequences, building a data pipeline, and so on.

In torch, text is generally preprocessed with torchtext or with a custom Dataset. torchtext is very powerful: it can build datasets for text classification, sequence tagging, question answering, machine translation and other NLP tasks.

Below we only demonstrate using it to build a text-classification dataset.

A fairly complete tutorial is the Zhihu article "pytorch学习笔记—Torchtext": https://zhuanlan.zhihu.com/p/65833208
An overview of common torchtext APIs:

• torchtext.data.Example: represents one sample, holding its data and label
• torchtext.vocab.Vocab: the vocabulary; can load pretrained word vectors
• torchtext.data.Datasets: dataset class whose __getitem__ returns an Example instance; torchtext.data.TabularDataset is a subclass
• torchtext.data.Field: defines how a field (text field, label field) is processed, including preprocessing when an Example is created and some batching-time operations
• torchtext.data.Iterator: an iterator used to generate batches
• torchtext.datasets: contains common datasets
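A toy sketch of how these pieces fit together (this assumes the legacy torchtext API, roughly version 0.8 or earlier, which is what this section uses):

import torchtext

TEXT = torchtext.data.Field(sequential=True, lower=True)
LABEL = torchtext.data.Field(sequential=False, use_vocab=False)
fields = [("label", LABEL), ("text", TEXT)]

# Example.fromlist builds one sample from raw values plus the field definitions
example = torchtext.data.Example.fromlist([1, "A fine movie."], fields)
ds = torchtext.data.Dataset([example], fields)
TEXT.build_vocab(ds)  # the Vocab is built from the dataset's text field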


 
import torch
import string, re
import torchtext

MAX_WORDS = 10000  # consider only the 10000 most frequent words
MAX_LEN = 200      # keep 200 words per sample
BATCH_SIZE = 20

# tokenization method
tokenizer = lambda x: re.sub('[%s]' % string.punctuation, "", x).split(" ")

# filter out low-frequency words
def filterLowFreqWords(arr, vocab):
    arr = [[x if x < MAX_WORDS else 0 for x in example]
           for example in arr]
    return arr

# 1. define how each field is preprocessed
TEXT = torchtext.data.Field(sequential=True, tokenize=tokenizer, lower=True,
                            fix_length=MAX_LEN, postprocessing=filterLowFreqWords)
LABEL = torchtext.data.Field(sequential=False, use_vocab=False)

# 2. build a tabular dataset
# torchtext.data.TabularDataset can read csv, tsv, json and other formats
ds_train, ds_test = torchtext.data.TabularDataset.splits(
    path='./data/imdb', train='train.tsv', test='test.tsv', format='tsv',
    fields=[('label', LABEL), ('text', TEXT)], skip_header=False)

# 3. build the vocabulary
TEXT.build_vocab(ds_train)

# 4. build data-pipeline iterators
train_iter, test_iter = torchtext.data.Iterator.splits(
    (ds_train, ds_test), sort_within_batch=True, sort_key=lambda x: len(x.text),
    batch_sizes=(BATCH_SIZE, BATCH_SIZE))
# view information about one example
print(ds_train[0].text)
print(ds_train[0].label)
['it', 'really', 'boggles', 'my', 'mind', 'when', 'someone', 'comes',
'across', 'a', 'movie', 'like', 'this', 'and', 'claims', 'it', 'to', 'be',
'one', 'of', 'the', 'worst', 'slasher', 'films', 'out', 'there', 'this',
'is', 'by', 'far', 'not', 'one', 'of', 'the', 'worst', 'out', 'there',
'still', 'not', 'a', 'good', 'movie', 'but', 'not', 'the', 'worst',
'nonetheless', 'go', 'see', 'something', 'like', 'death', 'nurse', 'or',
'blood', 'lake', 'and', 'then', 'come', 'back', 'to', 'me', 'and', 'tell',
'me', 'if', 'you', 'think', 'the', 'night', 'brings', 'charlie', 'is', 'the',
'worst', 'the', 'film', 'has', 'decent', 'camera', 'work', 'and', 'editing',
'which', 'is', 'way', 'more', 'than', 'i', 'can', 'say', 'for', 'many',
'more', 'extremely', 'obscure', 'slasher', 'filmsbr', 'br', 'the', 'film',
'doesnt', 'deliver', 'on', 'the', 'onscreen', 'deaths', 'theres', 'one',
'death', 'where', 'you', 'see', 'his', 'pruning', 'saw', 'rip', 'into', 'a',
'neck', 'but', 'all', 'other', 'deaths', 'are', 'hardly', 'interesting',
'but', 'the', 'lack', 'of', 'onscreen', 'graphic', 'violence', 'doesnt',
'mean', 'this', 'isnt', 'a', 'slasher', 'film', 'just', 'a', 'bad', 'onebr',
'br', 'the', 'film', 'was', 'obviously', 'intended', 'not', 'to', 'be',
'taken', 'too', 'seriously', 'the', 'film', 'came', 'in', 'at', 'the', 'end',
'of', 'the', 'second', 'slasher', 'cycle', 'so', 'it', 'certainly', 'was',
'a', 'reflection', 'on', 'traditional', 'slasher', 'elements', 'done', 'in',
'a', 'tongue', 'in', 'cheek', 'way', 'for', 'example', 'after', 'a', 'kill',
'charlie', 'goes', 'to', 'the', 'towns', 'welcome', 'sign', 'and', 'marks',
'the', 'population', 'down', 'one', 'less', 'this', 'is', 'something',
'that', 'can', 'only', 'get', 'a', 'laughbr', 'br', 'if', 'youre', 'into',
'slasher', 'films', 'definitely', 'give', 'this', 'film', 'a', 'watch', 'it',
'is', 'slightly', 'different', 'than', 'your', 'usual', 'slasher', 'film',
'with', 'possibility', 'of', 'two', 'killers', 'but', 'not', 'by', 'much',
'the', 'comedy', 'of', 'the', 'movie', 'is', 'pretty', 'much', 'telling',
'the', 'audience', 'to', 'relax', 'and', 'not', 'take', 'the', 'movie', 'so',
'god', 'darn', 'serious', 'you', 'may', 'forget', 'the', 'movie', 'you',
'may', 'remember', 'it', 'ill', 'remember', 'it', 'because', 'i', 'love',
'the', 'name']
0
# view vocabulary information
print(len(TEXT.vocab))

# itos: index to string
print(TEXT.vocab.itos[0])
print(TEXT.vocab.itos[1])

# stoi: string to index
print(TEXT.vocab.stoi['<unk>'])  # unknown token
print(TEXT.vocab.stoi['<pad>'])  # padding token

# freqs: word frequencies
print(TEXT.vocab.freqs['<unk>'])
print(TEXT.vocab.freqs['a'])
print(TEXT.vocab.freqs['good'])
108197
<unk>
<pad>
0
1
0
129453
11457
# view information about the data pipeline
# note the pitfall: dimension 0 of text is the sentence length
for batch in train_iter:
    features = batch.text
    labels = batch.label
    print(features)
    print(features.shape)
    print(labels)
    break
tensor([[  17,   31,  148,  ...,   54,   11,  201],
        ...])
torch.Size([200, 20])
tensor([0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0])
# organize the pipeline to yield (features, label) in the same form as torch.utils.data.DataLoader
class DataLoader:
    def __init__(self, data_iter):
        self.data_iter = data_iter
        self.length = len(data_iter)

    def __len__(self):
        return self.length

    def __iter__(self):
        # note: make features batch-first here, and adjust the label's shape and dtype
        for batch in self.data_iter:
            yield (torch.transpose(batch.text, 0, 1),
                   torch.unsqueeze(batch.label.float(), dim=1))

dl_train = DataLoader(train_iter)
dl_test = DataLoader(test_iter)
2. Define the model

There are generally three ways to build a model with PyTorch: building it layer by layer with nn.Sequential, subclassing the nn.Module base class to build a custom model, or subclassing nn.Module and organizing the layers with model containers (nn.Sequential, nn.ModuleList, nn.ModuleDict).

Here we use the third approach.

Since the training loop that follows is class-style, we wrap the model into the torchkeras.Model class to get functionality similar to Keras' high-level model API.

The Model class actually inherits from nn.Module.
import torch
from torch import nn
import torchkeras

torch.random.seed()
class Net(torchkeras.Model):

    def __init__(self):
        super(Net, self).__init__()

        # with padding_idx set, the padding token keeps the zero vector throughout training
        self.embedding = nn.Embedding(num_embeddings=MAX_WORDS, embedding_dim=3, padding_idx=1)
        self.conv = nn.Sequential()
        self.conv.add_module("conv_1", nn.Conv1d(in_channels=3, out_channels=16, kernel_size=5))
        self.conv.add_module("pool_1", nn.MaxPool1d(kernel_size=2))
        self.conv.add_module("relu_1", nn.ReLU())
        self.conv.add_module("conv_2", nn.Conv1d(in_channels=16, out_channels=128, kernel_size=2))
        self.conv.add_module("pool_2", nn.MaxPool1d(kernel_size=2))
        self.conv.add_module("relu_2", nn.ReLU())
        self.dense = nn.Sequential()
        self.dense.add_module("flatten", nn.Flatten())
        self.dense.add_module("linear", nn.Linear(6144, 1))
        self.dense.add_module("sigmoid", nn.Sigmoid())

    def forward(self, x):
        x = self.embedding(x).transpose(1, 2)
        x = self.conv(x)
        y = self.dense(x)
        return y

model = Net()
print(model)

model.summary(input_shape=(200,), input_dtype=torch.LongTensor)
Net(
(embedding): Embedding(10000, 3, padding_idx=1)
(conv): Sequential(
(conv_1): Conv1d(3, 16, kernel_size=(5,), stride=(1,))
(pool_1): MaxPool1d(kernel_size=2, stride=2, padding=0, dilation=1,
ceil_mode=False)
(relu_1): ReLU()
(conv_2): Conv1d(16, 128, kernel_size=(2,), stride=(1,))
(pool_2): MaxPool1d(kernel_size=2, stride=2, padding=0, dilation=1,
ceil_mode=False)
(relu_2): ReLU()
)
(dense): Sequential(
(flatten): Flatten()
(linear): Linear(in_features=6144, out_features=1, bias=True)
(sigmoid): Sigmoid()
)
)
Layer (type)                Output Shape              Param #
==============================================================
Embedding-1                 [-1, 200, 3]              30,000
Conv1d-2                    [-1, 16, 196]             256
MaxPool1d-3                 [-1, 16, 98]              0
ReLU-4                      [-1, 16, 98]              0
Conv1d-5                    [-1, 128, 97]             4,224
MaxPool1d-6                 [-1, 128, 48]             0
ReLU-7                      [-1, 128, 48]             0
Flatten-8                   [-1, 6144]                0
Linear-9                    [-1, 1]                   6,145
Sigmoid-10                  [-1, 1]                   0
==============================================================
Total params: 40,625
Trainable params: 40,625
Non-trainable params: 0
Input size (MB): 0.000763
Forward/backward pass size (MB): 0.287796
Params size (MB): 0.154972
Estimated Total Size (MB): 0.443531
3. Train the model

Training in PyTorch usually requires the user to write a custom training loop, and the code style of training loops varies from person to person.

There are three typical styles: a script-style loop, a function-style loop, and a class-style loop.

Here we show a class-style training loop.

Following Keras, we define a high-level model interface, Model, implementing fit, validate, predict and summary methods, which amounts to a user-defined high-level API.
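A minimal sketch of what such a wrapper might look like (an assumption about the idea behind torchkeras, stripped down to compile and fit; the real class also runs validation, computes metrics and prints logs):

import torch
from torch import nn

class KerasStyleModel(nn.Module):  # illustrative name, not the torchkeras class
    def compile(self, loss_func, optimizer, metrics_dict=None):
        self.loss_func = loss_func
        self.optimizer = optimizer
        self.metrics_dict = metrics_dict or {}

    def fit(self, epochs, dl_train):
        for epoch in range(1, epochs + 1):
            self.train()
            for features, labels in dl_train:
                self.optimizer.zero_grad()
                loss = self.loss_func(self(features), labels)
                loss.backward()
                self.optimizer.step()

A subclass only has to define forward and then call compile and fit, which is exactly how Net is used below.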
# accuracy
def accuracy(y_pred, y_true):
    y_pred = torch.where(y_pred > 0.5, torch.ones_like(y_pred, dtype=torch.float32),
                         torch.zeros_like(y_pred, dtype=torch.float32))
    acc = torch.mean(1 - torch.abs(y_true - y_pred))
    return acc

model.compile(loss_func=nn.BCELoss(),
              optimizer=torch.optim.Adagrad(model.parameters(), lr=0.02),
              metrics_dict={"accuracy": accuracy})

# training sometimes fails to converge; you may need to try a few times
dfhistory = model.fit(20, dl_train, dl_val=dl_test, log_step_freq=200)
Start Training ...
================================================================================17:53:56
{'step': 200, 'loss': 1.127, 'accuracy': 0.504}
{'step': 400, 'loss': 0.908, 'accuracy': 0.517}
{'step': 600, 'loss': 0.833, 'accuracy': 0.531}
{'step': 800, 'loss': 0.793, 'accuracy': 0.545}
{'step': 1000, 'loss': 0.765, 'accuracy': 0.56}

+-------+-------+----------+----------+--------------+
| epoch |  loss | accuracy | val_loss | val_accuracy |
+-------+-------+----------+----------+--------------+
|   1   | 0.765 |   0.56   |   0.64   |     0.64     |
+-------+-------+----------+----------+--------------+
================================================================================17:54:23
{'step': 200, 'loss': 0.626, 'accuracy': 0.659}
{'step': 400, 'loss': 0.621, 'accuracy': 0.662}
{'step': 600, 'loss': 0.616, 'accuracy': 0.664}
{'step': 800, 'loss': 0.61, 'accuracy': 0.671}
{'step': 1000, 'loss': 0.603, 'accuracy': 0.677}

+-------+-------+----------+----------+--------------+
| epoch |  loss | accuracy | val_loss | val_accuracy |
+-------+-------+----------+----------+--------------+
|   2   | 0.603 |  0.677   |  0.577   |    0.705     |
+-------+-------+----------+----------+--------------+
================================================================================17:54:50
{'step': 200, 'loss': 0.545, 'accuracy': 0.726}
{'step': 400, 'loss': 0.538, 'accuracy': 0.735}
{'step': 600, 'loss': 0.532, 'accuracy': 0.737}
{'step': 800, 'loss': 0.531, 'accuracy': 0.737}
{'step': 1000, 'loss': 0.528, 'accuracy': 0.739}

+-------+-------+----------+----------+--------------+
| epoch |  loss | accuracy | val_loss | val_accuracy |
+-------+-------+----------+----------+--------------+
|   3   | 0.528 |  0.739   |  0.536   |    0.739     |
+-------+-------+----------+----------+--------------+
================================================================================17:55:18
{'step': 200, 'loss': 0.488, 'accuracy': 0.773}
{'step': 400, 'loss': 0.482, 'accuracy': 0.774}
{'step': 600, 'loss': 0.482, 'accuracy': 0.773}
{'step': 800, 'loss': 0.479, 'accuracy': 0.773}
{'step': 1000, 'loss': 0.473, 'accuracy': 0.776}

+-------+-------+----------+----------+--------------+
| epoch |  loss | accuracy | val_loss | val_accuracy |
+-------+-------+----------+----------+--------------+
|   4   | 0.473 |  0.776   |  0.504   |    0.766     |
+-------+-------+----------+----------+--------------+
================================================================================17:55:45
{'step': 200, 'loss': 0.446, 'accuracy': 0.789}
{'step': 400, 'loss': 0.437, 'accuracy': 0.796}
{'step': 600, 'loss': 0.436, 'accuracy': 0.799}
{'step': 800, 'loss': 0.436, 'accuracy': 0.798}
{'step': 1000, 'loss': 0.434, 'accuracy': 0.8}

+-------+-------+----------+----------+--------------+
| epoch |  loss | accuracy | val_loss | val_accuracy |
+-------+-------+----------+----------+--------------+
|   5   | 0.434 |   0.8    |  0.481   |    0.774     |
+-------+-------+----------+----------+--------------+
================================================================================17:56:12
{'step': 200, 'loss': 0.404, 'accuracy': 0.817}
{'step': 400, 'loss': 0.4, 'accuracy': 0.819}
{'step': 600, 'loss': 0.398, 'accuracy': 0.821}
{'step': 800, 'loss': 0.402, 'accuracy': 0.818}
{'step': 1000, 'loss': 0.402, 'accuracy': 0.817}

+-------+-------+----------+----------+--------------+
| epoch |  loss | accuracy | val_loss | val_accuracy |
+-------+-------+----------+----------+--------------+
|   6   | 0.402 |  0.817   |   0.47   |    0.781     |
+-------+-------+----------+----------+--------------+
================================================================================17:56:39
{'step': 200, 'loss': 0.369, 'accuracy': 0.834}
{'step': 400, 'loss': 0.374, 'accuracy': 0.833}
{'step': 600, 'loss': 0.373, 'accuracy': 0.834}
{'step': 800, 'loss': 0.374, 'accuracy': 0.834}
{'step': 1000, 'loss': 0.375, 'accuracy': 0.833}

+-------+-------+----------+----------+--------------+
| epoch |  loss | accuracy | val_loss | val_accuracy |
+-------+-------+----------+----------+--------------+
|   7   | 0.375 |  0.833   |  0.468   |    0.787     |
+-------+-------+----------+----------+--------------+
================================================================================17:57:06
{'step': 200, 'loss': 0.36, 'accuracy': 0.839}
{'step': 400, 'loss': 0.355, 'accuracy': 0.846}
{'step': 600, 'loss': 0.35, 'accuracy': 0.849}
{'step': 800, 'loss': 0.353, 'accuracy': 0.846}
{'step': 1000, 'loss': 0.352, 'accuracy': 0.847}

+-------+-------+----------+----------+--------------+
| epoch |  loss | accuracy | val_loss | val_accuracy |
+-------+-------+----------+----------+--------------+
|   8   | 0.352 |  0.847   |  0.461   |    0.791     |
+-------+-------+----------+----------+--------------+
================================================================================17:57:33
{'step': 200, 'loss': 0.313, 'accuracy': 0.867}
{'step': 400, 'loss': 0.326, 'accuracy': 0.862}
{'step': 600, 'loss': 0.331, 'accuracy': 0.86}
{'step': 800, 'loss': 0.333, 'accuracy': 0.859}
{'step': 1000, 'loss': 0.332, 'accuracy': 0.859}

+-------+-------+----------+----------+--------------+
| epoch |  loss | accuracy | val_loss | val_accuracy |
+-------+-------+----------+----------+--------------+
|   9   | 0.332 |  0.859   |  0.462   |    0.789     |
+-------+-------+----------+----------+--------------+
================================================================================17:58:00
{'step': 200, 'loss': 0.309, 'accuracy': 0.869}
{'step': 400, 'loss': 0.31, 'accuracy': 0.872}
{'step': 600, 'loss': 0.31, 'accuracy': 0.871}
{'step': 800, 'loss': 0.311, 'accuracy': 0.869}
{'step': 1000, 'loss': 0.314, 'accuracy': 0.869}

+-------+-------+----------+----------+--------------+
| epoch |  loss | accuracy | val_loss | val_accuracy |
+-------+-------+----------+----------+--------------+
|   10  | 0.314 |  0.869   |   0.46   |    0.793     |
+-------+-------+----------+----------+--------------+
================================================================================17:58:26
{'step': 200, 'loss': 0.3, 'accuracy': 0.88}
{'step': 400, 'loss': 0.293, 'accuracy': 0.881}
{'step': 600, 'loss': 0.297, 'accuracy': 0.878}
{'step': 800, 'loss': 0.299, 'accuracy': 0.877}
{'step': 1000, 'loss': 0.297, 'accuracy': 0.878}

+-------+-------+----------+----------+--------------+
| epoch |  loss | accuracy | val_loss | val_accuracy |
+-------+-------+----------+----------+--------------+
|   11  | 0.297 |  0.878   |  0.471   |    0.789     |
+-------+-------+----------+----------+--------------+
================================================================================17:58:54
{'step': 200, 'loss': 0.275, 'accuracy': 0.891}
{'step': 400, 'loss': 0.282, 'accuracy': 0.887}
{'step': 600, 'loss': 0.283, 'accuracy': 0.888}
{'step': 800, 'loss': 0.283, 'accuracy': 0.887}
{'step': 1000, 'loss': 0.282, 'accuracy': 0.886}

+-------+-------+----------+----------+--------------+
| epoch |  loss | accuracy | val_loss | val_accuracy |
+-------+-------+----------+----------+--------------+
|   12  | 0.282 |  0.886   |  0.465   |    0.795     |
+-------+-------+----------+----------+--------------+
================================================================================17:59:22
{'step': 200, 'loss': 0.26, 'accuracy': 0.903}
{'step': 400, 'loss': 0.268, 'accuracy': 0.894}
{'step': 600, 'loss': 0.271, 'accuracy': 0.893}
{'step': 800, 'loss': 0.267, 'accuracy': 0.893}
{'step': 1000, 'loss': 0.268, 'accuracy': 0.892}

+-------+-------+----------+----------+--------------+
| epoch |  loss | accuracy | val_loss | val_accuracy |
+-------+-------+----------+----------+--------------+
|   13  | 0.268 |  0.892   |  0.472   |    0.794     |
+-------+-------+----------+----------+--------------+
================================================================================17:59:49
{'step': 200, 'loss': 0.252, 'accuracy': 0.903}
{'step': 400, 'loss': 0.25, 'accuracy': 0.905}
{'step': 600, 'loss': 0.251, 'accuracy': 0.903}
{'step': 800, 'loss': 0.253, 'accuracy': 0.9}
{'step': 1000, 'loss': 0.255, 'accuracy': 0.9}

+-------+-------+----------+----------+--------------+
| epoch |  loss | accuracy | val_loss | val_accuracy |
+-------+-------+----------+----------+--------------+
|   14  | 0.255 |   0.9    |  0.469   |    0.796     |
+-------+-------+----------+----------+--------------+
================================================================================18:00:16
{'step': 200, 'loss': 0.242, 'accuracy': 0.912}
{'step': 400, 'loss': 0.237, 'accuracy': 0.911}
{'step': 600, 'loss': 0.24, 'accuracy': 0.91}
{'step': 800, 'loss': 0.241, 'accuracy': 0.908}
{'step': 1000, 'loss': 0.242, 'accuracy': 0.906}

+-------+-------+----------+----------+--------------+
| epoch |  loss | accuracy | val_loss | val_accuracy |
+-------+-------+----------+----------+--------------+
|   15  | 0.242 |  0.906   |  0.475   |    0.797     |
+-------+-------+----------+----------+--------------+
================================================================================18:00:44
{'step': 200, 'loss': 0.218, 'accuracy': 0.921}
{'step': 400, 'loss': 0.223, 'accuracy': 0.916}
{'step': 600, 'loss': 0.229, 'accuracy': 0.912}
{'step': 800, 'loss': 0.229, 'accuracy': 0.913}
{'step': 1000, 'loss': 0.231, 'accuracy': 0.911}

+-------+-------+----------+----------+--------------+
| epoch |  loss | accuracy | val_loss | val_accuracy |
+-------+-------+----------+----------+--------------+
|   16  | 0.231 |  0.911   |  0.486   |    0.794     |
+-------+-------+----------+----------+--------------+
================================================================================18:01:12
{'step': 200, 'loss': 0.21, 'accuracy': 0.919}
{'step': 400, 'loss': 0.22, 'accuracy': 0.915}
{'step': 600, 'loss': 0.22, 'accuracy': 0.915}
{'step': 800, 'loss': 0.22, 'accuracy': 0.916}
{'step': 1000, 'loss': 0.22, 'accuracy': 0.916}

+-------+------+----------+----------+--------------+
| epoch | loss | accuracy | val_loss | val_accuracy |
+-------+------+----------+----------+--------------+
|   17  | 0.22 |  0.916   |  0.486   |    0.796     |
+-------+------+----------+----------+--------------+
================================================================================18:02:24
{'step': 200, 'loss': 0.206, 'accuracy': 0.927}
{'step': 400, 'loss': 0.21, 'accuracy': 0.923}
{'step': 600, 'loss': 0.21, 'accuracy': 0.924}
{'step': 800, 'loss': 0.213, 'accuracy': 0.922}
{'step': 1000, 'loss': 0.21, 'accuracy': 0.923}

+-------+------+----------+----------+--------------+
| epoch | loss | accuracy | val_loss | val_accuracy |
+-------+------+----------+----------+--------------+
|   18  | 0.21 |  0.923   |  0.493   |    0.796     |
+-------+------+----------+----------+--------------+
================================================================================18:02:53
{'step': 200, 'loss': 0.191, 'accuracy': 0.932}
{'step': 400, 'loss': 0.197, 'accuracy': 0.926}
{'step': 600, 'loss': 0.199, 'accuracy': 0.928}
{'step': 800, 'loss': 0.199, 'accuracy': 0.927}
{'step': 1000, 'loss': 0.2, 'accuracy': 0.927}

+-------+------+----------+----------+--------------+
| epoch | loss | accuracy | val_loss | val_accuracy |
+-------+------+----------+----------+--------------+
|   19  | 0.2  |  0.927   |   0.5    |    0.794     |
+-------+------+----------+----------+--------------+
================================================================================18:03:22
{'step': 200, 'loss': 0.19, 'accuracy': 0.934}
{'step': 400, 'loss': 0.192, 'accuracy': 0.931}
{'step': 600, 'loss': 0.195, 'accuracy': 0.929}
{'step': 800, 'loss': 0.194, 'accuracy': 0.93}
{'step': 1000, 'loss': 0.191, 'accuracy': 0.931}

+-------+-------+----------+----------+--------------+
| epoch |  loss | accuracy | val_loss | val_accuracy |
+-------+-------+----------+----------+--------------+
|   20  | 0.191 |  0.931   |  0.506   |    0.795     |
+-------+-------+----------+----------+--------------+
================================================================================18:03:58
Finished Training...
4. Evaluate the model

%matplotlib inline
%config InlineBackend.figure_format = 'svg'

import matplotlib.pyplot as plt

def plot_metric(dfhistory, metric):
    train_metrics = dfhistory[metric]
    val_metrics = dfhistory['val_' + metric]
    epochs = range(1, len(train_metrics) + 1)
    plt.plot(epochs, train_metrics, 'bo--')
    plt.plot(epochs, val_metrics, 'ro-')
    plt.title('Training and validation ' + metric)
    plt.xlabel("Epochs")
    plt.ylabel(metric)
    plt.legend(["train_" + metric, 'val_' + metric])
    plt.show()

plot_metric(dfhistory, "loss")

plot_metric(dfhistory, "accuracy")

# evaluation
model.evaluate(dl_test)
{'val_loss': 0.5056138457655907, 'val_accuracy': 0.7948000040054322}
5. Use the model

model.predict(dl_test)
tensor([[3.9803e-02],
[9.9295e-01],
[6.0493e-01],
[1.2023e-01],
[9.3701e-01],
[2.5752e-04]])
6. Save the model

Saving the parameters is the recommended way to save a PyTorch model.

print(model.state_dict().keys())
odict_keys(['embedding.weight', 'conv.conv_1.weight', 'conv.conv_1.bias',
'conv.conv_2.weight', 'conv.conv_2.bias', 'dense.linear.weight',
'dense.linear.bias'])
# save the model parameters
torch.save(model.state_dict(), "./data/model_parameter.pkl")

model_clone = Net()
model_clone.load_state_dict(torch.load("./data/model_parameter.pkl"))

model_clone.compile(loss_func=nn.BCELoss(),
                    optimizer=torch.optim.Adagrad(model_clone.parameters(), lr=0.02),
                    metrics_dict={"accuracy": accuracy})

# evaluate the model
model_clone.evaluate(dl_test)
{'val_loss': 0.5056138457655907, 'val_accuracy': 0.7948000040054322}
1-4, A Modeling Workflow Example for Time-Series Data

The COVID-19 epidemic of 2020 affected people's lives in many ways in every country.

For some students the impact was on income, for some on relationships, for some psychological, and for some on body weight.

Based on China's epidemic data before March 2020, this section builds a time-series RNN model to predict when China's COVID-19 epidemic would end.
import os
import datetime
import importlib
import torchkeras

# print a time bar
def printbar():
    nowtime = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    print("\n" + "=========="*8 + "%s" % nowtime)

# On macOS, running pytorch and matplotlib in jupyter at the same time
# requires changing this environment variable
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"
1. Prepare the data

The dataset in this section is taken from tushare; the way it was obtained follows this article: https://zhuanlan.zhihu.com/p/109556102

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

%matplotlib inline
%config InlineBackend.figure_format = 'svg'

df = pd.read_csv("./data/covid-19.csv", sep="\t")
df.plot(x="date", y=["confirmed_num", "cured_num", "dead_num"], figsize=(10, 6))
plt.xticks(rotation=60);
dfdata = df.set_index("date")
dfdiff = dfdata.diff(periods=1).dropna()
dfdiff = dfdiff.reset_index("date")

dfdiff.plot(x="date", y=["confirmed_num", "cured_num", "dead_num"], figsize=(10, 6))
plt.xticks(rotation=60)
dfdiff = dfdiff.drop("date", axis=1).astype("float32")

dfdiff.head()
Below we subclass torch.utils.data.Dataset to implement a custom time-series dataset.

torch.utils.data.Dataset is an abstract class. To load custom data you only need to subclass it and override two methods:

• __len__: implements len(dataset), returning the size of the whole dataset.
• __getitem__: fetches a sample by index, so that dataset[i] returns the i-th sample of the dataset.

Not overriding these two methods results in an error; a toy sketch of the contract follows.
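A minimal sketch of the two-method contract on a plain Python list (illustrative only, separate from the Covid19Dataset defined next):

from torch.utils.data import Dataset

class ListDataset(Dataset):  # made-up name, for illustration
    def __init__(self, data):
        self.data = data

    def __len__(self):
        return len(self.data)   # makes len(dataset) work

    def __getitem__(self, i):
        return self.data[i]     # makes dataset[i] work

ds = ListDataset([10, 20, 30])
print(len(ds), ds[1])  # prints: 3 20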
import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader, TensorDataset

# use the 8-day window before a given day as input to predict that day's numbers
WINDOW_SIZE = 8

class Covid19Dataset(Dataset):

    def __len__(self):
        return len(dfdiff) - WINDOW_SIZE

    def __getitem__(self, i):
        x = dfdiff.loc[i:i + WINDOW_SIZE - 1, :]
        feature = torch.tensor(x.values)
        y = dfdiff.loc[i + WINDOW_SIZE, :]
        label = torch.tensor(y.values)
        return (feature, label)

ds_train = Covid19Dataset()

# the data is small, so the whole training set can go into one batch to improve performance
dl_train = DataLoader(ds_train, batch_size=38)
2. Define the model

There are generally three ways to build a model with PyTorch: building it layer by layer with nn.Sequential, subclassing the nn.Module base class to build a custom model, or subclassing nn.Module and organizing the layers with model containers.

Here we choose the second approach.

Since the training loop that follows is class-style, we further wrap the model into torchkeras' Model class to get functionality similar to Keras' high-level model API.

The Model class actually inherits from nn.Module.
import torch
from torch import nn
import importlib
import torchkeras

torch.random.seed()

class Block(nn.Module):

    def __init__(self):
        super(Block, self).__init__()

    def forward(self, x, x_input):
        # daily increments cannot be negative, so clip the prediction at 0
        x_out = torch.max((1 + x) * x_input[:, -1, :], torch.tensor(0.0))
        return x_out

class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()

        # lstm with 5 stacked layers over the 3 input features
        self.lstm = nn.LSTM(input_size=3, hidden_size=3, num_layers=5, batch_first=True)
        self.linear = nn.Linear(3, 3)
        self.block = Block()

    def forward(self, x_input):
        x = self.lstm(x_input)[0][:, -1, :]
        x = self.linear(x)
        y = self.block(x, x_input)
        return y

net = Net()
model = torchkeras.Model(net)
print(model)

model.summary(input_shape=(8, 3), input_dtype=torch.FloatTensor)
Net(
(lstm): LSTM(3, 3, num_layers=5, batch_first=True)
(linear): Linear(in_features=3, out_features=3, bias=True)
(block): Block()
)
Layer (type)                Output Shape              Param #
==============================================================
LSTM-1                      [-1, 8, 3]                480
Linear-2                    [-1, 3]                   12
Block-3                     [-1, 3]                   0
==============================================================
Total params: 492
Trainable params: 492
Non-trainable params: 0
Input size (MB): 0.000092
Forward/backward pass size (MB): 0.000229
Params size (MB): 0.001877
Estimated Total Size (MB): 0.002197
3. Train the model

Training in PyTorch usually requires the user to write a custom training loop, and the code style of training loops varies from person to person.

There are three typical styles: a script-style loop, a function-style loop, and a class-style loop.

Here we show a class-style training loop.

Following Keras, we define a high-level model interface, Model, implementing fit, validate, predict and summary methods, which amounts to a user-defined high-level API.

Note: recurrent neural networks are relatively hard to tune; you may need to try several different learning rates to get good results (a sketch of such a sweep follows the training code below).
def mspe(y_pred, y_true):
    err_percent = (y_true - y_pred)**2 / (torch.max(y_true**2, torch.tensor(1e-7)))
    return torch.mean(err_percent)

model.compile(loss_func=mspe, optimizer=torch.optim.Adagrad(model.parameters(), lr=0.1))

dfhistory = model.fit(100, dl_train, log_step_freq=10)
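The learning-rate sweep mentioned above could look like this (a sketch under the assumption that refitting from scratch once per rate is acceptable; none of this is from the original):

# try several learning rates and keep the parameters of the best run
best_loss, best_state = float("inf"), None
for lr in [0.01, 0.05, 0.1]:
    candidate = torchkeras.Model(Net())
    candidate.compile(loss_func=mspe,
                      optimizer=torch.optim.Adagrad(candidate.parameters(), lr=lr))
    history = candidate.fit(100, dl_train, log_step_freq=10)
    final_loss = history["loss"].iloc[-1]
    if final_loss < best_loss:
        best_loss, best_state = final_loss, candidate.net.state_dict()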
4. Evaluate the model

Evaluating a model normally requires a validation or test set. Since this example has very little data, we only visualize how the loss function evolved on the training set.
%matplotlib inline
%config InlineBackend.figure_format = 'svg'

import matplotlib.pyplot as plt

def plot_metric(dfhistory, metric):
    train_metrics = dfhistory[metric]
    epochs = range(1, len(train_metrics) + 1)
    plt.plot(epochs, train_metrics, 'bo--')
    plt.title('Training ' + metric)
    plt.xlabel("Epochs")
    plt.ylabel(metric)
    plt.legend(["train_" + metric])
    plt.show()

plot_metric(dfhistory, "loss")
5. Use the model

Here we use the model to predict when the epidemic will end, i.e. the date on which the number of new confirmed cases drops to 0.

# use dfresult to record the existing data together with the predictions that follow
dfresult = dfdiff[["confirmed_num", "cured_num", "dead_num"]].copy()
dfresult.tail()
# predict the new daily numbers for the next 200 days and append the results to dfresult
for i in range(200):
    arr_input = torch.unsqueeze(torch.from_numpy(dfresult.values[-38:, :]), axis=0)
    arr_predict = model.forward(arr_input)

    dfpredict = pd.DataFrame(torch.floor(arr_predict).data.numpy(),
                             columns=dfresult.columns)
    dfresult = dfresult.append(dfpredict, ignore_index=True)

dfresult.query("confirmed_num==0").head()

# New confirmed cases drop to 0 from day 50 on. Day 45 corresponds to March 10,
# so that is 5 days later, i.e. new confirmed cases are predicted to reach 0 around March 15.
# Note: this prediction is optimistic.

dfresult.query("cured_num==0").head()

# Newly cured cases drop to 0 from day 132 on. Day 45 corresponds to March 10,
# so that is roughly 3 months later, i.e. everyone is predicted to be cured around June 10.
# Note: this prediction is pessimistic, and it has a problem: adding up the predicted
# daily cured counts would exceed the cumulative number of confirmed cases.

dfresult.query("dead_num==0").head()

# New deaths drop to 0 from day 50 on. Day 45 corresponds to March 10,
# so that is 5 days later, i.e. new deaths are predicted to reach 0 around March 15.
# Note: this prediction is optimistic.
6. Save the model

Saving the parameters is the recommended way to save a PyTorch model.

print(model.net.state_dict().keys())

# save the model parameters
torch.save(model.net.state_dict(), "./data/model_parameter.pkl")

net_clone = Net()
net_clone.load_state_dict(torch.load("./data/model_parameter.pkl"))
model_clone = torchkeras.Model(net_clone)
model_clone.compile(loss_func=mspe)

# evaluate the model
model_clone.evaluate(dl_train)
{'val_loss': 4.254558563232422}