nn.Sequential, torch.nn.Module, and torch.autograd.Function

jk英菲尼迪

Original link: https://blog.csdn.net/qq_27825451/article/details/90705328

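Based on: https://blog.csdn.net/qq_27825451/category_8866856.html

For simple models, the torch.nn.Sequential class can be used to build a model whose layers are connected one after another. Sequential is itself a subclass of Module.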

https://blog.csdn.net/qq_27825451/article/details/90551513
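
As a minimal sketch of the Sequential approach described above: the layer sizes here are illustrative assumptions, not values taken from the article.

```python
import torch
import torch.nn as nn

# Layer sizes are illustrative assumptions, not values from the article.
seq_model = nn.Sequential(
    nn.Linear(5, 8),
    nn.ReLU(),
    nn.Linear(8, 3),
)

x = torch.randn(10, 5)   # a batch of 10 samples with 5 features each
out = seq_model(x)       # the layers are applied in the order they were listed

print(out.shape)                          # torch.Size([10, 3])
print(isinstance(seq_model, nn.Module))   # True
```

Because the layers run strictly in order, this style suits plain feed-forward stacks; anything with branching or multiple inputs needs a custom Module.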

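For more complex models, you need to define your own model or custom layers.

PyTorch draws no sharp distinction between a Layer and a Module: custom layers, custom blocks, and custom models are all written by subclassing Module. This point is important. The Sequential class also inherits from Module.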

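There are two ways to implement a neural network in torch:

(1) High-level API: use the classes in torch.nn.****;

(2) Low-level API: use the functions in torch.nn.functional.****.

The high-level API is recommended, for the following reason:

The high-level API wraps each operation in a class, and a class can store parameters: the weight matrix and bias of a fully connected layer, for example, are kept as attributes of the layer object. The low-level API only performs the computation and holds no state, so the caller has to create, pass in, and keep track of the parameters. The high-level API is nevertheless built on the low-level functions. For example:

Linear layer (high-level) -> F.linear() function (low-level)

Conv2d layer (high-level) -> F.conv2d() function (low-level)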
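
The relationship between the two APIs can be checked directly. The sketch below assumes arbitrary shapes (4 samples, 5 inputs, 3 outputs) and shows that an nn.Linear layer gives the same result as F.linear called with that layer's own weight and bias.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(4, 5)   # shapes are illustrative assumptions

# High-level API: the layer object owns its weight and bias.
linear = nn.Linear(5, 3)
y_layer = linear(x)

# Low-level API: the same computation, but the caller supplies the tensors.
y_func = F.linear(x, linear.weight, linear.bias)

same = torch.allclose(y_layer, y_func)
param_names = [name for name, _ in linear.named_parameters()]
print(same)          # True
print(param_names)   # ['weight', 'bias']
```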

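Steps for defining a custom layer

Implementing a custom layer involves the following main steps:

(1) Define a class that inherits from Module and provides two methods: the constructor __init__ and the forward-computation method forward.

(2) In the constructor __init__, define the layer's parameters, such as in_channels and out_channels for a Conv2d layer.

(3) In forward, implement the forward computation. As long as forward in a subclass of nn.Module is built from differentiable torch operations, autograd derives the backward pass automatically, so no backward() has to be written.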

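Note: 1. The constructor is mainly for defining parameters; forward expresses the layer's forward-propagation logic.

2. forward() defines the forward pass for a batch of data, not for a single sample.

3. The parameters we define are normally differentiable. If a custom operation is not differentiable, however, you have to implement the backward function yourself, which is done by subclassing torch.autograd.Function.

Summary: this is the same as defining a custom model. The core in both cases is implementing the constructor __init__ and the forward method forward.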
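
For the case in note 3, a hand-written backward can be supplied through torch.autograd.Function. The sketch below is a hypothetical example rather than code from the article: rounding has a zero gradient almost everywhere, so the backward method passes the incoming gradient through unchanged (a straight-through estimator, chosen here only for illustration). The names RoundSTE and RoundLayer are made up.

```python
import torch


class RoundSTE(torch.autograd.Function):
    """Hypothetical op: round in forward, identity gradient in backward."""

    @staticmethod
    def forward(ctx, x):
        return torch.round(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through estimator: treat round() as the identity for gradients.
        return grad_output


class RoundLayer(torch.nn.Module):
    """Wraps the custom Function so it can be used like any other layer."""

    def forward(self, x):
        return RoundSTE.apply(x)


x = torch.tensor([0.2, 1.7, -0.6], requires_grad=True)
y = RoundLayer()(x)
y.sum().backward()

print(y.detach())   # rounded values: 0., 2., -1.
print(x.grad)       # all ones, because backward returns grad_output unchanged
```

The Function is called through its apply method, never by instantiating it, and wrapping it in a Module keeps the rest of the model code unchanged.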

2. A simple custom layer example

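A simple layer intended to compute y = w*sqrt(x^2+bias). Note that the forward method as written omits the square root and computes y = (x^2 + bias) · w.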

2.1.1 Define a custom layer, MyLayer

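# Save this as my_layer.py
import torch

class MyLayer(torch.nn.Module):
    '''
    Intended function: y = weights*sqrt(x^2+bias). The forward method below omits
    the square root and computes (x^2 + bias) @ weight. The layer has two parameters:
    a weight matrix, weight
    a bias vector, bias
    Input x has shape (N, in_features)
    Output y has shape (N, out_features), therefore
    bias has shape (in_features,), not (out_features,) as in Linear, because it is added to the input before the matrix product
    weight has shape (in_features, out_features), not (out_features, in_features) as in Linear, because forward computes input @ weight without a transpose
    '''
    def __init__(self, in_features, out_features, bias=True):
        super(MyLayer, self).__init__()  # as with a custom model, first call the parent constructor
        self.in_features = in_features
        self.out_features = out_features
        # weight is trainable, so it is wrapped in Parameter.
        # torch.Tensor(...) would leave the memory uninitialized, so sample from a normal distribution instead.
        self.weight = torch.nn.Parameter(torch.randn(in_features, out_features))
        if bias:
            self.bias = torch.nn.Parameter(torch.randn(in_features))  # bias is trainable too
        else:
            self.register_parameter('bias', None)

    def forward(self, input):
        input_ = torch.pow(input, 2)
        if self.bias is not None:  # bias=False registers None, which cannot be added
            input_ = input_ + self.bias
        y = torch.matmul(input_, self.weight)
        return y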

2.1.2 Define a custom model and train it

import torch
from my_layer import MyLayer # the custom layer


N, D_in, D_out = 10, 5, 3  # 10 samples, 5 input features, 3 output features

# First define a model
class MyNet(torch.nn.Module):
    def __init__(self):
        super(MyNet, self).__init__()  # first statement: call the parent constructor
        self.mylayer1 = MyLayer(D_in,D_out)

    def forward(self, x):
        x = self.mylayer1(x)

        return x

model = MyNet()
print(model)
'''Output:
MyNet(
  (mylayer1): MyLayer()   # this is the custom layer defined above
)
'''

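Note: a layer without trainable parameters that is not created in the constructor does not appear when the model is printed; its computation is performed inside forward through the functional API instead.

Summary: every layer created in the constructor __init__ is an "inherent attribute" of the model.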
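
The difference can be seen by comparing two otherwise identical models. In this sketch (class names and layer sizes are hypothetical), one model applies ReLU through the functional API inside forward, while the other creates an nn.ReLU in the constructor; only the second lists the activation among its children.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FunctionalNet(nn.Module):   # hypothetical name
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(5, 3)   # registered: shows up in print(model)

    def forward(self, x):
        return F.relu(self.fc(x))   # ReLU has no parameters and is not registered


class ModuleNet(nn.Module):       # hypothetical name
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(5, 3)
        self.act = nn.ReLU()        # registered as a child module

    def forward(self, x):
        return self.act(self.fc(x))


func_children = [name for name, _ in FunctionalNet().named_children()]
mod_children = [name for name, _ in ModuleNet().named_children()]
print(func_children)   # ['fc']
print(mod_children)    # ['fc', 'act']
```

Both models compute the same function; the choice only affects what is visible when the model is printed or traversed.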

2.1.3 Train the model

# Create the input and target data
x = torch.randn(N, D_in)  # (10, 5)
y = torch.randn(N, D_out) # (10, 3)


# Define the loss function
loss_fn = torch.nn.MSELoss(reduction='sum')

learning_rate = 1e-4
# Construct an optimizer object
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

for t in range(10):

    # Step 1: forward pass, compute the prediction y_pred
    y_pred = model(x)

    # Step 2: compute the error between the prediction y_pred and the target
    loss = loss_fn(y_pred, y)
    print(f"epoch {t}, loss = {loss.item()}")

    # Zero the model's gradients before the backward pass
    optimizer.zero_grad()

    # Step 3: backpropagate the error
    loss.backward()

    # Use the gradients to update all trainable parameters of the network in one step
    optimizer.step()

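2.2 A second example. Use a custom layer to implement the following: the inputs are two $N$-dimensional vectors $x_{1}$ and $x_{2}$, the parameter is an $N\times N$ matrix $M$, and the output is $label=\mathrm{sigmoid}(x_{1}^{T} M x_{2})$.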

#-*-coding:utf-8-*-
"""
@author:taoshouzheng
@time:2019/7/1 10:37
@email:tsz1216@sina.com
"""

import torch
import torch.nn as nn
import numpy as np
import math
from torch import optim
import torch.utils.data as Data


# Define the DisMult layer
class DisMult(nn.Module):
	def __init__(self, emb_size):

		# A subclass of nn.Module must call the parent constructor in its own constructor
		# The line below is equivalent to nn.Module.__init__(self)
		super(DisMult, self).__init__()
		# Dimension of the latent features
		self.emb_size = emb_size
		# Relation-specific square matrix
		# self.weights = nn.Parameter(torch.Tensor(emb_size, emb_size), requires_grad=requires_grad)
		self.weights = nn.Parameter(torch.Tensor(emb_size, emb_size))
		# Initialize the parameters
		self.reset_parameters()

	# Initialize the parameters
	def reset_parameters(self):
		stdv = 1. / math.sqrt(self.weights.size(0))
		self.weights.data.uniform_(-stdv, stdv)

	# Forward pass: note that this defines the logic for a whole batch
	def forward(self, input1, input2):
		result = []
		for i in range(input1.size(0)):
			result_i = input1[i] @ self.weights @ input2[i]
			result.append(result_i)

		# torch.stack keeps the autograd graph. Building a new tensor with
		# torch.DoubleTensor(result) would detach the output from self.weights,
		# so no gradient would ever reach the parameters.
		return torch.sigmoid(torch.stack(result))


# Define the DisMult model
class DisMultModel(nn.Module):
	def __init__(self, hidden_dim):
		super(DisMultModel, self).__init__()
		self.hidden_dim = hidden_dim
		# Add one DisMult layer to the model
		self.dismult = DisMult(hidden_dim)

	def forward(self, x_a, x_b):
		return self.dismult(x_a, x_b)


# Build the network
model = DisMultModel(hidden_dim=50)

# Print the initial parameters
for item in model.named_parameters():
	print(item)

# Define the loss function and the optimizer
criterion = nn.BCELoss(reduction='sum')		# binary cross-entropy loss
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)		# optimizer

# Train the network: load data, forward pass + backward pass, update parameters

# Hyperparameters
EPOCH = 100		# number of epochs
BATCH_SIZE = 100		# batch size


input_a = []
input_b = []
label = []

with open(r'E:\Experiment\数据预处理\仿真数据\relation_1_train.csv', 'r', encoding='utf8') as f:
	contents = f.readlines()
	for content in contents:
		content = content.strip().split('\t')
		# inputs.append([content[0].strip().split(' '), content[1].strip().split(' ')])
		input_a.append(content[0].strip().split(' '))
		input_b.append(content[1].strip().split(' '))
		label.append(content[2])

# numpy arrays (np.float and np.long were removed from recent NumPy releases)
inputs = np.array([input_a, input_b], dtype=np.float64)		# real-valued

label = np.array(label, dtype=np.int64)

# Tensor
inputs = torch.FloatTensor(inputs)

# BCELoss expects the target to have the same dtype as the model output (float32)
label = torch.tensor(label, dtype=torch.float32)

# First wrap the tensors in a Dataset that torch understands
torch_dataset = Data.TensorDataset(inputs[0], inputs[1], label)

# Put the dataset into a DataLoader
loader = Data.DataLoader(dataset=torch_dataset, batch_size=BATCH_SIZE, shuffle=True)

# Training
for epoch in range(EPOCH):

	# At each step the loader yields one mini-batch for training
	# (wrapping tensors in Variable is no longer needed)
	for step, (batch_inputs_1, batch_inputs_2, batch_label) in enumerate(loader):

		# Zero the gradients
		optimizer.zero_grad()

		# Forward pass + backward pass
		output = model(batch_inputs_1, batch_inputs_2)		# compute the output

		loss = criterion(output, batch_label)		# compute the loss

		print('Epoch', epoch + 1,  'Step', step + 1, ':', loss.item())

		loss.backward()		# backward pass

		# Update the parameters
		optimizer.step()

# Print the parameters after training
for item in model.named_parameters():
	print(item)
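
The per-sample loop in the forward method can also be written as a single batched expression. The sketch below uses small random tensors (the sizes B and N are illustrative assumptions) to check that the two forms agree and that gradients reach the matrix M.

```python
import torch

torch.manual_seed(0)
B, N = 4, 6   # illustrative batch size and embedding size
x1, x2 = torch.randn(B, N), torch.randn(B, N)
M = torch.randn(N, N, requires_grad=True)

# Per-sample loop: one scalar x1[i] @ M @ x2[i] for each row of the batch.
loop_out = torch.stack([x1[i] @ M @ x2[i] for i in range(B)])

# Batched form: (x1 @ M) has shape (B, N); the row-wise dot product
# with x2 then reduces it to shape (B,).
batched_out = ((x1 @ M) * x2).sum(dim=1)

agree = torch.allclose(loop_out, batched_out, atol=1e-5)
print(agree)

# Gradients flow back to M because no new tensor is constructed along the way.
torch.sigmoid(batched_out).sum().backward()
print(M.grad.shape)
```

The batched form avoids a Python-level loop over the batch, which usually matters once the batch size grows.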

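Summary:

PyTorch draws no sharp distinction between a Layer and a Module. Custom layers, custom blocks, and custom models are all written by subclassing Module; essentially every custom operation in PyTorch is implemented by inheriting from nn.Module.

One point deserves special attention, though: Sequential inherits from Module and shares much with it, but the two also differ in several ways, chiefly the following.

Sequential implements integer indexing, so model[index] returns a layer. Module does not implement integer indexing, so its layers cannot be fetched by index. Instead, it provides several methods for traversing them: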

def children(self):
 
def named_children(self):
 
def modules(self):
 
def named_modules(self, memo=None, prefix=''):
 
'''
Note: each of these methods returns an iterator, so the results are accessed with a for loop, or with next().
'''
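
A short sketch of these traversal methods on a small, hypothetical model (the class Net and its layer sizes are made up for illustration):

```python
import torch.nn as nn


class Net(nn.Module):   # hypothetical model for illustration
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(5, 8), nn.ReLU())
        self.head = nn.Linear(8, 3)

    def forward(self, x):
        return self.head(self.features(x))


net = Net()

# named_children() yields only the direct children.
child_names = [name for name, _ in net.named_children()]
# named_modules() walks the whole tree, starting with the model itself (name '').
module_names = [name for name, _ in net.named_modules()]
print(child_names)    # ['features', 'head']
print(module_names)   # ['', 'features', 'features.0', 'features.1', 'head']

first_child = next(net.children())   # the iterators also work with next()
print(first_child[0])                # the Sequential child supports integer indexing
```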

Reference: https://blog.csdn.net/qq_27825451/article/details/90550890
Original link: https://blog.csdn.net/tszupup/article/details/95452976
Original link: https://blog.csdn.net/qq_27825451/article/details/90705328
