Reference: https://pytorch.org/tutorials/beginner/pytorch_with_examples.html
PyTorch: nn
The nn package in PyTorch defines a set of Modules, which are roughly equivalent to neural network layers. A Module receives input tensors and computes output tensors, and may also hold internal state in the form of learnable tensors.
import torch

N, D_in, H, D_out = 64, 1000, 100, 10

# Create random input and target data
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

# Define the model as a sequence of layers; nn.Sequential applies
# each contained Module in order to produce its output
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)

# The nn package also contains common loss functions; here, summed
# Mean Squared Error
loss_fn = torch.nn.MSELoss(reduction='sum')

learning_rate = 1e-4
for t in range(500):
    # Forward pass: compute predicted y
    y_pred = model(x)

    # Compute and print loss; MSELoss expects (input, target)
    loss = loss_fn(y_pred, y)
    print(t, loss.item())

    # Zero the gradients before running the backward pass
    model.zero_grad()
    loss.backward()

    # Update the weights with gradient descent; wrap in torch.no_grad()
    # because the parameters require grad but we do not want autograd
    # to track this update
    with torch.no_grad():
        for param in model.parameters():
            param -= learning_rate * param.grad
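To make the "internal state as learnable tensors" point concrete, here is a small sketch (not part of the original tutorial) that inspects the parameters registered by the Sequential model above:

# Each nn.Linear holds a weight matrix and a bias vector as learnable
# tensors; named_parameters() iterates over all of them.
for name, param in model.named_parameters():
    print(name, tuple(param.shape), param.requires_grad)
# Expected output (names follow Sequential's child indices;
# index 1, the ReLU, has no parameters):
#   0.weight (100, 1000) True
#   0.bias   (100,)      True
#   2.weight (10, 100)   True
#   2.bias   (10,)       True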
PyTorch: Optim
So far we have updated the model's parameters by manually mutating the tensors that hold them (inside torch.no_grad()). For simple optimization algorithms like stochastic gradient descent this is not much of a burden, but in practice neural networks are usually trained with more sophisticated optimizers such as AdaGrad, RMSProp, and Adam. The optim package abstracts away these common optimization algorithms:
import torch

N, D_in, H, D_out = 64, 1000, 100, 10

x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)
loss_fn = torch.nn.MSELoss(reduction='sum')

# Use the optim package to define an Optimizer that updates the
# model's weights for us; here we use Adam
learning_rate = 1e-4
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

for t in range(500):
    y_pred = model(x)
    loss = loss_fn(y_pred, y)
    print(t, loss.item())

    # Zero the gradients of all tensors the optimizer will update,
    # backprop, then take an optimization step
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
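Swapping optimizers is a one-line change. As a sketch (the learning rates here are illustrative, not from the tutorial), any of the algorithms mentioned above drops into the same training loop:

# Pick one of these as a replacement for Adam above:
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
optimizer = torch.optim.Adagrad(model.parameters(), lr=1e-2)
optimizer = torch.optim.RMSprop(model.parameters(), lr=1e-3)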
PyTorch: Custom nn Modules
Sometimes you will want a model more specific than what the existing Modules can express. In those cases you can define your own Module by subclassing nn.Module and defining a forward function that receives input tensors and produces output tensors.
import torch

class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        # Instantiate two nn.Linear modules and assign them as member
        # variables; the assignment registers them as submodules
        super(TwoLayerNet, self).__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.linear2 = torch.nn.Linear(H, D_out)

    def forward(self, x):
        # clamp(min=0) applies ReLU functionally to the first
        # layer's output
        h_relu = self.linear1(x).clamp(min=0)
        y_pred = self.linear2(h_relu)
        return y_pred

N, D_in, H, D_out = 64, 1000, 100, 10

x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

model = TwoLayerNet(D_in, H, D_out)

# Construct the loss function and an optimizer; model.parameters()
# includes the parameters of both Linear submodules
criterion = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)

for t in range(500):
    y_pred = model(x)

    # Compute and print loss
    loss = criterion(y_pred, y)
    print(t, loss.item())

    # Zero gradients, run the backward pass, update the weights
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
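A custom Module behaves like any built-in one; for instance, it can itself serve as a layer inside nn.Sequential. This composition is a hypothetical aside, not part of the tutorial:

stacked = torch.nn.Sequential(
    TwoLayerNet(D_in, H, D_out),
    torch.nn.ReLU(),
    torch.nn.Linear(D_out, D_out),
)
print(stacked(x).shape)  # torch.Size([64, 10])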
PyTorch: Control Flow + Weight Sharing
As an example of dynamic graphs and weight sharing, the following model reuses the same middle Linear layer a random number of times on each forward pass:
import random
import torch

class DynamicNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        super(DynamicNet, self).__init__()
        self.input_linear = torch.nn.Linear(D_in, H)
        self.middle_linear = torch.nn.Linear(H, H)
        self.output_linear = torch.nn.Linear(H, D_out)

    def forward(self, x):
        # Reuse the middle_linear Module several times to compute the
        # hidden representations. Because each forward pass builds a
        # fresh dynamic computation graph, we can use ordinary Python
        # control flow (loops, conditionals) when defining the forward
        # pass; reusing the same Module shares its weights across all
        # of those applications.
        h_relu = self.input_linear(x).clamp(min=0)
        for _ in range(random.randint(0, 3)):
            h_relu = self.middle_linear(h_relu).clamp(min=0)
        y_pred = self.output_linear(h_relu)
        return y_pred
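The section stops at the model definition. As a sketch in the style of the earlier sections, a training loop for DynamicNet could look like the following; the original tutorial trains this model with SGD plus momentum because the random control flow makes it hard to train with vanilla SGD, and the hyperparameters below follow that pattern but should be treated as illustrative:

N, D_in, H, D_out = 64, 1000, 100, 10
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

model = DynamicNet(D_in, H, D_out)
criterion = torch.nn.MSELoss(reduction='sum')
# Momentum helps smooth updates across the randomly varying depth
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)

for t in range(500):
    y_pred = model(x)
    loss = criterion(y_pred, y)
    print(t, loss.item())

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()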