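I. Three main factors that affect training results:
- Optimizer: for example Adam or SGD;
- Weight initialization: how the weights W are initialized, for example random initialization or sampling from a normal distribution;
- Learning rate: typically a small value such as 1e-6 or 1e-4.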
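To make the three factors concrete, here is a minimal sketch of where each one appears in a PyTorch training loop. The tiny `torch.nn.Linear` model, the tensor shapes, the `std=0.01` value, and the 100-step loop are all assumptions chosen for illustration; they are not taken from the notes.

```python
import torch

torch.manual_seed(0)  # fix the seed so the random initialization is repeatable

# A tiny linear model, used only to show where each factor enters (illustrative).
model = torch.nn.Linear(4, 2)

# Factor 2: weight initialization (here a normal distribution; std is an arbitrary choice).
torch.nn.init.normal_(model.weight, mean=0.0, std=0.01)
torch.nn.init.zeros_(model.bias)

# Factors 1 and 3: the optimizer and its learning rate.
learning_rate = 1e-4
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
# Alternative optimizer: torch.optim.SGD(model.parameters(), lr=learning_rate)

# Random toy data, only for demonstration.
x = torch.randn(8, 4)
y = torch.randn(8, 2)

loss_before = torch.nn.functional.mse_loss(model(x), y).item()
for _ in range(100):
    optimizer.zero_grad()                               # clear old gradients
    loss = torch.nn.functional.mse_loss(model(x), y)    # forward pass and loss
    loss.backward()                                     # backward pass
    optimizer.step()                                    # parameter update
loss_after = torch.nn.functional.mse_loss(model(x), y).item()

print(loss_before, loss_after)
```

Changing any one of the three (swapping Adam for SGD, changing the initialization, or changing `learning_rate`) while keeping the other two fixed is a simple way to see how much each factor moves the final loss.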
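II. Building a neural network in PyTorch, from manual to automatic
1. PyTorch: Tensors
Build a feed-forward neural network by hand: compute the forward pass, the loss, and the backward pass manually, using only tensor operations.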
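As a sketch of the math behind a manual backward pass, assume a two-layer network with a ReLU activation and a sum-of-squares loss, which is what the code in this section uses. Here X, Y, W_1, W_2 stand for `x`, `y`, `w1`, `w2`, and \hat{Y} for `y_pred`; the symbols L, \odot (element-wise product), and \mathbf{1}[\cdot] (indicator) are notation introduced here for illustration. The gradients then follow from the chain rule:

```latex
\begin{aligned}
H &= X W_1, \qquad H_{\mathrm{relu}} = \max(H, 0), \qquad \hat{Y} = H_{\mathrm{relu}} W_2, \\
L &= \sum_{i,j} \bigl(\hat{Y}_{ij} - Y_{ij}\bigr)^2, \\
\frac{\partial L}{\partial \hat{Y}} &= 2\,(\hat{Y} - Y), \\
\frac{\partial L}{\partial W_2} &= H_{\mathrm{relu}}^{\top}\, \frac{\partial L}{\partial \hat{Y}}, \\
\frac{\partial L}{\partial H_{\mathrm{relu}}} &= \frac{\partial L}{\partial \hat{Y}}\, W_2^{\top}, \\
\frac{\partial L}{\partial H} &= \frac{\partial L}{\partial H_{\mathrm{relu}}} \odot \mathbf{1}[H > 0], \\
\frac{\partial L}{\partial W_1} &= X^{\top}\, \frac{\partial L}{\partial H}.
\end{aligned}
```

Each matrix product is arranged so that the gradient has the same shape as the quantity it differentiates, which is a quick sanity check when writing the corresponding tensor code.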
import torch

N, D_in, H, D_out = 64, 1000, 100, 10  # batch size, input dim, hidden dim, output dim

# Randomly create some training data
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

# Randomly initialize the weights from a standard normal distribution
w1 = torch.randn(D_in, H)
w2 = torch.randn(H, D_out)

learning_rate = 1e-6
for it in range(500):
    # Forward pass
    h = x.mm(w1)  # N * H
    h_relu = h.clamp(min=0)  # ReLU activation, N * H
    y_pred = h_relu.mm(w2)  # N * D_out

    # Compute the loss (sum of squared errors)
    loss = (y_pred - y).pow(2).sum().item()
    print(it, loss)

    # Backward pass
    # Compute the gradients
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.t().mm(grad_y_pred)
    grad_h_relu = gr