PyTorch Learning Notes: Training a Neural Network
1、Typical steps for training a neural network
A typical training procedure for a neural network is as follows:
- Define the neural network that has some learnable parameters (or weights)
- Iterate over a dataset of inputs
- Process input through the network
- Compute the loss (how far is the output from being correct)
- Propagate gradients back into the network's parameters
- Update the weights of the network, typically using a simple update rule: weight = weight - learning_rate * gradient
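As a minimal sketch tying these six steps together (using a hypothetical one-layer model and random data, not the real network defined in section 2 below):
import torch
import torch.nn as nn

model = nn.Linear(4, 2)                  # step 1: a toy network with learnable parameters
x, target = torch.randn(8, 4), torch.randn(8, 2)
criterion = nn.MSELoss()
lr = 0.01

for _ in range(3):                       # step 2: iterate over (dummy) inputs
    output = model(x)                    # step 3: process input through the network
    loss = criterion(output, target)     # step 4: compute the loss
    model.zero_grad()                    # clear old gradients first
    loss.backward()                      # step 5: propagate gradients back
    with torch.no_grad():                # step 6: weight = weight - learning_rate * gradient
        for p in model.parameters():
            p -= lr * p.grad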
2、Define the network
The network is built mainly with the torch.nn package.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # convolution layers: (in_channels, out_channels, kernel_size)
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # fully connected layers: (in_features, out_features)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # convolution -> ReLU activation -> max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # if the pooling window is square, a single number is enough
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

    def num_flat_features(self, x):
        size = x.size()[1:]  # all dimensions except the batch dimension
        num_features = 1
        for s in size:
            num_features *= s
        return num_features

net = Net()
print(net)
Only the forward function needs to be defined; the backward function (which computes gradients) is defined automatically by autograd. Any Tensor operation can be used inside the forward function. The model's learnable parameters are returned by net.parameters().
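For instance, the parameter list can be inspected like this (the sizes follow from the layer definitions above):
params = list(net.parameters())
print(len(params))       # 10: a weight and a bias for each of the 5 layers
print(params[0].size())  # conv1's weight: torch.Size([6, 1, 5, 5])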
# Zero the gradient buffers of all parameters, then backprop with random gradients
# (input and out are as in the original tutorial; this net expects a 32x32 input):
input = torch.randn(1, 1, 32, 32)
out = net(input)
net.zero_grad()
out.backward(torch.randn(1, 10))
The original tutorial has many more fine-grained explanations, which are not repeated here:
https://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html
This blogger's translation is quite detailed:
http://blog.sina.com.cn/s/blog_a99f842a0102y1e4.html
3、Loss function
A simple loss is nn.MSELoss, which computes the mean squared error between the network's output and the target:
output = net(input)
target = torch.randn(10)     # a dummy target, for example
target = target.view(1, -1)  # make it the same shape as the output
criterion = nn.MSELoss()
loss = criterion(output, target)
print(loss)
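Following loss backwards via its .grad_fn attribute shows the graph of computations, as in the original tutorial (the printed class names vary across PyTorch versions):
print(loss.grad_fn)                                            # MSELoss
print(loss.grad_fn.next_functions[0][0])                       # Linear (fc3)
print(loss.grad_fn.next_functions[0][0].next_functions[0][0])  # ReLU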
4、Backprop
Backpropagating the error takes a single call: loss.backward().
1. First, clear the existing gradients; otherwise the new gradients will be accumulated into the existing ones (a small demonstration of this accumulation follows the code below).
2. Then call loss.backward(), and look at conv1's bias gradients before and after the backward pass.
net.zero_grad()  # zero the gradient buffers of all parameters
print('conv1.bias.grad before backward')  # gradients before the backward pass
print(net.conv1.bias.grad)  # conv1's bias gradients (zeroed at this point)
loss.backward()
print('conv1.bias.grad after backward')   # gradients after the backward pass
print(net.conv1.bias.grad)
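As a standalone sketch (not part of the tutorial code) of why the buffers must be cleared, note that repeated backward() calls add into .grad rather than overwrite it:
import torch

w = torch.ones(1, requires_grad=True)
(w * 2).backward()
print(w.grad)    # tensor([2.])
(w * 2).backward()
print(w.grad)    # tensor([4.]) -- accumulated, not overwritten
w.grad.zero_()   # zero_grad() does this for every parameter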
5、Update the weights (optimizer)
The most common update rule is stochastic gradient descent (SGD): weight = weight - learning_rate * gradient.
PyTorch's torch.optim package implements various update rules, such as SGD, Nesterov-SGD, Adam, RMSProp, and so on. Using the torch.optim package:
import torch.optim as optim

# create your optimizer
optimizer = optim.SGD(net.parameters(), lr=0.01)

# in your training loop:
optimizer.zero_grad()  # zero the gradient buffers
output = net(input)
loss = criterion(output, target)
loss.backward()
optimizer.step()  # apply the update to the network's weights
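Swapping in a different update rule only changes the constructor line; the training loop stays the same (the hyperparameter values below are just illustrative):
optimizer = optim.SGD(net.parameters(), lr=0.01, momentum=0.9)  # SGD with momentum
optimizer = optim.Adam(net.parameters(), lr=0.001)              # Adam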
6、Training a classifier
The PyTorch website walks through a ten-class image classification example (CIFAR-10). The accuracy after running the whole model is not high; I will explore ways to improve it in a separate post.