PyTorch Learning Notes: Training a Neural Network
1、Typical steps for training a neural network
A typical training procedure for a neural network is as follows:
- Define the neural network that has some learnable parameters (or weights)
- Iterate over a dataset of inputs
- Process input through the network
- Compute the loss (how far is the output from being correct)
- Propagate gradients back into the network's parameters
- Update the weights of the network, typically using a simple update rule: weight = weight - learning_rate * gradient
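As a minimal sketch tying these six steps together (using a hypothetical one-layer model and random data, not the real network defined in section 2 below):
import torch
import torch.nn as nn

model = nn.Linear(4, 2)                  # step 1: a toy network with learnable parameters
x, target = torch.randn(8, 4), torch.randn(8, 2)
criterion = nn.MSELoss()
lr = 0.01

for _ in range(3):                       # step 2: iterate over (dummy) inputs
    output = model(x)                    # step 3: process input through the network
    loss = criterion(output, target)     # step 4: compute the loss
    model.zero_grad()                    # clear old gradients first
    loss.backward()                      # step 5: propagate gradients back
    with torch.no_grad():                # step 6: weight = weight - learning_rate * gradient
        for p in model.parameters():
            p -= lr * p.grad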
2、Define the network
The network is built mainly with the torch.nn package.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # convolution layers: (in_channels, out_channels, kernel_size)
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # fully connected layers: (in_features, out_features)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # convolution -> ReLU activation -> max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # if the pooling window is square, a single number is enough
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

    def num_flat_features(self, x):
        size = x.size()[1:]  # all dimensions except the batch dimension
        num_features = 1
        for s in size:
            num_features *= s
        return num_features

net = Net()
print(net)
Only the forward function needs to be defined; the backward function (which computes gradients) is defined automatically by autograd. Any Tensor operation can be used inside the forward function. The model's learnable parameters are returned by net.parameters().
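For instance, the parameter list can be inspected like this (the sizes follow from the layer definitions above):
params = list(net.parameters())
print(len(params))       # 10: a weight and a bias for each of the 5 layers
print(params[0].size())  # conv1's weight: torch.Size([6, 1, 5, 5])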
# Zero the gradient buffers of all parameters, then backprop with random gradients
# (input and out are as in the original tutorial; this net expects a 32x32 input):
input = torch.randn(1, 1, 32, 32)
out = net(input)
net.zero_grad()
out.backward(torch.randn(1, 10))
The original tutorial has many more fine-grained explanations, which are not repeated here:
https://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html
This blogger's translation is quite detailed:
http://blog.sina.com.cn/s/blog_a99f842a0102y1e4.html
3、Loss function
A simple loss is nn.MSELoss, which computes the mean squared error between the network's output and the target:
output = net(input)
target = torch.randn(10)     # a dummy target, for example
target = target.view(1, -1)  # make it the same shape as the output
criterion = nn.MSELoss()
loss = criterion(output, target)
print(loss)
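Following loss backwards via its .grad_fn attribute shows the graph of computations, as in the original tutorial (the printed class names vary across PyTorch versions):
print(loss.grad_fn)                                            # MSELoss
print(loss.grad_fn.next_functions[0][0])                       # Linear (fc3)
print(loss.grad_fn.next_functions[0][0].next_functions[0][0])  # ReLU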
4、Backprop
Backpropagating the error takes a single call: loss.backward().
1. First, clear the existing gradients; otherwise the new gradients will be accumulated into the existing ones (a small demonstration of this accumulation follows the code below).
2. Then call loss.backward(), and look at conv1's bias gradients before and after the backward pass.
net.zero_grad()  # zero the gradient buffers of all parameters
print('conv1.bias.grad before backward')  # gradients before the backward pass
print(net.conv1.bias.grad)  # conv1's bias gradients (zeroed at this point)
loss.backward()
print('conv1.bias.grad after backward')   # gradients after the backward pass
print(net.conv1.bias.grad)
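As a standalone sketch (not part of the tutorial code) of why the buffers must be cleared, note that repeated backward() calls add into .grad rather than overwrite it:
import torch

w = torch.ones(1, requires_grad=True)
(w * 2).backward()
print(w.grad)    # tensor([2.])
(w * 2).backward()
print(w.grad)    # tensor([4.]) -- accumulated, not overwritten
w.grad.zero_()   # zero_grad() does this for every parameter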
5、Update the weights (optimizer)
The most common update rule is stochastic gradient descent (SGD): weight = weight - learning_rate * gradient.
PyTorch's torch.optim package implements various update rules, such as SGD, Nesterov-SGD, Adam, RMSProp, and so on. Using the torch.optim package:
import torch.optim as optim

# create your optimizer
optimizer = optim.SGD(net.parameters(), lr=0.01)

# in your training loop:
optimizer.zero_grad()  # zero the gradient buffers
output = net(input)
loss = criterion(output, target)
loss.backward()
optimizer.step()  # apply the update to the network's weights
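Swapping in a different update rule only changes the constructor line; the training loop stays the same (the hyperparameter values below are just illustrative):
optimizer = optim.SGD(net.parameters(), lr=0.01, momentum=0.9)  # SGD with momentum
optimizer = optim.Adam(net.parameters(), lr=0.001)              # Adam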
6、Training a classifier
The PyTorch website walks through a ten-class image classification example (CIFAR-10). The accuracy after running the whole model is not high; I will explore ways to improve it in a separate post.