PyTorch: Tensors
This time we use PyTorch Tensors to implement the forward pass of the network, compute the loss, and run backpropagation.
A PyTorch Tensor is very similar to a numpy ndarray. The biggest difference is that a PyTorch Tensor can run on either the CPU or the GPU; to run computations on the GPU, you move the Tensor onto a CUDA device.
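As a minimal sketch of the numpy interop and device handling described above (the "cuda:0" device assumes a CUDA-capable GPU is present; the code falls back to the CPU otherwise):
import numpy as np
import torch
a = np.ones((2, 3))                      # a numpy ndarray
t = torch.from_numpy(a)                  # a Tensor sharing the same memory
device = torch.device("cuda:0") if torch.cuda.is_available() else torch.device("cpu")
t_dev = t.to(device)                     # move the Tensor onto the chosen device
back = t_dev.cpu().numpy()               # bring it back to the CPU as a numpy array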
import torch
dtype = torch.float
device = torch.device("cpu")
# device = torch.device("cuda:0") # Uncomment this to run on GPU
# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10
# Create random input and output data
x = torch.randn(N, D_in, device=device, dtype=dtype)
y = torch.randn(N, D_out, device=device, dtype=dtype)
# Randomly initialize weights
w1 = torch.randn(D_in, H, device=device, dtype=dtype)
w2 = torch.randn(H, D_out, device=device, dtype=dtype)
learning_rate = 1e-6
for t in range(500):
    # Forward pass: compute predicted y
    h = x.mm(w1)
    h_relu = h.clamp(min=0)
    y_pred = h_relu.mm(w2)
    # Compute and print loss
    loss = (y_pred - y).pow(2).sum().item()
    print(t, loss)
    # Backprop to compute gradients of w1 and w2 with respect to loss
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.t().mm(grad_y_pred)
    grad_h_relu = grad_y_pred.mm(w2.t())
    grad_h = grad_h_relu.clone()
    grad_h[h < 0] = 0
    grad_w1 = x.t().mm(grad_h)
    # Update weights using gradient descent
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2
A simple autograd example
# Create tensors.
x = torch.tensor(1., requires_grad=True)
w = torch.tensor(2., requires_grad=True)
b = torch.tensor(3., requires_grad=True)
# Build a computational graph.
y = w * x + b # y = 2 * x + 3
# Compute gradients.
y.backward()
# Print out the gradients.
print(x.grad) # x.grad = 2
print(w.grad) # w.grad = 1
print(b.grad) # b.grad = 1
PyTorch: Tensors and autograd
One important feature of PyTorch is autograd: once the forward pass is defined and the loss has been computed, PyTorch can automatically compute the gradients of all model parameters via backpropagation.
A PyTorch Tensor represents a node in a computational graph. If x is a Tensor with x.requires_grad=True, then x.grad is another Tensor holding the gradient of x with respect to some scalar value (usually the loss).
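As a small standalone illustration of this (separate from the training code below): for a vector-valued x, backpropagating from a scalar built out of x fills x.grad with a Tensor of the same shape as x.
import torch
x = torch.randn(3, requires_grad=True)
s = (x ** 2).sum()   # a scalar computed from x
s.backward()         # populates x.grad with ds/dx
print(x.grad)        # equals 2 * x, same shape as x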
import torch
dtype = torch.float
device = torch.device("cpu")
# device = torch.device("cuda:0") # Uncomment this to run on GPU
# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10
# Create random Tensors to hold inputs and outputs.
# Setting requires_grad=False (the default) indicates that we do not need to
# compute gradients with respect to these Tensors during the backward pass.
x = torch.randn(N, D_in, device=device, dtype=dtype)
y = torch.randn(N, D_out, device=device, dtype=dtype)
# Create random Tensors for the weights.
# Setting requires_grad=True indicates that we want to compute gradients with
# respect to these Tensors during the backward pass.
w1 = torch.randn(D_in, H, device=device, dtype=dtype, requires_grad=True)
w2 = torch.randn(H, D_out, device=device, dtype=dtype, requires_grad=True)
learning_rate = 1e-6
for t in range(500):
    # Forward pass: compute predicted y using operations on Tensors; this is
    # exactly the same forward pass as before, but we do not need to keep
    # references to intermediate values because we are not computing the
    # backward pass by hand.
    y_pred = x.mm(w1).clamp(min=0).mm(w2)
    # Compute the loss using operations on Tensors.
    # loss is a scalar Tensor; loss.item() returns its value as a Python number.
    loss = (y_pred - y).pow(2).sum()
    print(t, loss.item())
    # Use autograd to compute the backward pass. For every Tensor with
    # requires_grad=True, backward computes the gradient of the loss with
    # respect to that Tensor. After this call, w1.grad and w2.grad hold the
    # gradients of the loss with respect to w1 and w2.
    loss.backward()
    # Manually update the weights with gradient descent (later we will see how
    # to do this automatically).
    # Wrap the updates in torch.no_grad(): w1 and w2 have requires_grad=True,
    # but we do not want autograd to track the update itself.
    # An alternative is to operate on weight.data and weight.grad.data, which
    # does not affect the computational graph: tensor.data gives a Tensor that
    # shares the same memory as the original tensor but does not record history.
    with torch.no_grad():
        w1 -= learning_rate * w1.grad
        w2 -= learning_rate * w2.grad
        # Manually zero the gradients after updating weights
        w1.grad.zero_()
        w2.grad.zero_()
PyTorch: nn
This time we use PyTorch's nn package to build the network. Autograd still builds the computational graph and computes the gradients for us; the nn package adds higher-level building blocks on top of it, such as layers (Modules) and common loss functions.
import torch
# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10
# Create random Tensors to hold inputs and outputs
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)
# Use the nn package to define our model as a sequence of layers. nn.Sequential
# is a Module which contains other Modules, and applies them in sequence to
# produce its output. Each Linear Module computes output from input using a
# linear function, and holds internal Tensors for its weight and bias.
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)
# The nn package also contains definitions of popular loss functions; in this
# case we will use Mean Squared Error (MSE) as our loss function.
loss_fn = torch.nn.MSELoss(reduction='sum')
learning_rate = 1e-4
for t in range(500):
    # Forward pass: compute predicted y by passing x to the model. Module objects
    # override the __call__ operator so you can call them like functions. When
    # doing so you pass a Tensor of input data to the Module and it produces
    # a Tensor of output data.
    y_pred = model(x)
    # Compute and print loss. We pass Tensors containing the predicted and true
    # values of y, and the loss function returns a Tensor containing the
    # loss.
    loss = loss_fn(y_pred, y)
    print(t, loss.item())
    # Zero the gradients before running the backward pass.
    model.zero_grad()
    # Backward pass: compute gradient of the loss with respect to all the learnable
    # parameters of the model. Internally, the parameters of each Module are stored
    # in Tensors with requires_grad=True, so this call will compute gradients for
    # all learnable parameters in the model.
    loss.backward()
    # Update the weights using gradient descent. Each parameter is a Tensor, so
    # we can access its gradients like we did before.
    with torch.no_grad():
        for param in model.parameters():
            param -= learning_rate * param.grad
PyTorch: optim
This time, instead of updating the model's weights by hand, we use the optim package to update the parameters for us. The optim package provides implementations of commonly used optimization algorithms, including SGD with momentum, RMSprop, Adam, and more.
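As a quick sketch of how these optimizers are constructed (the params list below is just a stand-in for model.parameters(), and the learning rates are illustrative), switching algorithms is only a matter of choosing a different constructor:
import torch
params = [torch.randn(3, 3, requires_grad=True)]       # stand-in for model.parameters()
sgd = torch.optim.SGD(params, lr=1e-2, momentum=0.9)   # SGD with momentum
rmsprop = torch.optim.RMSprop(params, lr=1e-3)         # RMSprop
adam = torch.optim.Adam(params, lr=1e-4)               # Adam
# All of them share the same update pattern:
#   optimizer.zero_grad(); loss.backward(); optimizer.step()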
import torch
# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10
# Create random Tensors to hold inputs and outputs
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)
# Use the nn package to define our model and loss function.
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)
loss_fn = torch.nn.MSELoss(reduction='sum')
# Use the optim package to define an Optimizer that will update the weights of
# the model for us. Here we will use Adam; the optim package contains many other
# optimization algorithms. The first argument to the Adam constructor tells the
# optimizer which Tensors it should update.
learning_rate = 1e-4
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
for t in range(500):
    # Forward pass: compute predicted y by passing x to the model.
    y_pred = model(x)
    # Compute and print loss.
    loss = loss_fn(y_pred, y)
    print(t, loss.item())
    # Before the backward pass, use the optimizer object to zero all of the
    # gradients for the variables it will update (which are the learnable
    # weights of the model). This is because by default, gradients are
    # accumulated in buffers (i.e. not overwritten) whenever .backward()
    # is called. Check out the docs of torch.autograd.backward for more details.
    optimizer.zero_grad()
    # Backward pass: compute gradient of the loss with respect to model
    # parameters
    loss.backward()
    # Calling the step function on an Optimizer makes an update to its
    # parameters
    optimizer.step()
PyTorch: Custom nn Modules
We can define our own model as a subclass of nn.Module. Whenever you need a model more complex than a plain sequence of existing Modules, subclassing nn.Module is the way to do it; a sketch of such a model follows the full example below.
import torch
class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        """
        In the constructor we instantiate two nn.Linear modules and assign them as
        member variables.
        """
        super(TwoLayerNet, self).__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.linear2 = torch.nn.Linear(H, D_out)

    def forward(self, x):
        """
        In the forward function we accept a Tensor of input data and we must return
        a Tensor of output data. We can use Modules defined in the constructor as
        well as arbitrary operators on Tensors.
        """
        h_relu = self.linear1(x).clamp(min=0)
        y_pred = self.linear2(h_relu)
        return y_pred
# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10
# Create random Tensors to hold inputs and outputs
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)
# Construct our model by instantiating the class defined above
model = TwoLayerNet(D_in, H, D_out)
# Construct our loss function and an Optimizer. The call to model.parameters()
# in the SGD constructor returns an iterable over the learnable parameters of the
# two nn.Linear modules that are members of the model.
criterion = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)
for t in range(500):
    # Forward pass: Compute predicted y by passing x to the model
    y_pred = model(x)
    # Compute and print loss
    loss = criterion(y_pred, y)
    print(t, loss.item())
    # Zero gradients, perform a backward pass, and update the weights.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
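The TwoLayerNet above could also have been written with nn.Sequential; the real payoff of a custom Module is that forward may contain arbitrary Tensor operations. As a hypothetical sketch (the class name SkipNet and its sizes are made up for illustration), here is a model with a skip connection, something a plain Sequential cannot express:
import torch
class SkipNet(torch.nn.Module):
    def __init__(self, D, H):
        super(SkipNet, self).__init__()
        self.linear1 = torch.nn.Linear(D, H)
        self.linear2 = torch.nn.Linear(H, D)

    def forward(self, x):
        # Arbitrary Tensor operations are allowed here: add the input back
        # onto the output of the two linear layers (a skip connection).
        h = self.linear1(x).clamp(min=0)
        return self.linear2(h) + x

model = SkipNet(1000, 100)
y = model(torch.randn(64, 1000))  # used like any other Module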