[PyTorch][Repost] Implementing a Two-Layer Neural Network with PyTorch

PyTorch: Tensors

This time we use PyTorch Tensors to build the forward pass of the network, compute the loss, and run backpropagation by hand.

 

A PyTorch Tensor is very similar to a NumPy ndarray. The biggest difference is that a PyTorch Tensor can run on either the CPU or the GPU; to run on the GPU, you need to move the Tensor onto a CUDA device.
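As a quick aside (a minimal sketch, not part of the example below), converting between NumPy and PyTorch and moving a Tensor onto the GPU looks roughly like this:

import torch
import numpy as np

a = np.ones((2, 3))                    # a NumPy ndarray on the CPU
t = torch.from_numpy(a)                # a CPU Tensor sharing memory with `a`

if torch.cuda.is_available():          # only move to the GPU if one is present
    t = t.to(torch.device("cuda:0"))   # copy the Tensor onto the GPU

t = t.to("cpu")                        # bring it back to the CPU
b = t.numpy()                          # convert back to a NumPy ndarray (CPU only)

The full two-layer-network example follows; by default it runs on the CPU.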
import torch

dtype = torch.float
device = torch.device("cpu")
# device = torch.device("cuda:0") # Uncomment this to run on GPU

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random input and output data
x = torch.randn(N, D_in, device=device, dtype=dtype)
y = torch.randn(N, D_out, device=device, dtype=dtype)

# Randomly initialize weights
w1 = torch.randn(D_in, H, device=device, dtype=dtype)
w2 = torch.randn(H, D_out, device=device, dtype=dtype)

learning_rate = 1e-6
for t in range(500):
    # Forward pass: compute predicted y
    h = x.mm(w1)
    h_relu = h.clamp(min=0)
    y_pred = h_relu.mm(w2)

    # Compute and print loss
    loss = (y_pred - y).pow(2).sum().item()
    print(t, loss)

    # Backprop to compute gradients of w1 and w2 with respect to loss
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.t().mm(grad_y_pred)
    grad_h_relu = grad_y_pred.mm(w2.t())
    grad_h = grad_h_relu.clone()
    grad_h[h < 0] = 0
    grad_w1 = x.t().mm(grad_h)

    # Update weights using gradient descent
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2
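The gradient formulas in the backward pass above follow directly from the chain rule. Written in matrix form with the same names as the code, the derivation is:

$$
L = \sum_{i,j}\big(y_{\mathrm{pred}} - y\big)_{ij}^{2},\qquad
h = x W_1,\quad h_{\mathrm{relu}} = \max(h, 0),\quad y_{\mathrm{pred}} = h_{\mathrm{relu}} W_2
$$

$$
\frac{\partial L}{\partial y_{\mathrm{pred}}} = 2\,(y_{\mathrm{pred}} - y),\qquad
\frac{\partial L}{\partial W_2} = h_{\mathrm{relu}}^{\top}\,\frac{\partial L}{\partial y_{\mathrm{pred}}},\qquad
\frac{\partial L}{\partial h_{\mathrm{relu}}} = \frac{\partial L}{\partial y_{\mathrm{pred}}}\,W_2^{\top}
$$

$$
\frac{\partial L}{\partial h} = \frac{\partial L}{\partial h_{\mathrm{relu}}}\odot \mathbf{1}[h > 0],\qquad
\frac{\partial L}{\partial W_1} = x^{\top}\,\frac{\partial L}{\partial h}
$$

These correspond exactly to grad_y_pred, grad_w2, grad_h_relu, grad_h, and grad_w1 in the loop.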

A simple autograd example
# Create tensors.
x = torch.tensor(1., requires_grad=True)
w = torch.tensor(2., requires_grad=True)
b = torch.tensor(3., requires_grad=True)

# Build a computational graph.
y = w * x + b    # y = 2 * x + 3

# Compute gradients.
y.backward()

# Print out the gradients.
print(x.grad)    # x.grad = 2, since dy/dx = w
print(w.grad)    # w.grad = 1, since dy/dw = x
print(b.grad)    # b.grad = 1, since dy/db = 1
 

PyTorch: Tensors and autograd

One of PyTorch's most important features is autograd: once you define the forward pass and compute the loss, PyTorch can automatically compute the gradient of the loss with respect to every model parameter.

 

A PyTorch Tensor represents a node in a computational graph. If x is a Tensor with x.requires_grad=True, then x.grad is another Tensor holding the current gradient of x with respect to some scalar value (usually the loss).
import torch

dtype = torch.float
device = torch.device("cpu")
# device = torch.device("cuda:0") # Uncomment this to run on GPU

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold the input and output.
# requires_grad=False (the default) indicates that we do not need to compute
# gradients with respect to these Tensors during the backward pass.
x = torch.randn(N, D_in, device=device, dtype=dtype)
y = torch.randn(N, D_out, device=device, dtype=dtype)

# Create random Tensors for the weights.
# requires_grad=True indicates that we want to compute gradients with respect
# to these Tensors during the backward pass.
w1 = torch.randn(D_in, H, device=device, dtype=dtype, requires_grad=True)
w2 = torch.randn(H, D_out, device=device, dtype=dtype, requires_grad=True)

learning_rate = 1e-6
for t in range(500):
    # Forward pass: compute predicted y using operations on Tensors. This is no
    # different from an ordinary forward pass, but we do not need to keep
    # references to the intermediate values, because we are not computing the
    # backward pass by hand.
    y_pred = x.mm(w1).clamp(min=0).mm(w2)

    # Compute the loss using operations on Tensors.
    # loss is a Tensor holding a single value; loss.item() returns that value
    # as a Python scalar.
    loss = (y_pred - y).pow(2).sum()
    print(t, loss.item())

    # Use autograd to run the backward pass. For every Tensor with
    # requires_grad=True, backward() computes the gradient of the loss with
    # respect to that Tensor. Afterwards, w1.grad and w2.grad hold the gradients
    # of the loss with respect to w1 and w2.
    loss.backward()

    # Manually update the weights with gradient descent (later we will see how to
    # do this automatically). Wrap the updates in torch.no_grad(): w1 and w2 have
    # requires_grad=True, but we do not want autograd to track the update step.
    # An alternative is to operate on weight.data and weight.grad.data, which does
    # not affect autograd: tensor.data gives us a tensor that shares the same
    # storage as the original tensor but does not record history in the
    # computational graph.
    with torch.no_grad():
        w1 -= learning_rate * w1.grad
        w2 -= learning_rate * w2.grad

        # Manually zero the gradients after updating weights
        w1.grad.zero_()
        w2.grad.zero_()
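For completeness, the alternative mentioned in the comment above, updating through .data so that autograd does not record the operation, would look roughly like this inside the same loop (a sketch; torch.no_grad() remains the preferred style):

    # Alternative update without torch.no_grad(): .data shares storage with the
    # original Tensor but is not tracked by autograd.
    w1.data -= learning_rate * w1.grad.data
    w2.data -= learning_rate * w2.grad.data
    w1.grad.data.zero_()
    w2.grad.data.zero_()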
 PyTorch: nn

This time we use PyTorch's nn package to build the network. The nn package still relies on autograd to build the computational graph, so PyTorch continues to compute the gradients for us automatically.
import torch

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold inputs and outputs
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

# Use the nn package to define our model as a sequence of layers. nn.Sequential
# is a Module which contains other Modules, and applies them in sequence to
# produce its output. Each Linear Module computes output from input using a
# linear function, and holds internal Tensors for its weight and bias.
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)

# The nn package also contains definitions of popular loss functions; in this
# case we will use Mean Squared Error (MSE) as our loss function.
loss_fn = torch.nn.MSELoss(reduction='sum')

learning_rate = 1e-4
for t in range(500):
    # Forward pass: compute predicted y by passing x to the model. Module objects
    # override the __call__ operator so you can call them like functions. When
    # doing so you pass a Tensor of input data to the Module and it produces
    # a Tensor of output data.
    y_pred = model(x)

    # Compute and print loss. We pass Tensors containing the predicted and true
    # values of y, and the loss function returns a Tensor containing the loss.
    loss = loss_fn(y_pred, y)
    print(t, loss.item())

    # Zero the gradients before running the backward pass.
    model.zero_grad()

    # Backward pass: compute gradient of the loss with respect to all the learnable
    # parameters of the model. Internally, the parameters of each Module are stored
    # in Tensors with requires_grad=True, so this call will compute gradients for
    # all learnable parameters in the model.
    loss.backward()

    # Update the weights using gradient descent. Each parameter is a Tensor, so
    # we can access its gradients like we did before.
    with torch.no_grad():
        for param in model.parameters():
            param -= learning_rate * param.grad
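As a small aside (not shown in the original example), the Linear Modules inside the Sequential container really do hold their weight and bias Tensors, and you can inspect them directly:

# Index into the Sequential container to reach individual layers.
print(model[0].weight.shape)   # first Linear layer: torch.Size([100, 1000])
print(model[0].bias.shape)     # first Linear layer: torch.Size([100])

# Or iterate over all learnable parameters by name.
for name, param in model.named_parameters():
    print(name, param.shape)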
 

PyTorch: optim

This time, instead of updating the model's weights by hand, we use the optim package to update the parameters for us. The optim package provides many different optimization algorithms, including SGD with momentum, RMSprop, Adam, and more.
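For example, switching to one of the other optimizers only changes the constructor call (a sketch assuming a model like the one defined in the code below; the hyperparameter values are illustrative, not tuned):

# SGD with momentum
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)

# RMSprop
optimizer = torch.optim.RMSprop(model.parameters(), lr=1e-4)

# Adam (used in the example below)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)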
 

import torch

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold inputs and outputs
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

# Use the nn package to define our model and loss function.
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)
loss_fn = torch.nn.MSELoss(reduction='sum')

# Use the optim package to define an Optimizer that will update the weights of
# the model for us. Here we will use Adam; the optim package contains many other
# optimization algorithms. The first argument to the Adam constructor tells the
# optimizer which Tensors it should update.
learning_rate = 1e-4
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
for t in range(500):
    # Forward pass: compute predicted y by passing x to the model.
    y_pred = model(x)

    # Compute and print loss.
    loss = loss_fn(y_pred, y)
    print(t, loss.item())

    # Before the backward pass, use the optimizer object to zero all of the
    # gradients for the variables it will update (which are the learnable
    # weights of the model). This is because by default, gradients are
    # accumulated in buffers (i.e., not overwritten) whenever .backward()
    # is called. Check out the docs of torch.autograd.backward for more details.
    optimizer.zero_grad()

    # Backward pass: compute gradient of the loss with respect to model parameters.
    loss.backward()

    # Calling the step function on an Optimizer makes an update to its parameters.
    optimizer.step()

PyTorch: Custom nn Modules

We can also define a model by subclassing nn.Module. Whenever you need a model more complex than a Sequential container can express, define your own nn.Module subclass (a sketch of such a model appears at the end of this post).
 

import torch


class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        """
        In the constructor we instantiate two nn.Linear modules and assign them as
        member variables.
        """
        super(TwoLayerNet, self).__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.linear2 = torch.nn.Linear(H, D_out)

    def forward(self, x):
        """
        In the forward function we accept a Tensor of input data and we must return
        a Tensor of output data. We can use Modules defined in the constructor as
        well as arbitrary operators on Tensors.
        """
        h_relu = self.linear1(x).clamp(min=0)
        y_pred = self.linear2(h_relu)
        return y_pred


# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold inputs and outputs
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

# Construct our model by instantiating the class defined above
model = TwoLayerNet(D_in, H, D_out)

# Construct our loss function and an Optimizer. The call to model.parameters()
# in the SGD constructor will contain the learnable parameters of the two
# nn.Linear modules which are members of the model.
criterion = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)
for t in range(500):
    # Forward pass: Compute predicted y by passing x to the model
    y_pred = model(x)

    # Compute and print loss
    loss = criterion(y_pred, y)
    print(t, loss.item())

    # Zero gradients, perform a backward pass, and update the weights.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
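Finally, as promised above, here is a hypothetical sketch (not part of the original tutorial) of the kind of model that cannot be expressed as a plain nn.Sequential, because it reuses a layer and merges two branches; this is where subclassing nn.Module becomes necessary:

class SharedLayerNet(torch.nn.Module):
    """Hypothetical example: the same Linear layer is applied twice and the two
    branch outputs are summed, which nn.Sequential cannot express."""
    def __init__(self, D_in, H, D_out):
        super(SharedLayerNet, self).__init__()
        self.shared = torch.nn.Linear(D_in, H)     # reused twice in forward()
        self.project = torch.nn.Linear(D_in, D_in)
        self.output = torch.nn.Linear(H, D_out)

    def forward(self, x):
        h1 = self.shared(x).clamp(min=0)                # first use of the shared layer
        h2 = self.shared(self.project(x)).clamp(min=0)  # second use, on a projected input
        return self.output(h1 + h2)                     # merge the two branches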
