I have been studying the PyTorch deep learning framework recently and reading quite a lot of material. Some of the summaries I came across are very good, so I am recording them here for later review.
One source is the WeChat public account 《机器学习与推荐系统》 (Machine Learning and Recommender Systems), which has several decent PyTorch tutorials, such as "一篇长文学懂 pytorch".
Another is Bilibili, with the video "1.01. 第一课 深度学习回顾与PyTorch简介" (Lesson 1: Deep Learning Review and Introduction to PyTorch),
as well as the hands-on course 《PyTorch深度学习实践》完结合集 (the complete "PyTorch Deep Learning Practice" series).
Learning workflow:
- First, skim the official documentation to get a rough idea of which modules exist; a general impression is enough.
- Look through a number of example programs. It is fine if you cannot follow them fully; the point is to notice the common patterns in PyTorch code and keep them in mind.
- Get a few simple examples running end to end. Just getting them to run is enough at this stage; deep understanding is not required yet, and it builds confidence.
- After reading and running the examples, come back to the official PyTorch documentation and summarize the commonly used modules.
- Find some professional videos or tutorials and study each topic in depth, block by block.
- Modify other people's code and network structures as extension exercises.
- Finally, write your own programs for practice.
Here is a small example I found online; run it and see whether it produces a result:
import torch
import matplotlib.pyplot as plt
dtype = torch.float
device = torch.device("cpu")
# device = torch.device("cuda:0") # Uncomment this to run on GPU
# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10
# Create random Tensors to hold input and outputs.
# Setting requires_grad=False indicates that we do not need to compute gradients
# with respect to these Tensors during the backward pass.
x = torch.randn(N, D_in, device=device, dtype=dtype)
y = torch.randn(N, D_out, device=device, dtype=dtype)
# Create random Tensors for weights.
# Setting requires_grad=True indicates that we want to compute gradients with
# respect to these Tensors during the backward pass.
w1 = torch.randn(D_in, H, device=device, dtype=dtype, requires_grad=True)
w2 = torch.randn(H, D_out, device=device, dtype=dtype, requires_grad=True)
learning_rate = 1e-6
for t in range(400):
    # Forward pass: compute predicted y using operations on Tensors; these
    # are exactly the same operations we used to compute the forward pass using
    # Tensors, but we do not need to keep references to intermediate values since
    # we are not implementing the backward pass by hand.
    y_pred = x.mm(w1).clamp(min=0).mm(w2)

    # Compute and print loss using operations on Tensors.
    # loss is now a scalar Tensor; loss.item() gets the Python number it holds.
    loss = (y_pred - y).pow(2).sum()
    print(t, loss.item())

    # Use autograd to compute the backward pass. This call will compute the
    # gradient of loss with respect to all Tensors with requires_grad=True.
    # After this call w1.grad and w2.grad will be Tensors holding the gradient
    # of the loss with respect to w1 and w2 respectively.
    loss.backward()

    # Manually update weights using gradient descent. Wrap in torch.no_grad()
    # because weights have requires_grad=True, but we don't need to track this
    # in autograd.
    # An alternative way is to operate on weight.data and weight.grad.data.
    # Recall that tensor.data gives a tensor that shares the storage with
    # tensor, but doesn't track history.
    # You can also use torch.optim.SGD to achieve this.
    with torch.no_grad():
        w1 -= learning_rate * w1.grad
        w2 -= learning_rate * w2.grad

        # Manually zero the gradients after updating weights
        w1.grad.zero_()
        w2.grad.zero_()
plt.scatter(x.data.numpy()[:, 0], y.data.numpy()[:, 0], marker='o', s=100)
plt.scatter(x.data.numpy()[:, 0], y_pred.data.numpy()[:, 0], marker='^', s=50)
plt.show()
Running result: ● marks the ground-truth values; ▲ marks the predicted values.
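The comment in the training loop notes that torch.optim.SGD can perform the same update. Here is a rough sketch of that variant, not part of the original example; it reuses the same shapes and learning rate, re-declared so the snippet runs on its own:

import torch

# Same setup as the example above, repeated so this snippet is self-contained.
N, D_in, H, D_out = 64, 1000, 100, 10
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)
w1 = torch.randn(D_in, H, requires_grad=True)
w2 = torch.randn(H, D_out, requires_grad=True)
learning_rate = 1e-6

# The optimizer takes over the manual update done inside torch.no_grad() above.
optimizer = torch.optim.SGD([w1, w2], lr=learning_rate)

for t in range(400):
    y_pred = x.mm(w1).clamp(min=0).mm(w2)   # same forward pass as before
    loss = (y_pred - y).pow(2).sum()

    optimizer.zero_grad()   # replaces the manual w1.grad.zero_() / w2.grad.zero_()
    loss.backward()         # autograd fills w1.grad and w2.grad
    optimizer.step()        # applies w -= lr * w.grad for each registered tensor

The behaviour should match the manual loop; the optimizer simply packages the update step and the gradient reset.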
The original example is extremely simple; the PyTorch knowledge it touches on includes roughly:
torch.float, device = torch.device("cpu"), torch.randn, torch.no_grad(),
and so on. Look them up in the official documentation and understand what they mean and how they are used; that is enough for now. Don't rush for quick results~
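For reference, here is a tiny stand-alone snippet (just an illustration, not from the original example) showing what each of these building blocks does:

import torch

dtype = torch.float                    # 32-bit floating point, the default tensor dtype
device = torch.device("cpu")           # where tensors live; torch.device("cuda:0") would select the first GPU

a = torch.randn(3, 4, device=device, dtype=dtype)                       # 3x4 tensor of standard-normal values
b = torch.randn(3, 4, device=device, dtype=dtype, requires_grad=True)   # this one is tracked by autograd

with torch.no_grad():                  # operations in this block are not recorded by autograd
    b += a                             # e.g. an in-place update, like the weight update in the example
print(a.shape, b.requires_grad)        # prints: torch.Size([3, 4]) True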