BUILD THE NEURAL NETWORK
The torch.nn namespace provides all the building blocks you need to build your own neural network.
Every module in PyTorch subclasses nn.Module. A neural network is a module itself that consists of other modules (layers).
Get Device for Training
import torch
from torch import nn

device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f"Using {device} device")
Define the Class
We define our neural network by subclassing nn.Module, and initialize the neural network layers in __init__. Every operation on the input data is implemented in the forward method:
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits
Oh come on, isn't this just the code from the earlier Quickstart? Did I spend all this time just to rewrite it?
We create an instance of NeuralNetwork, move it to the device, and print its structure.
model = NeuralNetwork().to(device)
print(model)
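For the class defined above, print(model) shows the module tree:
out:
NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
  )
)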
Calling the model on the input returns a tensor of 10 raw predicted values (logits), one per class. We get the prediction probabilities by passing it through an instance of the nn.Softmax module.
X = torch.rand(1, 28, 28, device=device)
logits = model(X)
pred_probab = nn.Softmax(dim=1)(logits)
y_pred = pred_probab.argmax(1)
print(f"Predicted class: {y_pred}")
Model Layers
Let’s break down the layers in the FashionMNIST model. To illustrate it, we will take a sample minibatch of 3 images of size 28x28 and see what happens to it as we pass it through the network.
input_image = torch.rand(3, 28, 28)
nn.Flatten
We initialize the nn.Flatten layer to convert each 2D 28x28 image into a contiguous array of 784 pixel values (the minibatch dimension, at dim=0, is maintained).
flatten = nn.Flatten()
flat_image = flatten(input_image)
print(flat_image.size())
out:
torch.Size([3, 784])
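The minibatch dimension survives because nn.Flatten defaults to start_dim=1, so only dimensions 1 and onward are collapsed. A minimal sketch contrasting the default with start_dim=0, reusing input_image from above (the parameter values shown are the documented defaults):

# Default: flatten from dim=1 onward, keeping the batch dim (dim=0).
flatten_default = nn.Flatten(start_dim=1, end_dim=-1)
print(flatten_default(input_image).size())  # torch.Size([3, 784])

# Flattening from dim=0 merges the batch dimension into the result.
flatten_all = nn.Flatten(start_dim=0)
print(flatten_all(input_image).size())      # torch.Size([2352])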
nn.Linear
The linear layer is a module that applies a linear transformation on the input using its stored weights and biases. So nn.Linear is simply a linear layer that stores a weight matrix and a bias vector.
layer1 = nn.Linear(in_features=28*28, out_features=20)
hidden1 = layer1(flat_image)
print(hidden1.size())
out:
torch.Size([3, 20])
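To see that the layer really is just its stored weight matrix and bias vector, we can reproduce its output by hand via the transformation y = xWᵀ + b that nn.Linear applies (a quick sanity check reusing flat_image, layer1, and hidden1 from above):

# Recompute the layer's output directly from its stored parameters.
manual = flat_image @ layer1.weight.T + layer1.bias
print(torch.allclose(hidden1, manual))  # True (up to floating-point tolerance)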
nn.ReLU
In this model, we use nn.ReLU between our linear layers, but there are other activations that introduce non-linearity into your model.
print(f"Before ReLU: {hidden1}\n\n")
hidden1 = nn.ReLU()(hidden1)
print(f"After ReLU: {hidden1}")
nn.Sequential
nn.Sequential is an ordered container of modules. The data is passed through all the modules in the same order as they were defined.
You can use sequential containers to put together a quick network like seq_modules.
seq_modules = nn.Sequential(
    flatten,
    layer1,
    nn.ReLU(),
    nn.Linear(20, 10),
)
input_image = torch.rand(3, 28, 28)
logits = seq_modules(input_image)
In other words, you can drop the modules you created individually earlier straight into a Sequential.
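Since a Sequential simply calls its children in the order they were given, iterating over it by hand produces exactly the same result (a sketch reusing seq_modules, input_image, and logits from above):

# Apply each stored module in order: flatten -> layer1 -> ReLU -> Linear(20, 10).
x = input_image
for module in seq_modules:
    x = module(x)
print(torch.equal(x, logits))  # True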
nn.Softmax
The logits are scaled to values in [0, 1] representing the model's predicted probabilities for each class. The dim parameter indicates the dimension along which the values must sum to 1.
softmax = nn.Softmax(dim=1)
pred_probab = softmax(logits)
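A quick sanity check confirms that each row of pred_probab now sums to 1 along dim=1:

print(pred_probab.sum(dim=1))  # each of the 3 rows sums to 1 (up to rounding)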
Model Parameters
All parameters are accessible via your model's parameters() or named_parameters() methods.
print("Model structure: ",model,"\n\n")
for name, param in model.named_parameters():
print(f"Layer:{name} | Size: {param.size()} | Values: {param[:2]}\n")