Basic Networks
import torch

x = torch.tensor([[6,2],[5,2],[1,3],[7,6]]).float()
y = torch.tensor([1,5,2,5]).float()
We want to find a function, depending on some adjustable parameters, that maps x onto y.
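In matrix form (our notation, for a single input vector x), the candidate below is

$$f(x) = A_2 A_1 x, \qquad A_1 \in \mathbb{R}^{8\times 2},\ A_2 \in \mathbb{R}^{1\times 8},$$

where the entries of $A_1$ and $A_2$ are the parameters.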
import torch.nn as nn

M1 = nn.Linear(2,8,bias=False)
M2 = nn.Linear(8,1,bias=False)
# pass x through both matrices and drop the trailing singleton dimension
M2(M1(x)).squeeze()
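Note that with no bias and no nonlinearity in between, these two layers collapse into a single linear map, so this model can only represent linear functions of x. A quick sketch to verify (the variable W is ours, not from the original):

W = M2.weight @ M1.weight                    # combined (1,2) matrix
print(torch.allclose(M2(M1(x)), x @ W.T))    # True: one linear map in disguise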
In order to optimize these weights, we first construct our network as follows:
class MyNeuralNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.Matrix1 = nn.Linear(2,8,bias=False)
        self.Matrix2 = nn.Linear(8,1,bias=False)

    def forward(self,x):
        x = self.Matrix1(x)
        x = self.Matrix2(x)
        return x.squeeze()
f = MyNeuralNet()
yhat = f(x)
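The weights are still randomly initialized at this point, so yhat will not resemble y yet (a quick illustrative check):

print(yhat)  # predictions from random weights; compare with y = [1,5,2,5]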
Adjusting the parameters a so that yhat and y are similar requires a measure of how far apart they are; we use the mean squared error:
L = nn.MSELoss()
L(y,yhat)
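nn.MSELoss simply averages the squared differences, so the same number can be computed by hand (a sketch for reference):

mse = ((y - yhat)**2).mean()   # identical to L(y, yhat)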
Gradient descent: nudge every parameter a small step against the gradient of L, and repeat over and over until one reaches a minimum for L.
Each pass of the full data set x is called an epoch.
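Concretely, each step replaces every parameter a by a - lr * dL/da. Written out by hand, one step would look roughly like this (an illustrative sketch of what the SGD optimizer below does internally):

lr = 0.001
loss = L(f(x), y)
loss.backward()                # fills p.grad for every parameter
with torch.no_grad():
    for p in f.parameters():
        p -= lr * p.grad       # the gradient-descent update
        p.grad.zero_()         # clear for the next pass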
from torch.optim import SGD

opt = SGD(f.parameters(), lr=0.001)
losses = []
for _ in range(50):
    opt.zero_grad()                  # flush previous epoch's gradient
    loss_value = L(f(x), y)          # compute loss
    loss_value.backward()            # compute gradient
    opt.step()                       # perform iteration using gradient above
    losses.append(loss_value.item())
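After 50 epochs, the predictions should sit much closer to the targets (a quick check, not in the original):

print(f(x).detach())  # trained predictions
print(y)              # targets: [1., 5., 2., 5.]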
import matplotlib.pyplot as plt

plt.plot(losses)
plt.ylabel(r'Loss $L(y,\hat{y};a)$')
plt.xlabel('Epochs')