PyTorch tips (from Berkeley)
A few things to watch out for with variables (see the sketch after this list):
- You can't do in-place operations on a leaf tensor that has requires_grad=True. (This prevents you from inadvertently mutating it in a way that isn't tracked for backprop purposes.)
- You also can't convert a tensor with requires_grad=True to NumPy (for the same reason as above). Instead, you need to detach it first, e.g. y.detach().numpy().
- Even though y.detach() returns a new tensor, that tensor occupies the same memory as y. Unfortunately, PyTorch lets you make changes to y.detach() or y.detach().numpy(), which will affect y as well! If you want to safely mutate the detached version, use y.detach().clone() instead, which will create a tensor in new memory.
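A minimal sketch of these pitfalls; the tensor y and its values are just placeholders:

```python
import torch

y = torch.ones(3, requires_grad=True)

# In-place ops on a leaf tensor that requires grad raise a RuntimeError:
# y.add_(1)  # RuntimeError

# Direct conversion to NumPy also raises; detach first:
# y.numpy()  # RuntimeError
y_np = y.detach().numpy()

# The detached array shares memory with y -- mutating it mutates y too!
y_np[0] = 100.0
print(y)  # tensor([100., 1., 1.], requires_grad=True)

# Safe version: clone into fresh memory before mutating.
y_safe = y.detach().clone().numpy()
y_safe[1] = -1.0
print(y)  # unchanged by the clone's mutation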
RL Connection: do simulator-related work in NumPy, convert to torch tensors for model-related work, and convert the model's output back to NumPy to feed it into the simulator (sketch below).
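A sketch of that round trip, with a placeholder linear "policy" standing in for the model:

```python
import numpy as np
import torch

policy = torch.nn.Linear(4, 2)  # placeholder model (assumption)

obs = np.random.randn(4).astype(np.float32)  # simulator output (NumPy)
obs_t = torch.from_numpy(obs)                # NumPy -> torch (shares memory)
action_t = policy(obs_t)                     # model-related work in torch
action = action_t.detach().numpy()           # torch -> NumPy, back to the simulator
```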
Tips for defining networks and computing gradients (full sketch after this list):
- Define a class for your neural network (a subclass of nn.Module).
- Specify a loss function (e.g. MSE loss) and an optimizer (e.g. Adam); make sure to pass all model parameters to the optimizer (especially for multimodal models, where parameters can live in several submodules).
- Perform training by doing the following in a loop:
  - Make a prediction
  - Compute the loss
  - Zero the stored gradients
  - Backprop the loss with .backward()
  - Update the weights by taking a step of gradient descent
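A minimal end-to-end sketch of those steps; the network shape, dummy data, and hyperparameters are all placeholder assumptions:

```python
import torch
import torch.nn as nn

# Define the network as a subclass of nn.Module.
class MLP(nn.Module):
    def __init__(self, in_dim=4, hidden=32, out_dim=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, x):
        return self.net(x)

model = MLP()
loss_fn = nn.MSELoss()
# Pass ALL trainable parameters to the optimizer.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Dummy regression data (placeholder).
X = torch.randn(64, 4)
y = torch.randn(64, 1)

for epoch in range(100):
    pred = model(X)           # 1. make a prediction
    loss = loss_fn(pred, y)   # 2. compute the loss
    optimizer.zero_grad()     # 3. zero the stored gradients
    loss.backward()           # 4. backprop the loss
    optimizer.step()          # 5. take a gradient step
```

Note on the multimodal tip: if the model is split across several separate nn.Module objects (e.g. one encoder per modality) rather than submodules of one module, pass the union of their parameters, e.g. torch.optim.Adam(list(enc1.parameters()) + list(enc2.parameters())); otherwise some weights will never be updated.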