- SVM:
Using NumPy and Python was very hard for me at first; this post may help with solving the homework:
https://mlxai.github.io/2017/01/06/vectorized-implementation-of-svm-loss-and-gradient-update.html
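As a concrete reference for the vectorized approach the post describes, here is a minimal sketch of the hinge loss and its gradient in NumPy. The function name and the delta margin of 1.0 are my own choices, not fixed by the assignment:

```python
import numpy as np

def svm_loss_vectorized(W, X, y, reg):
    """Multiclass SVM (hinge) loss and gradient, fully vectorized.

    W: (D, C) weights, X: (N, D) data, y: (N,) integer labels, reg: L2 strength.
    """
    N = X.shape[0]
    scores = X @ W                                    # (N, C)
    correct = scores[np.arange(N), y][:, None]        # (N, 1) correct-class scores
    margins = np.maximum(0, scores - correct + 1.0)   # delta = 1
    margins[np.arange(N), y] = 0                      # correct class contributes no margin
    loss = margins.sum() / N + reg * np.sum(W * W)

    # Each positive margin adds +x to its class column and -x to the
    # correct-class column of the gradient.
    mask = (margins > 0).astype(X.dtype)              # (N, C)
    mask[np.arange(N), y] = -mask.sum(axis=1)
    dW = X.T @ mask / N + 2 * reg * W
    return loss, dW
```

With zero weights every incorrect class has margin exactly 1, so the loss is `C - 1`, which makes a handy sanity check.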
- Softmax:
It is not easy at first to see what p and L are taken with respect to. In fact, differentiating L with respect to score s_i gives p_i (minus 1 for the correct class), so the gradient is essentially the same as p.
See more on:
https://www.youtube.com/watch?v=mlaLLQofmR8
Remember: in this assignment we do not implement a full SVM. We only use the linear SVM (hinge) loss to train our linear classifier; it is a loss function, not a prediction function.
Also, here softmax is a prediction function, and we can also treat it as a non-linear function/activation like ReLU. We then use the cross-entropy function to compute the loss on the softmax outputs.
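The gradient fact above (dL/ds_i = p_i − 1{i = y}) can be sketched directly in NumPy. The function name is mine; the max-shift is the usual numerical-stability trick:

```python
import numpy as np

def softmax_cross_entropy(scores, y):
    """Cross-entropy loss on softmax probabilities and its gradient
    with respect to the raw scores: dL/ds_i = p_i - 1{i == y}.

    scores: (N, C) class scores, y: (N,) integer labels.
    """
    N = scores.shape[0]
    shifted = scores - scores.max(axis=1, keepdims=True)  # stability: exp won't overflow
    exp = np.exp(shifted)
    p = exp / exp.sum(axis=1, keepdims=True)              # (N, C) probabilities
    loss = -np.log(p[np.arange(N), y]).mean()
    dscores = p.copy()
    dscores[np.arange(N), y] -= 1.0                       # p_i minus the indicator
    dscores /= N
    return loss, dscores
```

With all-zero scores the probabilities are uniform, so the loss is log(C) exactly.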
- Two-Layer Neural Network:
Why don't we apply softmax at the end when predicting?
Because in softmax every item shares the same denominator, and the exponential function is monotonic, so we can simply take the class with the maximum raw score.
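A two-line check of that argument (the example scores are arbitrary): since exp is monotonic and the denominator is shared, softmax preserves the ordering of the scores, so the argmax is unchanged.

```python
import numpy as np

scores = np.array([2.0, -1.0, 0.5])
p = np.exp(scores) / np.exp(scores).sum()  # softmax probabilities

# Same ordering before and after softmax, so the same argmax.
assert np.argmax(scores) == np.argmax(p)   # both are index 0
```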
Derive the gradient of each weight and bias:
I don't expect everyone to fully work it out, but it may build some intuition.
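For intuition, here is a sketch of those gradients for an affine – ReLU – affine – softmax network (the architecture used in the assignment); I assume no regularization, and the function name and variable names are my own:

```python
import numpy as np

def two_layer_net_grads(X, y, W1, b1, W2, b2):
    """Forward and backward pass for affine - ReLU - affine - softmax.
    The backward pass is just the chain rule applied layer by layer.
    """
    N = X.shape[0]
    # Forward pass
    h = np.maximum(0, X @ W1 + b1)                    # hidden ReLU activations (N, H)
    scores = h @ W2 + b2                              # class scores (N, C)
    shifted = scores - scores.max(axis=1, keepdims=True)
    p = np.exp(shifted)
    p /= p.sum(axis=1, keepdims=True)
    loss = -np.log(p[np.arange(N), y]).mean()

    # Backward pass: start from dL/dscores = (p - indicator) / N
    dscores = p.copy()
    dscores[np.arange(N), y] -= 1.0
    dscores /= N
    dW2 = h.T @ dscores                               # gradient for second layer
    db2 = dscores.sum(axis=0)
    dh = dscores @ W2.T                               # flow back through second affine
    dh[h <= 0] = 0                                    # ReLU gate blocks dead units
    dW1 = X.T @ dh                                    # gradient for first layer
    db1 = dh.sum(axis=0)
    return loss, dict(W1=dW1, b1=db1, W2=dW2, b2=db2)
```

With all parameters zero, the hidden layer and scores are zero, the probabilities are uniform, and the loss is log(C); that is a quick way to check the forward pass before trusting the gradients.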