Visualizing and Understanding
What is going on inside a CNN?
- First layer: visualize the filter weights directly as images
- a filter's response is maximized when the input patch resembles its weights, so the weights show what the filter looks for
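This template-matching intuition can be sketched in plain numpy (toy shapes, no real CNN): among patches of fixed norm, the patch proportional to the weights gives the largest response.

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical first-layer filter: 3x3 spatial extent, 3 color channels.
w = rng.standard_normal((3, 3, 3))

# A linear filter's response is the inner product with the input patch.
def response(patch):
    return float(np.sum(patch * w))

# Among unit-norm patches, the one aligned with w itself scores highest
# (Cauchy-Schwarz); this is why first-layer weights look like the
# patterns they detect.
matching = w / np.linalg.norm(w)
random_patch = rng.standard_normal((3, 3, 3))
random_patch /= np.linalg.norm(random_patch)
```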
- Higher-layer filters:
- visualizing their raw weights is much less interpretable
- Last layer: nearest neighbors in feature space
- images that are close in feature space tend to be semantically similar
- the training loss never explicitly constrains the geometry of the feature space, yet this semantic structure emerges anyway
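The nearest-neighbor lookup is just an L2 argmin over stored feature vectors; a minimal sketch with random stand-in features (sizes are hypothetical, e.g. fc7-style 4096-d vectors):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical last-layer features for 10 images.
feats = rng.standard_normal((10, 4096))

# A query whose features are a slightly perturbed copy of image 0,
# standing in for a semantically near-duplicate image.
query = feats[0] + 0.01 * rng.standard_normal(4096)

# Nearest neighbor by L2 distance in feature space, not pixel space.
dists = np.linalg.norm(feats - query, axis=1)
nearest = int(np.argmin(dists))
```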
Occlusion experiments (masking)
- slide a mask over the image and see which occluded region changes the output probability the most
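A minimal occlusion sketch: slide a zeroing mask over the image and record the drop in score. The "classifier" here is a toy function that only depends on one region, standing in for a real CNN.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))

# Toy "classifier": score depends only on the top-left 4x4 region.
def score(x):
    return float(np.sum(x[:4, :4] ** 2))

base = score(img)
heat = np.zeros((2, 2))  # coarse occlusion heatmap: 4x4 mask, stride 4
for i in range(2):
    for j in range(2):
        occluded = img.copy()
        occluded[4*i:4*i+4, 4*j:4*j+4] = 0.0  # mask out one region
        heat[i, j] = base - score(occluded)   # drop in score = importance

# Only the region the toy score depends on shows a drop.
```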
Saliency Maps
which pixels matter for classification
output (class) scores
- compute the gradient of the (unnormalized) class score with respect to the image pixels, take the absolute value, and max over the RGB channels
- segmentation without supervision
- K. Simonyan, A. Vedaldi, and A. Zisserman. Deep inside convolutional networks: Visualising image classification models and saliency maps. International Conference on Learning Representations Workshop, 2014.
- run GrabCut on the saliency map
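The saliency recipe above can be sketched with a linear toy score so the pixel gradient is exact; in a real network you would backprop the unnormalized class score through the whole model instead.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear "class score" s(x) = <w, x> over a small 3-channel image.
H, W, C = 4, 4, 3
w = rng.standard_normal((H, W, C))
img = rng.standard_normal((H, W, C))

# For a linear score, the gradient w.r.t. the pixels is just w.
grad = w

# Saliency map: absolute value, then max over channels -> one value per pixel.
saliency = np.abs(grad).max(axis=2)
```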
intermediate features
- compute the gradient of a chosen neuron's value with respect to the image pixels
- images come out nicer if you only backprop positive gradients through each ReLU (guided backprop)
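The guided-backprop rule is a one-line change to the ReLU backward pass: pass a gradient only where the forward input was positive (ordinary ReLU) and the incoming gradient is also positive.

```python
import numpy as np

def relu_backward_guided(upstream_grad, forward_input):
    """Guided-backprop ReLU backward rule: keep a gradient only where the
    forward input was positive AND the upstream gradient is positive."""
    return upstream_grad * (forward_input > 0) * (upstream_grad > 0)

x = np.array([-1.0, 2.0, 3.0, 0.5])   # forward inputs to the ReLU
g = np.array([0.7, -0.2, 0.4, 0.9])   # upstream gradients
out = relu_backward_guided(g, x)
# index 0 is zeroed (negative input), index 1 is zeroed (negative gradient)
```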
Gradient ascent
- weights are fixed; update the image pixels by gradient ascent to maximize a neuron or class score
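A minimal gradient-ascent sketch with a toy concave score so convergence is easy to see; a real setup would instead maximize a class score (typically with an L2 regularizer on the image).

```python
import numpy as np

# Toy concave score s(x) = -||x - t||^2 with a hypothetical "target" t.
t = np.full((4, 4), 0.5)
x = np.zeros((4, 4))      # start from a zero (or noise) image
lr = 0.1
for _ in range(100):
    grad = -2.0 * (x - t)  # ds/dx: gradient of the score w.r.t. pixels
    x += lr * grad         # ascent step on the pixels, weights untouched

# x converges toward t, the image that maximizes the toy score
```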
Conclusion
Todos:
- read the GrabCut paper