Motivation
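In multiple dimensions, a maxout unit can approximate arbitrary convex functions: it outputs the max over k affine functions of its input, which is a piecewise linear convex function.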
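A minimal NumPy sketch of the approximation claim, kept to one dimension for simplicity. The target f(x) = x**2, the interval [-1, 1], and the helper names `maxout` and `tangent_pieces` are illustrative assumptions, not taken from the paper. The pieces are set by hand to tangent lines rather than learned.

```python
import numpy as np

def maxout(x, W, b):
    """Maxout unit: max over k affine pieces. x: (n, d), W: (d, k), b: (k,)."""
    return np.max(x @ W + b, axis=1)

def tangent_pieces(k, lo=-1.0, hi=1.0):
    """Hypothetical helper: slopes and intercepts of k tangent lines to x**2."""
    a = np.linspace(lo, hi, k)   # tangent points
    W = (2.0 * a)[None, :]       # slopes 2a, shape (1, k)
    b = -a ** 2                  # intercepts -a**2, shape (k,)
    return W, b

x = np.linspace(-1.0, 1.0, 201)[:, None]
target = x[:, 0] ** 2

# Worst-case error on the grid for an increasing number of pieces.
errors = {}
for k in (2, 4, 8, 16):
    W, b = tangent_pieces(k)
    errors[k] = float(np.max(np.abs(maxout(x, W, b) - target)))
    print(k, errors[k])
```

The worst-case error shrinks as k grows, which is the sense in which a single unit with enough pieces approximates a convex function.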
Contributions
- Maxout can be viewed as cross-channel pooling: each unit outputs the max over a group of k linear feature channels.
- Maxout enhances dropout’s abilities as a model averaging technique.
- Dropout is generally viewed as an indiscriminately applicable tool that reliably yields a modest improvement in performance when applied to almost any model.
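The cross-channel pooling view of maxout can be sketched as follows. This is a sketch under assumptions: the (batch, channels, height, width) layout, the function name `maxout_pool`, and the grouping of consecutive channels are choices made here for illustration, not details confirmed by the notes.

```python
import numpy as np

def maxout_pool(z, k):
    """Max over groups of k consecutive channels: (n, m*k, h, w) -> (n, m, h, w)."""
    n, c, h, w = z.shape
    if c % k != 0:
        raise ValueError("channel count must be divisible by k")
    # Split the channel axis into (m, k) and take the max over the k pieces.
    return z.reshape(n, c // k, k, h, w).max(axis=2)

rng = np.random.default_rng(0)
z = rng.standard_normal((2, 6, 4, 4))  # 6 linear (pre-activation) feature maps
out = maxout_pool(z, k=3)              # pooled down to 2 maxout feature maps
print(out.shape)
```

No nonlinearity is applied to `z` before pooling: the max over channels is itself the activation function.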
Experiments
Maxout networks trained with dropout perform better than comparable rectifier networks.
(Rectifier units do best without cross-channel pooling but with the same number of filters, meaning that the size of the state and the number of parameters must be about k times higher for rectifiers to obtain generalization performance approaching that of maxout.)
The activations of maxout units are not sparse.
Dropout’s approximate model averaging is more accurate with maxout units.
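To make "model averaging" concrete, here is a small sketch of the one case where the dropout shortcut is exact: a single softmax layer with dropout on its inputs. There, the renormalized geometric mean over all dropout masks equals one forward pass with the weights halved. The dimensions, random weights, and variable names are illustrative assumptions. In deep networks the halved-weights pass is only an approximation, and the notes' point is that this approximation holds up better for maxout.

```python
import itertools
import numpy as np

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

rng = np.random.default_rng(0)
d, n_classes = 4, 3
x = rng.standard_normal(d)
W = rng.standard_normal((d, n_classes))
b = rng.standard_normal(n_classes)

# Geometric mean of the predictions of all 2**d sub-models (one per input mask).
masks = np.array(list(itertools.product([0.0, 1.0], repeat=d)))
log_p = np.array([np.log(softmax((m * x) @ W + b)) for m in masks])
geo_mean = np.exp(log_p.mean(axis=0))
geo_mean /= geo_mean.sum()  # renormalize to a distribution

# Single forward pass with the weights divided by 2.
halved = softmax(x @ (W / 2.0) + b)
print(np.allclose(geo_mean, halved))
```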