Short explanation
rho is the "Gradient moving average [also exponentially weighted average] decay factor" and decay is the "Learning rate decay over each update".
Long explanation
RMSProp is defined as follows
So RMSProp uses "rho" to calculate an exponentially weighted average over the square of the gradients.
Note that "rho" is a direct parameter of the RMSProp optimizer (it is used in the RMSProp formula).
Decay on the other hand handles learning rate decay. Learning rate decay is a mechanism generally applied independently of the chosen optimizer. Keras simply builds this mechanism into the RMSProp optimizer for convenience (as does it with other optimizers like SGD and Adam which all have the same "decay" parameter). You may think of the "decay" parameter as "lr_decay".
It can be confusing at first that there are two decay parameters, but they are decaying different values.
- "rho" is the decay factor or the exponentially weighted average over the square of the gradients.
- "decay" decays the learning rate over time, so we can move even closer to the local minimum in the end of training.