LEARNING TO CONTROL VISUAL ABSTRACTIONS FOR STRUCTURED EXPLORATION IN DEEP REINFORCEMENT LEARNING
ABSTRACT
Exploration in environments with sparse rewards is a key challenge for reinforcement learning. How do we design agents with generic inductive biases so that they explore in a consistent manner, rather than relying on local exploration schemes such as epsilon-greedy? We propose an unsupervised reinforcement learning agent that learns a discrete pixel-grouping model preserving the spatial geometry of the sensors, and implicitly of the environment as well. We use this representation to derive geometric intrinsic reward functions,