State representation learning using a graph neural network in a 2D grid world

Reinforcement learning algorithms have shown great success in solving complicated robotics tasks. These tasks often involve multiple sensors which generate a high-dimensional sensory input, i.e., an observation. Learning the optimal policy directly from such high-dimensional observations often requires processing large amounts of data. State representation learning aims to map the high-dimensional observation to a lower-dimensional state space, in order to reduce training time and the amount of data required.

In this work, the goal is to use a graph neural network effectively for state representation learning. The aim is to learn useful state representations in a self-supervised setup. The focus is on a navigation task in a deterministic 2D grid world environment, which contains distinct objects that can be encoded as a graph.
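As a rough illustration of how such an environment can be turned into a graph, the sketch below builds node features and a fully connected edge list from object positions. The object names, grid size, and feature layout are illustrative assumptions, not details taken from this work.

```python
# Hypothetical sketch: encoding a grid-world observation as a graph.
import numpy as np

def observation_to_graph(object_positions, grid_size=5):
    """Build node features and a fully connected edge list from object positions.

    object_positions: dict mapping an object id to its (row, col) cell.
    Returns (node_features, edge_index) as numpy arrays.
    """
    object_ids = sorted(object_positions)
    num_objects = len(object_ids)

    # One node per object: normalised (row, col) coordinates as features.
    node_features = np.zeros((num_objects, 2), dtype=np.float32)
    for i, obj in enumerate(object_ids):
        row, col = object_positions[obj]
        node_features[i] = (row / (grid_size - 1), col / (grid_size - 1))

    # Fully connected directed edges (no self-loops), the usual input format
    # for message-passing graph neural networks.
    edge_index = np.array(
        [(i, j) for i in range(num_objects) for j in range(num_objects) if i != j],
        dtype=np.int64,
    ).T

    return node_features, edge_index

# Example: an agent, a goal, and an obstacle on a 5x5 grid.
obs = {"agent": (0, 0), "goal": (4, 4), "obstacle": (2, 3)}
nodes, edges = observation_to_graph(obs)
print(nodes.shape, edges.shape)  # (3, 2) (2, 6)
```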

A state representation is learned by simultaneously training an encoder network and a transition model in latent space. The transition model is implemented as a graph neural network. In addition, a reward model is trained. Using the learned state representation together with the trained transition and reward models, a policy can then be learned entirely in latent space. The approach is compared against a conventional neural network.
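The sketch below shows one way such joint training could look in PyTorch: an encoder maps observations to latent states, a transition model predicts the next latent state, and a reward model is trained alongside. The network sizes, the plain MLP standing in for the graph neural network, and the one-step squared-error loss are assumptions for illustration only, not the method used in this work.

```python
# Minimal sketch of jointly training an encoder, a latent transition model,
# and a reward model on (o_t, a_t, o_{t+1}, r_t) transitions.
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM, LATENT_DIM = 50, 4, 8  # illustrative sizes

encoder = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.ReLU(), nn.Linear(64, LATENT_DIM))
# Stand-in for the graph neural network transition model: predicts the next
# latent state from the current latent state and a one-hot action.
transition = nn.Sequential(nn.Linear(LATENT_DIM + ACT_DIM, 64), nn.ReLU(),
                           nn.Linear(64, LATENT_DIM))
# Reward model trained alongside, predicting the scalar reward in latent space.
reward_model = nn.Sequential(nn.Linear(LATENT_DIM + ACT_DIM, 64), nn.ReLU(),
                             nn.Linear(64, 1))

params = (list(encoder.parameters()) + list(transition.parameters())
          + list(reward_model.parameters()))
optimizer = torch.optim.Adam(params, lr=1e-3)

def training_step(obs, action, next_obs, reward):
    """One gradient step on a batch of transitions."""
    z_t = encoder(obs)
    z_next_target = encoder(next_obs)

    z_next_pred = transition(torch.cat([z_t, action], dim=-1))
    r_pred = reward_model(torch.cat([z_t, action], dim=-1)).squeeze(-1)

    # One-step latent prediction loss plus reward prediction loss.
    loss = (((z_next_pred - z_next_target) ** 2).sum(-1).mean()
            + ((r_pred - reward) ** 2).mean())

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Example call with random tensors standing in for environment data.
batch = 32
loss = training_step(torch.randn(batch, OBS_DIM),
                     torch.eye(ACT_DIM)[torch.randint(ACT_DIM, (batch,))],
                     torch.randn(batch, OBS_DIM),
                     torch.randn(batch))
print(loss)
```

Note that a plain prediction loss like this can collapse to a trivial representation; self-supervised setups therefore typically add a contrastive term with negative samples, which is omitted here for brevity.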
