Autonomous exploration and mapping of unseen environments is a challenging problem for mobile robots. It involves several non-trivial tasks: observing the environment through sensory information, localizing within it, abstracting raw sensor data into useful representations, and leveraging those representations to navigate and map the environment along collision-free paths.
Several methods in the literature address these problems; however, most are not suitable for real-life applications where the environment is complex and dynamic. They either require hand-crafted logic about obstacles and a priori information about the environment, or rely on human assistance to achieve their goals.
A relatively new approach in the field of mobile robots is reinforcement learning (RL). RL is a paradigm in which an agent learns from its environment by taking actions and receiving a predefined reward that reflects how good each action was. The goal of RL is to maximize the expected cumulative reward. For this to work, the environment must be modelled as a Markov Decision Process (MDP). If such a model is available, the MDP can be solved, and the environment navigated successfully, by finding an optimal behaviour policy through iterative solution methods. In addition to RL, Simultaneous Localization and Mapping (SLAM) techniques will be used to construct a map of the environment and help the robot localize within it.
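For concreteness, the RL objective can be written in standard notation (this formalization is added here for illustration and uses conventional symbols rather than ones defined in this project): given states $s_t$, actions $a_t$, a reward function $r(s_t, a_t)$ and a discount factor $\gamma \in [0, 1)$, the agent seeks a policy $\pi^*$ that maximizes the expected discounted return

\[
J(\pi) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right],
\qquad
\pi^{*} = \arg\max_{\pi} J(\pi).
\]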
The main goal of this project is to construct a map of, and navigate through, an unseen environment using a combined RL and SLAM approach, under the constraint of utilizing no a priori knowledge about the environment. To achieve this, the Deep Deterministic Policy Gradient (DDPG) algorithm will be used in a realistic simulation environment.
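To illustrate the kind of update DDPG performs, a minimal actor-critic sketch is given below. This is a generic sketch assuming PyTorch; the network architectures, names (Actor, Critic, ddpg_update) and hyperparameters are illustrative assumptions, not this project's actual implementation.

```python
# Minimal DDPG update sketch (illustrative; assumes PyTorch).
# Network shapes, names and hyperparameters are hypothetical,
# not the project's actual implementation.
import torch
import torch.nn as nn


class Actor(nn.Module):
    """Deterministic policy mu(s) -> a, with actions squashed to [-1, 1]."""
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 256), nn.ReLU(),
            nn.Linear(256, action_dim), nn.Tanh(),
        )

    def forward(self, state):
        return self.net(state)


class Critic(nn.Module):
    """Action-value function Q(s, a)."""
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))


def ddpg_update(actor, critic, target_actor, target_critic,
                actor_opt, critic_opt, batch, gamma=0.99, tau=0.005):
    """One DDPG step on a batch of (s, a, r, s', done) transitions."""
    s, a, r, s2, done = batch

    # Critic: regress Q(s, a) toward the bootstrapped target
    # r + gamma * Q'(s', mu'(s')) computed with the target networks.
    with torch.no_grad():
        q_target = r + gamma * (1 - done) * target_critic(s2, target_actor(s2))
    critic_loss = nn.functional.mse_loss(critic(s, a), q_target)
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()

    # Actor: deterministic policy gradient -- ascend Q(s, mu(s)).
    actor_loss = -critic(s, actor(s)).mean()
    actor_opt.zero_grad()
    actor_loss.backward()
    actor_opt.step()

    # Polyak-average the target networks toward the online ones.
    for net, target in ((actor, target_actor), (critic, target_critic)):
        for p, tp in zip(net.parameters(), target.parameters()):
            tp.data.mul_(1 - tau).add_(tau * p.data)
```

In practice, the transitions in `batch` would be sampled from a replay buffer, and exploration noise would be added to the actor's output during data collection; both are omitted here for brevity.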