Safe Robot Learning

MSc assignment

To facilitate the transition of robots from closed industrial cages to highly unpredictable environments, such as domestic or natural settings, we must equip them with the capability to execute tasks optimally while ensuring safety. Reinforcement Learning (RL) [1] can be used to compute optimal strategies for accomplishing difficult tasks in complex scenarios. However, a significant drawback of RL is its lack of safety guarantees [2]. In Constrained Reinforcement Learning (C-RL) [3,4,5], the agent must learn a task while satisfying constraints on the expectations of auxiliary costs. This scheme has recently gained significant interest because it makes it possible to encode safety constraints.
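
For reference, the C-RL problem is commonly written as a constrained Markov decision process. The notation below is assumed for illustration and is not fixed by the assignment: reward r, auxiliary costs c_i, thresholds d_i, discount factor \gamma, and number of constraints m.

```latex
\begin{aligned}
\max_{\pi} \quad & J_r(\pi) = \mathbb{E}_{\tau \sim \pi}\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t) \right] \\
\text{s.t.} \quad & J_{c_i}(\pi) = \mathbb{E}_{\tau \sim \pi}\left[ \sum_{t=0}^{\infty} \gamma^{t}\, c_i(s_t, a_t) \right] \le d_i, \qquad i = 1, \dots, m
\end{aligned}
```

Standard RL corresponds to the case m = 0, in which only the expected return J_r is optimized.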

The assignment focuses on a C-RL scheme with energy limitations, since energy flows contain information regarding metabolic spending, performance, and safety. The objective is to investigate the impact of this formulation in comparison to standard RL schemes within a safety context. The work includes studying RL fundamentals, reviewing literature on C-RL, implementing a C-RL algorithm, and conducting theoretical and practical explorations through simulation and/or on a real robot.
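
One possible way to handle an energy limitation is a Lagrangian relaxation, in which a multiplier penalizes energy use beyond a budget. The sketch below is a minimal illustration under stated assumptions, not the method prescribed by the assignment: the energy measure (absolute mechanical power of a single joint integrated over time), the function names, the budget, and the learning rate are all hypothetical.

```python
from typing import Sequence


def mechanical_energy(torques: Sequence[float],
                      velocities: Sequence[float],
                      dt: float) -> float:
    """Approximate the energy spent in one episode for a single joint.

    Assumption: energy is the sum of |torque * angular velocity| * dt
    over the time steps of the episode.
    """
    return sum(abs(t * w) for t, w in zip(torques, velocities)) * dt


def update_multiplier(lam: float, avg_energy: float,
                      budget: float, lr: float) -> float:
    """Projected dual-ascent step on the Lagrange multiplier.

    The multiplier grows when the average energy exceeds the budget,
    shrinks otherwise, and is clipped at zero.
    """
    return max(0.0, lam + lr * (avg_energy - budget))


def penalized_return(task_return: float, energy: float, lam: float) -> float:
    """Return that the policy update would maximize for a fixed multiplier."""
    return task_return - lam * energy


# Hypothetical two-step episode with a time step of 0.5 s.
energy = mechanical_energy([1.0, -2.0], [2.0, 3.0], dt=0.5)
lam = update_multiplier(0.0, energy, budget=3.0, lr=0.5)
print(energy, lam, penalized_return(10.0, energy, lam))
```

In a full algorithm the multiplier update would alternate with policy updates on the penalized return; comparing the learned behavior for lam = 0 (standard RL) against the constrained case is one way to carry out the comparison described above.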

[1] Naeem, Muddasar, Syed Tahir Hussain Rizvi, and Antonio Coronato. "A gentle introduction to reinforcement learning and its application in different fields." IEEE Access 8 (2020): 209320-209344.
[2] Dulac-Arnold, Gabriel, et al. "Challenges of real-world reinforcement learning: definitions, benchmarks and analysis." Machine Learning 110.9 (2021): 2419-2468.
[3] Liu, Yongshuai, Avishai Halev, and Xin Liu. "Policy learning with constraints in model-free reinforcement learning: A survey." The 30th International Joint Conference on Artificial Intelligence (IJCAI). 2021.
[4] Yu, Dongjie, et al. "Reachability constrained reinforcement learning." International Conference on Machine Learning. PMLR, 2022.
[5] Wachi, Akifumi, and Yanan Sui. "Safe reinforcement learning in constrained Markov decision processes." International Conference on Machine Learning. PMLR, 2020.