Generalization capabilities of a learning from demonstration framework capturing a human controller

Finished: 2023-04-06

MSc assignment

In 2013, the WHO estimated the worldwide shortage of healthcare workers at about 17.4 million. The projected shortage for 2030 is still more than 14 million. This indicates that a shortage of healthcare staff already existed before the corona pandemic; the pandemic only emphasized it. A robotic system could decrease the needs-based shortage of healthcare workers and alleviate the workload of ICU personnel by taking over tasks such as aiding with dressing, putting patients to bed, rehabilitation and/or routine checks. Humanoid robots with these objectives have already been designed. However, these robots are currently limited in their operation, because they perform best in a stable and predictable environment, and the healthcare sector is not a stable and predictable environment. It is therefore important for the robot to be able to quickly adapt and learn in such new, unpredictable environments.

A possible solution may be learning from demonstration (LFD). In LFD, the robot should be able to learn without the use of a programming language: it learns from an operator who demonstrates how to perform a movement, which may include interaction with objects in the environment. The learned movement and interaction can be combined with, for example, object detection, such that the system can act fully autonomously. The motion demonstration could be done by actively moving the joints of the robot, with a motion capture system (or something similar), or via teleoperation; a combination of these demonstration methods is also possible. The main advantage of LFD is that an unskilled person can teach the robot something new. In other words, the robot can learn without the aid of an engineer, which would allow for an easier and quicker learning method compared to traditional methods.
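Purely as an illustration of what such demonstration data could look like, the sketch below defines a hypothetical per-timestep record; the field names (joint_positions, ee_force, operator_command, and so on) are assumptions for illustration and not the actual EVE or i-BOTICS interface.

```python
from dataclasses import dataclass
from typing import List
import numpy as np

@dataclass
class DemoStep:
    """One timestep of a hypothetical teleoperated demonstration."""
    joint_positions: np.ndarray   # measured robot joint angles [rad]
    ee_pose: np.ndarray           # end-effector position and orientation
    ee_force: np.ndarray          # measured interaction force/torque
    operator_command: np.ndarray  # commanded action from the haptic device

# A demonstration is simply the recorded sequence of timesteps.
Demonstration = List[DemoStep]
```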

The goal of this master thesis is to investigate the influence of force information on the quality of learning when using learning from demonstration. The main research question will be along the lines of: what is the influence of force information on the quality of learning from demonstration with teleoperation as the demonstration framework? The demonstration interface is the existing i-BOTICS teleoperation framework from the Xprize. This framework controls the EVE bilaterally (meaning there is two-way communication between the EVE and the operator) via a haptic device. The learning framework on the robot still has to be developed for the EVE. It will be based on a state-of-the-art behaviour cloning algorithm, meaning that no reinforcement learning is involved. Ideally, an existing framework from the literature will be used and adapted to the situation of the EVE. However, if this is not possible, a learning framework has to be designed for the EVE and evaluated.
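As a rough sketch of what a behaviour cloning step could look like (not the actual EVE framework; the network size, feature layout, and hyperparameters are placeholders), a supervised regressor is fitted to map observed states, optionally including force measurements, to the demonstrated actions:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def train_behaviour_cloning(states, actions, epochs=50, lr=1e-3):
    """Fit a policy that maps demonstrated states to actions by regression.

    states:  (N, state_dim) tensor; may or may not include force channels.
    actions: (N, action_dim) tensor of the operator's demonstrated commands.
    """
    policy = nn.Sequential(
        nn.Linear(states.shape[1], 128), nn.ReLU(),
        nn.Linear(128, 128), nn.ReLU(),
        nn.Linear(128, actions.shape[1]),
    )
    optimizer = torch.optim.Adam(policy.parameters(), lr=lr)
    loader = DataLoader(TensorDataset(states, actions), batch_size=64, shuffle=True)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for s, a in loader:
            optimizer.zero_grad()
            loss = loss_fn(policy(s), a)  # imitate the demonstrated action
            loss.backward()
            optimizer.step()
    return policy
```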

The experiment to answer the research question will consist of two tasks. These tasks still need to be designed; however, the following requirements should be met. In both experiments there are two situations: one in which the force information is fed into the learning process and one in which it is not. In the first experiment, the addition of force information should, based on rational reasoning, have a positive effect on the learning process. In the second experiment, the force information does not necessarily have an effect on the learning process. In this way, the influence of force can be evaluated for both a trivial and a less trivial case, because it is expected that more information should result in better learning. The experiments will be evaluated on generalizability and correctness, and based on this evaluation a conclusion on the addition of force information will be drawn. It is important to mention that there is no vision framework available, meaning that if positions are needed, these will be hardcoded. This would still allow the robot to have an idea of where something is located.
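To make the intended comparison concrete, the sketch below (reusing the hypothetical train_behaviour_cloning function from the earlier sketch) trains the same policy twice, once with and once without force channels in the state, and scores both on held-out data; the mean squared action error used here is only a stand-in for the generalizability and correctness measures that still have to be defined.

```python
import torch

def compare_force_ablation(train_states, train_actions,
                           test_states, test_actions, force_idx):
    """Train with and without force channels and report held-out action error.

    force_idx: column indices of the force channels inside the state vector.
    """
    keep = [i for i in range(train_states.shape[1]) if i not in set(force_idx)]

    # Variant A: full state, including force information.
    policy_force = train_behaviour_cloning(train_states, train_actions)
    err_force = torch.mean((policy_force(test_states) - test_actions) ** 2).item()

    # Variant B: force channels removed from the state.
    policy_plain = train_behaviour_cloning(train_states[:, keep], train_actions)
    err_plain = torch.mean(
        (policy_plain(test_states[:, keep]) - test_actions) ** 2).item()

    return {"with_force": err_force, "without_force": err_plain}
```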