End-to-end robotic visual servoing using latent diffusion from AI methods

MSc assignment

Visual servoing is the set of methods that use real-time image feedback from cameras to guide a robot to a desired target view. Classical visual servoing approaches rely on feature extraction and therefore require the target to remain visible. Diffusion models, generative AI models that can produce trajectories consistent with those in the training dataset, can overcome this limitation by encoding the target location in the latent representation that compactly describes the state. However, the complexity of these models makes it computationally hard to generate low-level controls in real time in an end-to-end fashion.

This thesis investigates approaches to speed up the generation process of latent diffusion models, exploring both solutions from classical optimization theory, e.g. warm-start strategies, and recent machine learning advances, e.g. computationally efficient diffusion models such as Denoising Diffusion Implicit Models (DDIM). The ultimate goal is to perform end-to-end visual servoing on board UAVs. This thesis falls within the scope of the European AutoASSESS project, which aims to develop UAVs that can autonomously perform contact aerial inspection in the ballast water tanks of cargo ships.
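To give a flavour of the two speed-up ideas mentioned above, the following is a minimal sketch, not the project's method. It combines a deterministic DDIM update (eta = 0) over a reduced set of timesteps with a warm start that re-noises a previous solution to an intermediate timestep instead of sampling from pure noise. The linear noise schedule, the function names, the toy vectors, and in particular `toy_eps_model` (a hypothetical stand-in for a trained noise predictor) are all assumptions made for illustration.

```python
import math
import random

def make_alpha_bar(num_train_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    alpha_bar, prod = [], 1.0
    for t in range(num_train_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_train_steps - 1)
        prod *= 1.0 - beta
        alpha_bar.append(prod)
    return alpha_bar

def toy_eps_model(x, t, alpha_bar, target):
    """Hypothetical noise predictor: exact when all data collapse to `target`.
    A real system would use a trained network here."""
    a = alpha_bar[t]
    return [(xi - math.sqrt(a) * mi) / math.sqrt(1.0 - a)
            for xi, mi in zip(x, target)]

def subsample_timesteps(t_start, num_steps):
    """Evenly spaced, decreasing timesteps from t_start down to 0."""
    if num_steps == 1:
        return [t_start]
    return [round(t_start * (num_steps - 1 - k) / (num_steps - 1))
            for k in range(num_steps)]

def ddim_sample(x, timesteps, alpha_bar, eps_model):
    """Deterministic DDIM (eta = 0) over a decreasing list of timesteps."""
    for i, t in enumerate(timesteps):
        a_t = alpha_bar[t]
        # After the last timestep we land on the clean sample (alpha_bar = 1).
        a_prev = alpha_bar[timesteps[i + 1]] if i + 1 < len(timesteps) else 1.0
        eps = eps_model(x, t)
        # Predicted clean sample from the current noisy sample.
        x0 = [(xi - math.sqrt(1.0 - a_t) * ei) / math.sqrt(a_t)
              for xi, ei in zip(x, eps)]
        # Jump directly to the previous (less noisy) timestep.
        x = [math.sqrt(a_prev) * x0i + math.sqrt(1.0 - a_prev) * ei
             for x0i, ei in zip(x0, eps)]
    return x

def warm_start(prev_solution, t_start, alpha_bar, rng):
    """Re-noise a previous solution to timestep t_start (forward process)."""
    a = alpha_bar[t_start]
    return [math.sqrt(a) * p + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
            for p in prev_solution]

rng = random.Random(0)
alpha_bar = make_alpha_bar()
target = [0.3, -0.1, 0.5]  # hypothetical "ideal" control vector
eps_model = lambda x, t: toy_eps_model(x, t, alpha_bar, target)

# Cold start: pure noise, 10 DDIM steps instead of 1000 training steps.
x_cold = [rng.gauss(0.0, 1.0) for _ in target]
cold = ddim_sample(x_cold, subsample_timesteps(999, 10), alpha_bar, eps_model)

# Warm start: previous control re-noised to t = 200, then only 3 steps.
prev_control = [0.25, -0.05, 0.45]  # hypothetical previous solution
x_warm = warm_start(prev_control, 200, alpha_bar, rng)
warm = ddim_sample(x_warm, subsample_timesteps(200, 3), alpha_bar, eps_model)
```

Because the toy predictor is exact, both runs recover `target` regardless of the number of steps. With a learned model the step count and the warm-start timestep would trade sample quality against latency, which is the trade-off the thesis is meant to explore.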

Expected outcomes:

  • A computationally efficient visual servoing approach using latent diffusion
  • Simulation-based results showing the efficacy of the proposed approach
  • If feasible within the project timeframe, deployment and testing of the developed framework in real-world experiments

Prerequisites:
The student working on the project should

  • have good programming skills in Python and PyTorch,
  • be familiar with ROS, and
  • have a solid grasp of the fundamentals of machine learning.
  • Knowledge of control for UAVs is encouraged but not required.

If you are interested and would like to apply for this thesis subject, please contact Barbara Bazzana and Antonio Franchi to arrange an interview. To have your application considered, please include your CV, transcript of exams, and a motivation letter in the email.