Background
Visual servoing enables robots to interact with dynamic environments using real-time visual feedback. However, deploying such systems in practice is often constrained by computational load, system latency, and a lack of modularity. Recent embedded AI hardware (e.g., NVIDIA Jetson) makes it feasible to run deep-learning-based perception on low-power devices, rendering robotic systems more scalable, mobile, and robust.
Precision tasks such as peg-in-hole insertion and building-block assembly are foundational in robotic manufacturing, assembly lines, and educational robotics. They challenge a robot to perform fine manipulation under uncertainty, demanding real-time perception, precise control, and robust error handling. Traditional control approaches rely on pre-calibrated setups and struggle with even small variations in object position or shape.
With embedded neural inference hardware, real-time visual servoing can now run directly on the robot, enabling smarter, more adaptable systems capable of executing tight-tolerance insertion and stacking tasks in dynamic environments.
Objective
To develop a modular embedded AI-based visual servoing system for performing peg-in-hole and building block assembly tasks using a robotic manipulator. The system should combine vision-based perception, deep learning inference, and closed-loop control, running entirely on embedded hardware.
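At its core, the proposed system is a perception-in-the-loop controller. Below is a minimal sketch of that closed loop, assuming hypothetical `camera`, `arm`, and `estimate_target_pose` interfaces (the actual modules are developed in the tasks below) and a simple proportional correction in place of a full visual-servoing control law.

```python
# Minimal closed-loop visual servoing sketch (illustrative only).
# `camera`, `arm`, and `estimate_target_pose` are hypothetical stand-ins
# for the perception and control modules this project would build.
import time
import numpy as np

GAIN = 0.5     # proportional gain on the Cartesian error
TOL_M = 1e-3   # stop within 1 mm of the target
DT_S = 0.05    # 20 Hz control cycle

def servo_to_target(arm, camera, estimate_target_pose):
    """Drive the end effector toward a visually estimated target."""
    while True:
        frame = camera.read()                         # latest image
        target = estimate_target_pose(frame)          # (x, y, z) in robot frame
        error = target - arm.end_effector_position()  # Cartesian error vector
        if np.linalg.norm(error) < TOL_M:
            return                                    # converged within tolerance
        arm.command_velocity(GAIN * error)            # proportional velocity step
        time.sleep(DT_S)                              # wait for the next cycle
```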
Key Tasks
- Literature Review
  - Review visual servoing and learning-based insertion/assembly strategies.
  - Compare classical and learning-based peg-in-hole methods.
  - Analyze embedded AI platforms for visual perception and control.
- System Architecture Design
  - Design a modular system in which visual perception, control, and feedback are separable components (a minimal interface sketch follows this list).
  - Select appropriate embedded hardware (e.g., NVIDIA Jetson).
- Visual Perception & Pose Estimation
  - Train or adapt lightweight object detection and pose estimation models for locating the peg, hole, and block positions (see the PnP-based sketch after this list).
  - Deploy optimized models on the embedded hardware for low-latency inference.
- Visual Reasoning and Motor Control Coordination
  - Develop logic for translating visual inputs into discrete or continuous motor commands.
  - Implement error-aware reasoning to handle visual ambiguities such as partial occlusion and misalignment (see the error-aware sketch after this list).
  - Coordinate perception with trajectory updates for adaptive, incremental corrections during manipulation.
- Robotic Task Execution
  - Perform real-world experiments using a robotic arm and camera system for:
    - Peg-in-hole insertion at different tolerances and approach angles
    - Precision stacking of building blocks
  - Optionally integrate force feedback to improve insertion success.
- Benchmarking & Evaluation
  - Compare the embedded AI solution to baselines (e.g., open-loop control or vision-only pipelines without feedback).
  - Evaluate task success rate, insertion time, precision, and latency (an evaluation helper sketch follows this list).
- Documentation
  - Deliver a comprehensive thesis with implementation details, experimental results, and recommendations.
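To make the "separable components" requirement concrete, the sketch below shows one way the module boundaries could be expressed. The class and method names are assumptions for illustration, not a fixed API.

```python
# Illustrative module boundaries for the proposed architecture.
# Class and method names are assumptions, not a prescribed API.
from abc import ABC, abstractmethod
import numpy as np

class PerceptionModule(ABC):
    @abstractmethod
    def estimate_pose(self, frame: np.ndarray) -> np.ndarray:
        """Return the 6-DoF pose of the target (peg, hole, or block)."""

class ControlModule(ABC):
    @abstractmethod
    def step(self, pose_error: np.ndarray) -> None:
        """Turn a pose error into one incremental motion command."""

class FeedbackModule(ABC):
    @abstractmethod
    def in_tolerance(self, pose_error: np.ndarray) -> bool:
        """Decide whether insertion/stacking has converged."""
```

Separating the modules this way also supports the benchmarking task: baselines (e.g., open-loop control) become alternative implementations of the same interfaces.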
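For pose estimation, one lightweight route is to let a small detector predict a few 2D keypoints and recover the 6-DoF pose with OpenCV's `cv2.solvePnP`. The `detect_hole_keypoints` function and the rim geometry below are assumptions for illustration.

```python
# One possible pose-estimation path (sketch): a lightweight detector
# predicts 2D keypoints on the hole rim, and cv2.solvePnP recovers the
# 6-DoF pose. `detect_hole_keypoints` and the rim geometry are assumed.
import cv2
import numpy as np

# Known 3D rim keypoints in the hole's own frame (metres), assumed model.
HOLE_MODEL_PTS = np.array([
    [ 0.010,  0.000, 0.0],
    [ 0.000,  0.010, 0.0],
    [-0.010,  0.000, 0.0],
    [ 0.000, -0.010, 0.0],
], dtype=np.float64)

def hole_pose(frame, camera_matrix, dist_coeffs, detect_hole_keypoints):
    img_pts = detect_hole_keypoints(frame)   # (4, 2) pixel coordinates
    ok, rvec, tvec = cv2.solvePnP(
        HOLE_MODEL_PTS, np.asarray(img_pts, dtype=np.float64),
        camera_matrix, dist_coeffs)
    if not ok:
        return None                          # let the caller retry next frame
    return rvec, tvec                        # rotation/translation, camera frame
```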
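For the error-aware reasoning task, a minimal policy is to pause motion when perception confidence is low (e.g., under partial occlusion) and otherwise apply small, bounded corrections rather than one open-loop move. The confidence field and thresholds below are placeholders, not measured values.

```python
# Error-aware correction policy (illustrative sketch).
# `estimate.confidence` and `estimate.target_xyz` are assumed fields of
# a hypothetical perception result; the thresholds are placeholders.
import numpy as np

CONF_MIN = 0.6       # below this, treat the view as occluded/ambiguous
MAX_STEP_M = 0.002   # bound each correction to 2 mm

def next_command(estimate, current_xyz):
    if estimate.confidence < CONF_MIN:
        return None                                 # hold pose; wait for a clearer view
    error = estimate.target_xyz - current_xyz       # residual misalignment
    return np.clip(error, -MAX_STEP_M, MAX_STEP_M)  # small, bounded correction
```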
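For benchmarking, a small helper can aggregate per-trial logs into the proposed metrics. The trial record fields (`success`, `insert_time_s`, `latency_ms`) are an assumed logging format, not a prescribed one.

```python
# Aggregate per-trial logs into the evaluation metrics (sketch).
import statistics

def summarize(trials):
    """trials: list of dicts with 'success', 'insert_time_s', 'latency_ms'."""
    successes = [t for t in trials if t["success"]]
    return {
        "success_rate": len(successes) / len(trials),
        "mean_insertion_time_s": statistics.mean(
            t["insert_time_s"] for t in successes),
        "median_latency_ms": statistics.median(
            t["latency_ms"] for t in trials),
    }
```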
Expected Outcomes
- A working embedded AI visual servoing system for robotic insertion and block assembly tasks.
- Quantitative comparison of performance across task complexity, environmental variations, and insertion tolerances.
- A reusable framework for adaptive, embedded robotic manipulation in constrained tasks.
Requirements
- Strong interest in robotics, computer vision, and embedded AI.
- Experience with Python/C++ and deep learning, plus familiarity with frameworks such as ROS, PyTorch, or TensorFlow.
- Basic experience with robotic arm programming and control.