PocketReplica: One-Loop AR Object Capture for On-Device 3D Reconstruction

MSc assignment

Walk once around an object in AR and “steal its geometry” into your pocket.
SAM 3 has shown that a single image can hint at a plausible 3D shape—but only from one view, on a desktop GPU, and with plenty of hidden assumptions. This project asks a bolder question:
What if you put the camera in a human hand, walk around the object in AR, and let feature tracking + segmentation + geometry do the rest?

You hold up your phone or AR headset, pick an object on your desk—a toy figure, a coffee cup, a small sculpture. A tap tells the system: “this is the thing I care about.” From that moment on, every step you take around it becomes a new clue:

  • AR pose tracking says where you are,
  • feature matching says how images line up,
  • a SAM-style segmenter says which pixels belong to the object,
  • and a lightweight reconstruction engine slowly carves a clean 3D mesh out of the visual stream.

No calibration targets. No turntables. No studio lighting. Just you walking around the object once, and an AR system that understands both what to reconstruct and how your viewpoint moves.
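
On the capture side, this boils down to a small keyframe loop: every AR frame arrives with a tracked camera pose, but only frames that add a genuinely new viewpoint are worth segmenting and handing to the reconstruction backend. The sketch below is a minimal Python illustration of that loop, assuming NumPy-style pose and intrinsics matrices from the AR tracker; `CaptureSession`, the `segment_object` hook, and the 10-degree baseline threshold are placeholders for this sketch, not a prescribed design.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Keyframe:
    """One capture sample handed to the reconstruction backend."""
    pose_w2c: np.ndarray    # 4x4 world-to-camera transform from AR tracking
    intrinsics: np.ndarray  # 3x3 pinhole camera matrix for this frame
    image: np.ndarray       # H x W x 3 RGB frame
    mask: np.ndarray        # H x W boolean object mask from the segmenter


class CaptureSession:
    """Collects keyframes while the user walks one loop around the object.

    `segment_object` is a hypothetical hook wrapping a SAM-style segmenter
    prompted by the user's initial tap; any callable that maps an image to a
    boolean mask will do for this sketch.
    """

    def __init__(self, segment_object, min_baseline_deg: float = 10.0):
        self.segment_object = segment_object
        self.min_baseline_deg = min_baseline_deg
        self.keyframes: list[Keyframe] = []

    def _adds_new_viewpoint(self, pose_w2c: np.ndarray) -> bool:
        """Accept a frame only if the viewing direction has changed enough."""
        if not self.keyframes:
            return True
        # Row 2 of the world-to-camera rotation is the camera z-axis in world coordinates.
        d_new = pose_w2c[2, :3]
        d_old = self.keyframes[-1].pose_w2c[2, :3]
        angle = np.degrees(np.arccos(np.clip(np.dot(d_new, d_old), -1.0, 1.0)))
        return angle >= self.min_baseline_deg

    def on_frame(self, image: np.ndarray, pose_w2c: np.ndarray,
                 intrinsics: np.ndarray) -> None:
        """Called once per AR frame with the tracker's pose estimate."""
        if not self._adds_new_viewpoint(pose_w2c):
            return  # too close to the last keyframe, skip the expensive steps
        mask = self.segment_object(image)  # which pixels belong to the object
        self.keyframes.append(Keyframe(pose_w2c, intrinsics, image, mask))
```

Everything downstream only ever sees this list of posed, masked keyframes, which keeps the reconstruction backend swappable.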

The end result is a scale-aware, watertight 3D model you can spin, relight, or export for printing, teaching, product design, or interactive experiences. The scientific core is not “yet another demo app,” but a tight integration of:

  • Multi-view segmentation in the wild (making SAM-like masks consistent across many viewpoints)
  • Feature-based pose and structure reasoning (tracking the object as you move, not just the background)
  • On-device 3D reconstruction tailored for AR use (no cloud pipeline, no offline photogrammetry)
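
To make the "carving" concrete: one deliberately simple backend that needs nothing beyond the posed masks is silhouette-based space carving, where a voxel survives only if no view that sees it projects it onto background. The sketch below assumes the `Keyframe` fields from the capture sketch above plus a rough bounding cube around the tapped object (e.g. taken from the AR anchor), and leaves surface extraction and metric scale to later steps; it illustrates the geometry involved, not the method the thesis has to adopt.

```python
import numpy as np


def carve_visual_hull(keyframes, center, size, resolution: int = 64) -> np.ndarray:
    """Silhouette carving over posed object masks.

    Returns a resolution^3 boolean occupancy grid inside a cube of edge length
    `size` centred at `center` (world coordinates). A voxel is kept only if no
    keyframe that actually sees it labels its projection as background.
    """
    # Regular grid of voxel centres inside the bounding cube.
    axis = np.linspace(-size / 2.0, size / 2.0, resolution)
    xs, ys, zs = np.meshgrid(axis, axis, axis, indexing="ij")
    voxels = np.stack([xs, ys, zs], axis=-1).reshape(-1, 3) + np.asarray(center)
    occupied = np.ones(len(voxels), dtype=bool)

    for kf in keyframes:
        # Project all voxel centres into this keyframe with the pinhole model.
        homo = np.hstack([voxels, np.ones((len(voxels), 1))])
        cam = (kf.pose_w2c @ homo.T).T[:, :3]   # world -> camera coordinates
        pix = (kf.intrinsics @ cam.T).T         # camera -> homogeneous pixel coordinates
        z = np.maximum(pix[:, 2], 1e-9)
        u = np.floor(pix[:, 0] / z).astype(int)
        v = np.floor(pix[:, 1] / z).astype(int)

        h, w = kf.mask.shape
        visible = (cam[:, 2] > 1e-6) & (u >= 0) & (u < w) & (v >= 0) & (v < h)

        # Carve away voxels this view sees as background; voxels outside the
        # view frustum are left untouched, since this view says nothing about them.
        background = np.zeros(len(voxels), dtype=bool)
        background[visible] = ~kf.mask[v[visible], u[visible]]
        occupied &= ~background

    return occupied.reshape(resolution, resolution, resolution)
```

A watertight mesh then follows from running marching cubes over the returned grid, and a heavier reconstruction backend can replace this function behind the same (pose, intrinsics, mask) interface without touching the capture loop.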

Where SAM 3 stops at “one image → one plausible 3D,” your thesis aims at “one walk around → one trustworthy, AR-ready model.” The physical world becomes editable, one loop at a time.