Deep 3D Geometric Reasoning for Robot Manipulation
To solve general manipulation tasks in real-world environments, robots must be able to perceive and condition their manipulation policies on the 3D world. These agents will need to understand various common-sense spatial/geometric concepts about manipulation tasks: for instance, that local geometry can suggest potential manipulation strategies, and that policies should adapt when object configurations change or camera perspectives shift. This thesis explores learning algorithms and visual representations that can imbue agents with geometric reasoning capabilities in a generalizable way while learning from only a small number of demonstrations or examples.
We first explore how agents can learn generalizable 3D affordance representations for articulated objects such as doors, drawers, etc. We propose a family of 3D visual representations that describe the motion constraints for every point on an articulated object. We demonstrate that when trained on a small dataset of simulated articulated objects, our family of 3D affordance representations generalizes zero-shot to novel instances of seen object categories, entirely unseen object categories, objects perceived with real-world sensors, and objects that have fundamental ambiguities or uncertainties.
Next, we explore how agents can learn task-critical geometric relationships for object rearrangement tasks from a small number of demonstrations. We design a family of dense 3D representations that can learn correspondence relationships across rigid and non-rigid objects, precisely extract desired rigid-body transformations using novel reasoning layers, and exhibit desirable invariance/equivariance properties under scene transformations. We also explore how these representations can be leveraged to solve sequential rearrangement tasks by integrating behavior cloning and planning.
We conclude with a discussion of the role of geometric reasoning in the broader robot learning landscape. We introduce some ongoing work that aims to help agents learn new geometric skills by watching human demonstrations, and propose possible future directions for geometric reasoning in robot learning.
Funding
Graduate Research Fellowship Program (GRFP)
Directorate for Education & Human Resources
History
Date
2025-05-11
Degree Type
- Dissertation
Department
- Robotics Institute
Degree Name
- Doctor of Philosophy (PhD)