Carnegie Mellon University

Generalizable Dexterity with Reinforcement Learning

posted on 2023-11-16, 19:20 authored by Wenxuan Zhou

Dexterity, the ability to perform complex interactions with the physical world, is at the core of robotics. However, existing research in robot manipulation has focused on tasks with limited dexterity, such as pick-and-place. The motor skills of these robots are often quasi-static, follow a predefined or limited sequence of contact events, and involve restricted object motions. In contrast, humans interact with their surroundings using dynamic and contact-rich manipulation skills, which lets us perform a wider variety of tasks in a broader range of settings.

This thesis explores using Reinforcement Learning (RL) to equip robots with generalizable dexterity. RL solves sequential decision-making problems modeled as Markov Decision Processes (MDPs). It has shown remarkable success in many domains, such as games, making it a promising technique for developing advanced manipulation skills. Our research advocates for the following thesis statement: reconsidering how we frame the robotics problem as an MDP is both effective and essential for achieving generalizable dexterity through RL. We examine three challenges that arise when applying RL to manipulation and discuss how we overcome each by reconsidering the MDP formulation.

First, robot data is time-consuming and expensive to collect. To reuse robot data effectively, we propose an offline RL algorithm that constructs a latent action space for the MDP. In addition, we discuss a framework that effectively reuses robot data across environments with non-stationary dynamics.

Second, robot dexterity is often assumed to be limited by the hardware design of the robot. We propose to enhance the robot’s dexterity beyond its hardware limitations by exploiting the external environment, which yields dynamic and contact-rich emergent behaviors. We demonstrate that rethinking how we define the environment of the MDP is effective in improving robot dexterity with RL.

Third, learning dexterous skills that can generalize is challenging. We propose an RL framework with an action representation that is spatially grounded and temporally abstracted, which allows the robot to learn complex interactions that generalize to unseen objects. This further supports our claim that rethinking the action space of the MDP can lead to generalizable dexterity.




Degree Type

  • Dissertation


Department

  • Robotics Institute

Degree Name

  • Doctor of Philosophy (PhD)


Advisor(s)

  • David Held
