Continual Robot Learning: Benchmarks and Modular Methods

Powers, Samantha

doi:10.1184/R1/24438466.v1

thesis

Posted on 2023-11-16, 18:47, authored by Samantha Powers

Humans adapt continuously to the world around us, allowing us to acquire new skills and explore diverse environments seamlessly. Current AI methods, however, cannot attain this versatility. Instead, they are typically trained on vast datasets and learn all tasks simultaneously. The resulting models have limited ability to adapt to changing contexts and are constrained by the available data. This challenge is particularly pronounced in robotics, where real-world interaction data is scarce.

We envision a robot that learns continuously from both the environment and human interactions, quickly acquires new information without overwriting past knowledge, and adapts to a user’s specific needs.

In this thesis, we apply continual learning to robotics, with the goal of enabling crucial capabilities, including: the ability to apply prior information to new settings, maintain old information, sustain capacity for new skills, and understand context. We explore these across two learning modes: continual reinforcement learning (CRL), where the agent learns from experience, and continual imitation learning (CIL), where it learns from demonstrations.

However, substantial barriers hinder progress, including limited open-source resources, resource-intensive benchmarks, and impractical metrics for robotics. To address these challenges, we present CORA (COntinual Reinforcement Learning Agents), an open-source toolkit with benchmarks, baselines, and metrics to enhance CRL accessibility. CORA extends beyond catastrophic forgetting, evaluating models for forward transfer and generalization.

With this foundation, we introduce SANE (Self-Activating Neural Ensembles) to create a dynamic library of adaptable skills. SANE’s ensemble of independent modules learns and applies skills as needed, reducing forgetting. We demonstrate this method on several Procgen reinforcement learning task sets.

We then adapt SANE to a physical robot, the Stretch, with SANER (SANE for Robotics) using CIL. Leveraging our novel Attention-Based Interaction Policies (ABIP), SANER excels in few-shot learning, showcasing its effectiveness at generalization across various tasks.

SANERv2 further advances this capability, integrating natural language and achieving strong performance over a diverse set of 15 manipulation tasks in a simulated environment, RLBench. Remarkably, SANERv2 was also able to display the potential of independent modules, demonstrating that a node could be moved between agents without loss of performance, promising possible future composable ensembles.

History

Date

2023-09-29

Degree Type

Dissertation

Department

Robotics Institute

Degree Name

Doctor of Philosophy (PhD)

Advisor(s)

Abhinav Gupta

Keywords

robot learning, continual learning, reinforcement learning, imitation learning, Adaptive Agents and Intelligent Robotics

Licence

CC BY-NC-SA 4.0
