Carnegie Mellon University

Offline Learning for Stochastic Multi-Agent Planning in Autonomous Driving

thesis
posted on 2024-05-10, 19:44, authored by Adam Villaflor

Fully autonomous vehicles have the potential to greatly reduce vehicular accidents and revolutionize how people travel and how we transport goods. Many of the major challenges for autonomous driving systems emerge from the numerous traffic situations that require complex interactions with other agents. For the foreseeable future, autonomous vehicles will have to share the road with human drivers and pedestrians, and thus cannot rely on centralized communication to address these interactive scenarios. Therefore, autonomous driving systems need to be able to negotiate with and respond to unknown agents that exhibit uncertain behavior. To tackle these problems, most commercial autonomous driving stacks use a modular approach that splits perception, agent forecasting, and planning into separately engineered modules. However, fully separating prediction and planning makes it difficult to reason about how other vehicles will respond to the planned trajectory of the controlled ego-vehicle. To maintain safety, many modular approaches therefore have to be overly conservative when interacting with other agents. Ideally, we want autonomous vehicles to drive in a natural and confident manner while still maintaining safety.

Thus, in this thesis, we explore how deep learning and offline reinforcement learning can be used to perform joint prediction and planning in highly interactive and stochastic multi-agent scenarios in autonomous driving. First, we discuss our work on using deep learning for joint prediction and closed-loop planning in an offline reinforcement learning (RL) framework (Chapter 2). Second, we discuss our work that directly tackles the difficulties of using learned models for planning in stochastic multimodal settings (Chapter 3). Third, we discuss how we can scale to more complicated multi-agent driving scenarios, such as merging in dense traffic, by using a Transformer-based traffic forecasting model as our world model (Chapter 4). Finally, we discuss how we can draw from offline model-based RL to learn a high-level policy that selects from a discrete set of pre-trained driving skills to perform effective control without additional online planning (Chapter 5).

History

Date

2024-04-01

Degree Type

  • Dissertation

Department

  • Robotics Institute

Degree Name

  • Doctor of Philosophy (PhD)

Advisor(s)

  • Jeff Schneider
  • John Dolan
