Carnegie Mellon University

Jointly Forecasting and Controlling Behavior by Learning from High-Dimensional Data

Thesis posted on 2019-10-18, 19:07, authored by Nicholas Rhinehart
Achieving a precise predictive understanding of the future is difficult, yet widely studied in the natural sciences. Significant research activity has been dedicated to building testable models of cause and effect. From a certain view, the ability to forecast the universe is the “holy grail”: the ultimate goal of science. If we had it, we could anticipate, and therefore (at least implicitly) understand, all observable phenomena. The human capability to forecast offers complementary motivation. Critical to our intelligence is our ability to plan behaviors by considering how our actions are likely to result in future payoff, especially in the presence of other collaborative and competitive agents. In this work, we seek to computationally model the future in the presence of agent behavior, given rich observations of the environment. The bulk of our focus is on reasoning about what agents could do, rather than on other sources of stochasticity. This focus on future agent behavior allows us to tightly couple and jointly perform forecasting and control.
The field of Computer Vision (CV) is focused on designing algorithms to automatically understand images, videos, and other perceptual data. However, the field’s efforts to date have focused on non-interactive, present-focused tasks [79, 81, 158, 184]. Most CV contributions are algorithms that answer questions like “what is that?” and “what happened?”, rather than “what could happen?” or “how could I achieve X?”. Computer Vision has under-explored reasoning about the interactive and decision-based nature of the world. In contrast, Reinforcement Learning (RL) prioritizes modeling interactions and decisions by focusing on how to design algorithms that elicit behavior maximizing a scalar reward signal. To perform well, the resulting learning agents must understand how their current behaviors will affect their prospects of future reward. However, in the dominant paradigm of model-free RL [218], agents reason only implicitly about the future. Model-based RL, in contrast, learns one-step dynamics to estimate “what could happen in the near future”, yet it primarily focuses on control rather than on explicitly forecasting the behavior of a single agent (let alone multiple agents).
In this thesis, we consider the problem of designing algorithms that enable computational systems to (1) forecast the future behavior of intelligent agents given rich observations of their environments, and (2) use this reasoning for control. We believe these two problems should be tightly integrated and jointly considered, and we use them to structure this thesis. We define forecasting as the problem of estimating the set of possible outcomes of a system, whereas control is the problem of producing actions that generate a single outcome of a system. We often use Imitation Learning and Reinforcement Learning to formulate and situate our work.

We contribute forecasting and control approaches that excel in diverse, realistic, single-agent, and multi-agent domains. The first part of the thesis focuses on progressively designing more capable forecasting models. We proceed through approaches that (1) forecast single actions of daily behavior by developing matrix factorization models [169], (2) forecast goal-driven action trajectories of daily behavior by developing Online Inverse Reinforcement Learning models [168, 170], and (3) forecast motion trajectories of vehicles by developing deep reversible generative models [171, 174]. The second part of the thesis focuses on progressively designing more capable models that tightly couple forecasting and control. We discuss (4) forecasting as auxiliary supervision for implicitly-planned control [228], (5) forecasting and explicitly planning with the same model [176], and (6) forecasting and planning future interactions of multiple agents [175].
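
As a rough illustration of contribution (3), a "deep reversible generative model" of trajectories can be sketched as an autoregressive flow in which each future position is an invertible affine function of Gaussian noise, conditioned on the past. The sketch below is a minimal, hypothetical PyTorch implementation under that assumption; the class, network architecture, and dimensions are illustrative choices, not the models developed in the thesis.

```python
# Minimal sketch (an assumption, not the thesis implementation) of a reversible
# autoregressive trajectory model: x_t = mu(x_{<t}) + sigma(x_{<t}) * z_t with
# z_t ~ N(0, I), so sampling (forward pass) and exact log-likelihood (inverse
# pass) share one network.
import math
import torch
import torch.nn as nn


class ReversibleTrajectoryModel(nn.Module):
    def __init__(self, state_dim: int = 2, hidden_dim: int = 64):
        super().__init__()
        self.rnn = nn.GRUCell(state_dim, hidden_dim)      # summarizes past positions
        self.head = nn.Linear(hidden_dim, 2 * state_dim)  # per-step mu and log_sigma
        self.hidden_dim = hidden_dim

    def _step(self, x_prev, h):
        h = self.rnn(x_prev, h)
        mu, log_sigma = self.head(h).chunk(2, dim=-1)
        return mu, log_sigma, h

    def sample(self, x0, horizon: int):
        """Push standard-normal noise forward through the flow to get a trajectory."""
        h = x0.new_zeros(x0.shape[0], self.hidden_dim)
        x_prev, steps = x0, []
        for _ in range(horizon):
            mu, log_sigma, h = self._step(x_prev, h)
            z = torch.randn_like(mu)
            x_prev = mu + log_sigma.exp() * z              # invertible affine step
            steps.append(x_prev)
        return torch.stack(steps, dim=1)                   # (batch, horizon, state_dim)

    def log_prob(self, x0, traj):
        """Invert the flow to recover noise and accumulate the exact log-density."""
        h = x0.new_zeros(x0.shape[0], self.hidden_dim)
        x_prev, logp = x0, 0.0
        for t in range(traj.shape[1]):
            mu, log_sigma, h = self._step(x_prev, h)
            z = (traj[:, t] - mu) / log_sigma.exp()        # inverse of the affine step
            # N(0, I) log-density of z plus log|det dz/dx| = -sum(log_sigma)
            logp = logp + (-0.5 * z.pow(2)
                           - 0.5 * math.log(2 * math.pi)
                           - log_sigma).sum(dim=-1)
            x_prev = traj[:, t]
        return logp                                        # (batch,)
```

Training such a sketch amounts to maximizing log_prob on observed trajectories; because each step is invertible, the same network supports both exact likelihood evaluation and sampling of diverse forecasts.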

Date: 2019-09-30
Degree Type: Dissertation
Department: Robotics Institute
Degree Name: Doctor of Philosophy (PhD)
Advisor(s): Kris Kitani
