Carnegie Mellon University

Efficient Bayesian Experimental Design with Deep Learning

thesis
posted on 2025-11-10, 22:14 authored by Conor Igoe
<p dir="ltr">Bayesian Experimental Design (BED) has emerged as an elegant formalism for understanding the value of different experimental designs when the cost of experimentation is non-negligible and efficient design is paramount. Notably, in recent years there has been a growing interest in adopting Deep Learning and Deep Reinforcement Learning(D(R)L)techniques to obtain effective experimental designs for BED tasks. Principal among the motivations for the involvement of these techniques is their potential to make highly informative experimental designs accessible without the need for prohibitive test-time computation. </p><p dir="ltr">Although the recent focus on D(R)L for BED has shown great initial promise, a core observation in this thesis is that training performant BED policies remains excessively challenging. In particular, we find that BED agents require a prohibitive number of samples to learn effective policies even for modest problem sizes. We attribute this learning difficulty to an explosion in possible posterior beliefs as inference progresses through an adaptive experiment, which makes generalization hard. Designing agents that can cope with this belief explosion efficiently has received little attention in the BED literature and is the primary focus of this thesis. </p><p dir="ltr">We demonstrate that standard state representations and architecture choices in the BED literature— such as fully connected networks, convolutional architectures, and Transformers—are poorly suited to efficient learning in the presence of belief explosion. To address this, we propose the use of equivariant networks that exploit the symmetries and structure inherent in BED tasks. We develop specialized equivariant architectures for both discrete belief state and continuous information set representations, and show that in both regimes, these networks significantly outperform standard baselines. Notably, these equivariant networks also exhibit robust generalization to new, larger BED domains at test time— capabilities that conventional architectures like Transformers fail to replicate. </p><p dir="ltr">Our investigation also reveals certain structural details of BED equivariances that may enable future work to improve sample efficiency even further. For example, in addition to global equivariances, we observearichsetofsubspaceequivariancesdeeperintoBEDtrajectories. Although our models operating on continuous information sets do not readily leverage these deeper subspace equivariances, we show that it is straightforward to leverage this structure with our discretized belief-space equivariant networks. </p><p dir="ltr">In addition to the BED setting, in the final section of this thesis we consider how equivariant networks can be trained to provide posterior predictive uncertainties using significantly improved sample efficiency compared to previous neural process models. In particular, we introduce the Graph Transformer Neural Process as a sample efficient model of stationary stochastic processes, and show how it is significantly more sample efficient compared to prior neural process models, as well as more robust to test time distribution shifts. 
</p><p dir="ltr">We conclude by outlining several remaining lines of future work—most notably, developing continuous information-set networks that preserve both global equivariances and the deeper subspace equivariances captured by our discrete model, overcoming policy optimization challenges in continuous BED q-function landscapes, and developing more robust strategies for amortizing BED policies across heterogeneous task families. By demonstrating the significance of equivariance in training sample-efficient BED policies, this work provides a principled foundation for scaling BED to more complex and structured task settings, where data efficiency and generalization remain critical constraints.</p>

History

Date

2025-08-01

Degree Type

  • Dissertation

Thesis Department

  • Machine Learning

Degree Name

  • Doctor of Philosophy (PhD)

Advisor(s)

Jeff Schneider
