Carnegie Mellon University

Guiding Machine Learning Design with Insights from Simple Sandboxes

thesis
Posted on 2025-06-24, 18:23, authored by Bingbin Liu
Machine learning research often follows two seemingly distinct approaches: the empirical approach, which excels at developing practical algorithms, and the theoretical approach, which offers formal guarantees and resource-efficient solutions. While the empirical approach often relies on heuristics and demands costly large-scale experiments, the theoretical approach often hinges on unrealistic assumptions, limiting its applicability to real-world scenarios.

This thesis aims to bridge these approaches by studying “sandbox” setups, which are conceptual abstractions of complex systems. A well-designed sandbox is both minimal, enabling clean theoretical analyses and rapid, accessible empirical investigations, and representative, ensuring that findings within the sandbox generalize to broader contexts.

This thesis details the use of the sandbox approach to understand the task design, the model class, and the learning process. Chapter 2 examines design choices in machine learning tasks, focusing on how self-supervised methods, namely contrastive learning and masked prediction, extract information from sequential data. Chapter 3 analyzes the capabilities and limitations of a specific model class, with an emphasis on Transformers for sequential reasoning. This chapter characterizes the feasible solutions, discusses generalization challenges, and proposes improvements with implications for interpretability. Finally, Chapter 4 examines factors that impact the learning process. It identifies and addresses an algorithmic challenge in contrastive learning, and explores how knowledge distillation can improve sample complexity.

Funding

TAS::97 0400::TAS XRL: EXPLAINABLE REINFORCEMENT LEARNING FOR AI AUTONOMY

United States Department of the Air Force

RI: Small: Non-parametric Machine Learning in the Age of Deep and High-Dimensional Models

Directorate for Computer & Information Science & Engineering

CAREER: Theoretical Foundations of Modern Machine Learning Paradigms: Generative and Out-of-Distribution

Directorate for Computer & Information Science & Engineering

History

Date

2025-06-01

Degree Type

  • Dissertation

Thesis Department

  • Machine Learning

Degree Name

  • Doctor of Philosophy (PhD)

Advisor(s)

  • Andrej Risteski
  • Pradeep Ravikumar
