Carnegie Mellon University

Guiding Machine Learning Design with Insights from Simple Sandboxes

thesis
posted on 2025-06-24, 18:23 authored by Bingbin Liu

Machine learning research often follows two seemingly distinct approaches: the empirical approach, which excels at developing practical algorithms, and the theoretical approach, which offers formal guarantees and resource-efficient solutions. While the empirical approach often relies on heuristics and demands costly large-scale experiments, the theoretical approach often hinges on unrealistic assumptions, limiting its applicability to real-world scenarios.

This thesis aims to bridge these approaches by studying “sandbox” setups, which are conceptual abstractions of complex systems. A well-designed sandbox is both minimal, enabling clean theoretical analyses and rapid, accessible empirical investigations, and representative, ensuring that findings within the sandbox are generalizable to broader contexts.

This thesis details the use of the sandbox approach to understand the task design, the model class, and the learning process. Chapter 2 examines design choices in machine learning tasks, focusing on how self-supervised methods—namely, contrastive learning and masked prediction—extract information from sequential data. Chapter 3 analyzes the capabilities and limitations of a specific model class, with an emphasis on Transformers for sequential reasoning. This chapter characterizes the feasible solutions, discusses generalization challenges, and proposes improvements with implications for interpretability. Finally, Chapter 4 examines factors that impact the learning process. It identifies and addresses an algorithmic challenge in contrastive learning, and explores how knowledge distillation can improve sample complexity.

Funding

TAS::97 0400::TAS XRL: EXPLAINABLE REINFORCEMENT LEARNING FOR AI AUTONOMY

United States Department of the Air Force

RI: Small: Non-parametric Machine Learning in the Age of Deep and High-Dimensional Models

Directorate for Computer & Information Science & Engineering

CAREER: Theoretical Foundations of Modern Machine Learning Paradigms: Generative and Out-of-Distribution

Directorate for Computer & Information Science & Engineering

History

Date

2025-06-01

Degree Type

  • Dissertation

Thesis Department

  • Machine Learning

Degree Name

  • Doctor of Philosophy (PhD)

Advisor(s)

  • Andrej Risteski
  • Pradeep Ravikumar
