Guiding Machine Learning Design with Insights from Simple Sandboxes
Machine learning research often follows two seemingly distinct approaches: the empirical approach, which excels at developing practical algorithms, and the theoretical approach, which offers formal guarantees and resource-efficient solutions. While the empirical approach often relies on heuristics and demands costly large-scale experiments, the theoretical approach often hinges on unrealistic assumptions, limiting its applicability to real-world scenarios.
This thesis aims to bridge these approaches by studying “sandbox” setups, which are conceptual abstractions of complex systems. A well-designed sandbox is both minimal, enabling clean theoretical analyses and rapid, accessible empirical investigations, and representative, ensuring that findings within the sandbox are generalizable to broader contexts.
This thesis details the use of the sandbox approach to understand the task design, the model class, and the learning process. Chapter 2 examines design choices in machine learning tasks, focusing on how self-supervised methods, namely contrastive learning and masked prediction, extract information from sequential data. Chapter 3 analyzes the capabilities and limitations of a specific model class, with an emphasis on Transformers for sequential reasoning. This chapter characterizes the feasible solutions, discusses generalization challenges, and proposes improvements with implications for interpretability. Finally, Chapter 4 examines factors that impact the learning process. It identifies and addresses an algorithmic challenge in contrastive learning, and explores how knowledge distillation can improve sample complexity.
Funding
TAS::97 0400::TAS XRL: EXPLAINABLE REINFORCEMENT LEARNING FOR AI AUTONOMY
United States Department of the Air Force
RI: Small: Non-parametric Machine Learning in the Age of Deep and High-Dimensional Models
Directorate for Computer & Information Science & Engineering
CAREER: Theoretical Foundations of Modern Machine Learning Paradigms: Generative and Out-of-Distribution
Directorate for Computer & Information Science & Engineering
History
Date
2025-06-01
Degree Type
- Dissertation
Thesis Department
- Machine Learning
Degree Name
- Doctor of Philosophy (PhD)