Carnegie Mellon University

Understanding, Formally Characterizing, and Robustly Handling Real-World Distribution Shift

thesis
posted on 2024-07-23, 16:42, authored by Elan Rosenfeld

Distribution shift remains a significant obstacle to successful and reliable deployment of machine learning (ML) systems. Long-term solutions to these vulnerabilities can only come with the understanding that benchmarks fundamentally cannot capture all possible variation which may occur; equally important, however, is careful experimentation with AI systems to understand their failures under shift in practice.

This thesis describes my work towards building a foundation for trustworthy and reliable machine learning. The surveyed work falls roughly into three major categories: (i) designing formal, practical characterizations of the structure of real-world distribution shift; (ii) leveraging this structure to develop provably correct and efficient learning algorithms which handle such shifts robustly; and (iii) experimenting with modern ML systems to understand the practical implications of real-world heavy tails and distribution shift, both average- and worst-case.

Part I describes work on scalably certifying the robustness of deep neural networks to adversarial attacks. The proposed approach can be used to certify robustness to attacks on test samples, training data, or, more generally, any input which influences the model's eventual prediction. In Part II, we focus on latent variable models of shifts, drawing on concepts from causality and other structured encodings of real-world variation. We demonstrate how these models enable formal analysis of methods that use multiple distributions for robust deep learning, particularly through the new lens of environment/intervention complexity: a core statistical measure for domain generalization and causal representation learning which quantifies error and/or structured identifiability conditions as a function of the number and diversity of available training distributions. Finally, in Part III we broadly explore ways to better understand and leverage the variation in natural data, and we show how the resulting insights can facilitate the design of new methods with more robust and reliable real-world behavior.
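For concreteness, the following is a minimal sketch of one scalable certification technique of this kind, randomized smoothing, included purely to illustrate what such a certificate computes; the base_classifier interface and all parameter values here are hypothetical assumptions for the sketch, not the thesis's actual procedure.

    import numpy as np
    from scipy.stats import beta, norm

    def certify(base_classifier, x, sigma=0.25, n=1000, alpha=0.001):
        # Monte Carlo certificate for a Gaussian-smoothed classifier.
        # base_classifier maps a 1-D input to a nonnegative integer label
        # (a hypothetical interface assumed for this sketch).
        noisy = x[None, :] + sigma * np.random.randn(n, x.size)
        preds = np.array([base_classifier(z) for z in noisy])
        top = int(np.bincount(preds).argmax())
        k = int((preds == top).sum())
        # Exact one-sided (Clopper-Pearson) lower confidence bound on
        # P[f(x + noise) = top], valid with probability at least 1 - alpha.
        p_lower = beta.ppf(alpha, k, n - k + 1)
        if p_lower <= 0.5:
            return None, 0.0  # abstain: no certificate at this confidence
        # The smoothed classifier provably predicts `top` for every x'
        # with ||x' - x||_2 < sigma * Phi^{-1}(p_lower).
        return top, float(sigma * norm.ppf(p_lower))

The appeal of this family of certificates is that they require only forward passes through the base model, so certification scales to large networks; the certified radius tightens as the sampling budget n grows.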

History

Date

2024-05-11

Degree Type

  • Dissertation

Department

  • Machine Learning

Degree Name

  • Doctor of Philosophy (PhD)

Advisor(s)

Andrej Risteski, Pradeep Ravikumar