CMU-CS-18-107.pdf (3.06 MB)

Influence-directed Explanations for Machine Learning

posted on 22.10.2021, 19:51 authored by Shayak Sen
Increasingly, decisions and actions affecting people’s lives are determined by automated systems processing personal data. Excitement about these systems has been accompanied by serious concerns about their opacity and the threats that they pose to privacy, fairness, and other values. Recognizing these concerns, it is important to make real-world automated decision-making systems accountable for privacy and fairness by enabling them to detect and explain violations of these values. System maintainers may leverage such accounts to repair systems to avoid future violations with minimal impact on utility goals. In this dissertation, we provide a basis for explaining how machine learning systems use information. These explanations increase trust in the functioning of a system by allowing us to verify that it not only makes the right decisions, but makes them for justifiable reasons. Further, explanations can be used to support the detection of privacy and fairness violations, and to explain how those violations came about. We can then leverage this understanding to repair systems to avoid future violations. We identify two major challenges to explaining information use in machine learning systems: (i) converged use: machine learning systems typically combine a large number of input features, and (ii) indirect use: these systems can typically infer and use information that is not directly provided to them. Our approach to explaining how complex machine learning models use information involves answering two questions: (influence) Which factors were influential in determining outcomes? and (interpretation) What do these factors mean? We first present key results measuring the causal influence of factors in machine learning models. We then examine the following settings: (i) systems with potential indirect use of information, and (ii) convolutional neural networks. For each setting we demonstrate how influence and interpretation combine to account for information use.
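
To make the idea of causal influence concrete, the following is a minimal sketch of one intervention-based estimate: a feature's value is replaced with draws from its marginal distribution, and we count how often the model's output changes. It is an illustration under stated assumptions, not the dissertation's actual measure or code. The names `interventional_influence` and `toy_model`, the toy data, and the choice of "fraction of changed outputs" as the quantity of interest are all hypothetical.

```python
import random


def interventional_influence(model, rows, feature, trials=200, seed=0):
    """Estimate how often resampling `feature` changes the model's output.

    Hypothetical helper: for each row, the feature is replaced by a value
    drawn from its empirical marginal, independently of the other features.
    """
    rng = random.Random(seed)
    marginal = [row[feature] for row in rows]  # empirical marginal of the feature
    changed = 0
    total = 0
    for row in rows:
        baseline = model(row)
        for _ in range(trials):
            intervened = dict(row)  # copy so the original row is untouched
            intervened[feature] = rng.choice(marginal)
            changed += model(intervened) != baseline
            total += 1
    return changed / total


def toy_model(row):
    # Hypothetical classifier that reads only `income`.
    return row["income"] > 50


# Toy data in which `zip` is perfectly correlated with the outcome,
# even though the model never reads it.
rows = [
    {"income": 30, "zip": 1},
    {"income": 80, "zip": 2},
    {"income": 45, "zip": 1},
    {"income": 90, "zip": 2},
]

income_influence = interventional_influence(toy_model, rows, "income")
zip_influence = interventional_influence(toy_model, rows, "zip")
```

In this toy setup, `zip_influence` is exactly zero because intervening on `zip` can never change the output, while `income_influence` is positive. A purely correlational measure would not separate the two features, which is why the sketch intervenes on inputs rather than observing associations.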




Degree Type



Computer Science

Degree Name

  • Doctor of Philosophy (PhD)


Anupam Datta
