Influence-directed Explanations for Machine Learning

Sen, Shayak

doi:10.1184/R1/16860118.v1

CMU-CS-18-107.pdf (3.06 MB)

thesis

posted on 2021-10-22, 19:51 authored by Shayak Sen

Increasingly, decisions and actions affecting people’s lives are determined by automated systems processing personal data. Excitement about these systems has been accompanied by serious concerns about their opacity and the threats that they pose to privacy, fairness, and other values. Given these concerns, it is important to make real-world automated decision-making systems accountable for privacy and fairness by enabling them to detect and explain violations of these values. System maintainers may leverage such accounts to repair systems to avoid future violations with minimal impact on utility goals.

In this dissertation, we provide a basis for explaining how machine learning systems use information. These explanations increase trust in the functioning of a system by allowing us to verify that it not only makes the right decisions but also makes them for justifiable reasons. Further, explanations can be used to support detection of privacy and fairness violations, as well as to explain how they came about. We can then leverage this understanding to repair systems to avoid future violations.

We identify two major challenges to explaining information use in machine learning systems: (i) converged use, meaning that machine learning systems typically combine a large number of input features, and (ii) indirect use, meaning that these systems can typically infer and use information that is not directly provided to them. Our approach to explaining how complex machine learning models use information involves answering two questions: (influence) Which factors were influential in determining outcomes? and (interpretation) What do these factors mean? We first present key results on measuring the causal influence of factors in machine learning models. We then examine two settings: (i) systems with potential indirect use of information, and (ii) convolutional neural networks. For each setting, we demonstrate how influence and interpretation combine to account for information use.

Date

2018-05-10

Degree Type

Dissertation

Department

Computer Science

Degree Name

Doctor of Philosophy (PhD)

Advisor(s)

Anupam Datta

Keywords

machine learning

Licence

In Copyright
