As various post hoc explanation methods are increasingly being leveraged to
explain complex models in high-stakes settings, it becomes critical to develop
a deeper understanding of if and when the explanations output by these methods
disagree with each other, and how such disagreements are resolved in practice.
However, there is little to no research that provides answers to these critical
questions. In this work, we introduce and study the disagreement problem in
explainable machine learning. More specifically, we formalize the notion of
disagreement between explanations, analyze how often such disagreements occur
in practice, and examine how practitioners resolve them. To this end,
we first conduct interviews with data scientists to understand what constitutes
disagreement between explanations generated by different methods for the same
model prediction, and introduce a novel quantitative framework to formalize
this understanding. We then leverage this framework to carry out a rigorous
empirical analysis with four real-world datasets, six state-of-the-art post hoc
explanation methods, and eight different predictive models, to measure the
extent of disagreement between the explanations generated by various popular
explanation methods. In addition, we carry out an online user study with data
scientists to understand how they resolve the aforementioned disagreements. Our
results indicate that state-of-the-art explanation methods often disagree in
terms of the explanations they output. Our findings also underscore the
importance of developing principled evaluation metrics that enable
practitioners to effectively compare explanations.
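
To make the idea of measuring disagreement concrete, the sketch below shows one simple way two explanations of the same prediction could be compared: the fraction of top-k features (ranked by absolute attribution) that they share. This is an illustrative assumption, not the framework defined in the paper; the function names and the toy attribution values are hypothetical.

```python
def top_k_features(attributions, k):
    """Return the indices of the k features with the largest absolute attribution."""
    order = sorted(
        range(len(attributions)),
        key=lambda i: abs(attributions[i]),
        reverse=True,
    )
    return set(order[:k])


def top_k_agreement(expl_a, expl_b, k):
    """Fraction of top-k features shared by two attribution vectors (0.0 to 1.0)."""
    if len(expl_a) != len(expl_b):
        raise ValueError("explanations must cover the same features")
    if not 1 <= k <= len(expl_a):
        raise ValueError("k must be between 1 and the number of features")
    shared = top_k_features(expl_a, k) & top_k_features(expl_b, k)
    return len(shared) / k


# Hypothetical attributions from two explanation methods for one prediction.
method_a = [0.50, -0.30, 0.10, 0.05]
method_b = [0.40, 0.02, -0.35, 0.10]

# Top-2 for method_a is {0, 1}; top-2 for method_b is {0, 2}; one feature is shared.
agreement = top_k_agreement(method_a, method_b, k=2)
```

A value of 1.0 means both methods select the same k features, and lower values indicate more disagreement. This particular measure ignores the sign and the relative order of the attributions, so it captures only one of several ways in which explanations can differ.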
Details
Title
The Disagreement Problem in Explainable Machine Learning: A Practitioner's Perspective
Creators
Satyapriya Krishna
Tessa Han
Alex Gu
Javin Pombra
Shahin Jabbari
Steven Wu
Himabindu Lakkaraju
Publication Details
arXiv (Cornell University)
Resource Type
Preprint
Language
English
Academic Unit
Computer Science (Computing)
Other Identifier
991021868720104721