Dissertation
A qualitative investigation of explanation goodness in a biomedical AI system
Doctor of Philosophy (Ph.D.), Drexel University
Jun 2024
DOI: https://doi.org/10.17918/00010668
Abstract
Explaining the decisions of artificial intelligence (AI) systems is both a core challenge to implementing AI systems and the goal of explainable AI (XAI) as a sphere of research. Explanations in XAI are machine-generated and contextual, and must be evaluated accordingly; such context has been investigated by the case-based reasoning (CBR) community for decades. User studies are the gold standard for evaluating explanations in XAI. The need to systematically investigate explanation quality in XAI is an open problem. The dearth of and discrepancies observed in user studies, nascency of instruments, and lack of approaches for effective evaluation have left us with limited bases for measuring explanation quality in XAI. 'Goodness' is one concept used to investigate explanation quality in XAI, but there is no consensus on what a 'good' explanation in XAI entails. We completed a literature search and analysis of 'explanation quality' within the CBR literature to investigate how researchers from the CBR community measure the quality of explanations. We performed a scoping review of user studies in the XAI literature where qualitative data were collected to investigate the role of qualitative data in advancing XAI research. We conducted a user study to investigate how biomedical researchers evaluate the goodness of explanatory text statements created for a biomedical AI system. Using cognitive interviewing and the Explanation Goodness Checklist developed as part of the U.S. Defense Advanced Research Projects Agency's (DARPA's) XAI program, we presented biomedical researchers with four examples of explanatory text statements and asked the researchers to 1) think aloud when reviewing each example, and 2) evaluate the goodness of the explanatory text statements using the Explanation Goodness Checklist, verbally elucidating the selections made.
After each biomedical researcher had completed the Explanation Goodness Checklist and elucidated the selections made, we conducted a semi-structured interview with that researcher to elicit their reactions to the explanation goodness criteria and the 'relevance' dimension of explanation goodness. The outcomes of our research include i) identifying 'goodness,' 'relevance,' and 'usefulness' as the concepts used most often by researchers from the CBR community when measuring explanation quality; ii) providing the XAI community with a set of recommendations and resources that can be used for rigorous user-centered investigation; and iii) introducing an approach for evaluating the goodness of explanatory text statements created for biomedical AI systems.
Details
- Title
- A qualitative investigation of explanation goodness in a biomedical AI system
- Creators
- Adam J. Johs
- Contributors
- Rosina O. Weber (Advisor); Denise E. Agosto (Advisor)
- Awarding Institution
- Drexel University
- Degree Awarded
- Doctor of Philosophy (Ph.D.)
- Publisher
- Drexel University; Philadelphia, Pennsylvania
- Number of pages
- xv, 216 pages
- Resource Type
- Dissertation
- Language
- English
- Academic Unit
- Information Science (Informatics) (2013-2026); College of Computing and Informatics (2013-2026); Drexel University
- Other Identifier
- 991021890115404721