The Center for Education and Research in Information Assurance and Security (CERIAS)

Generating Interpretable Diagnostic Explanations for Black-box AI Systems

Principal Investigator: Romila Pradhan

Artificial intelligence (AI) systems are increasingly deployed for decision-making in critical domains such as healthcare, criminal justice, and finance. There is, however, growing concern that the opacity of these systems can perpetuate systemic biases and discrimination reflected in training data. Following increasing regulatory pressure from governmental agencies to provide human-understandable explanations for the behavior of these systems, the field of eXplainable Artificial Intelligence (XAI) has recently witnessed a resurgence of interest. XAI tools are guided by social and ethical goals to: (a) increase societal acceptance of AI-based decision-making algorithms by establishing trust in their outcomes, (b) provide users with actionable insights for changing the results of algorithms in the future, and (c) enable the identification and debugging of sources of bias, such as data collection strategies and training data, that result in adverse and unexpected behavior.

Existing approaches in XAI primarily focus on generating feature-based explanations that quantify the extent to which input feature values are responsible for a model's predictions. These explanations suffer from severe limitations, including the inability to capture causal relationships between variables and the tendency to generate interventions that are not actionable in the real world. Furthermore, feature-based explanations are insufficient for diagnostic analyses that let practitioners trace unexpected or discriminatory algorithmic behavior back to training data: such decisions could result from data errors and biases introduced during different stages of the analysis pipeline, such as data collection and preparation.
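
As a purely illustrative example of the feature-based explanations described above (not the framework proposed in this project), the following sketch scores each input feature of a trained classifier using scikit-learn's permutation importance; the synthetic dataset, model choice, and feature names are assumptions made only for this sketch.

    # Illustrative sketch: feature-based attribution via permutation importance.
    # The dataset, model, and feature names are assumptions for illustration only.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.inspection import permutation_importance
    from sklearn.model_selection import train_test_split

    # Synthetic stand-in for a tabular decision-making dataset (e.g., loan approval).
    X, y = make_classification(n_samples=1000, n_features=5, n_informative=3,
                               random_state=0)
    feature_names = ["income", "debt", "age", "employment_years", "num_accounts"]

    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

    # Permutation importance: the drop in accuracy when a feature's values are
    # shuffled, i.e., how much the model's predictions depend on that feature.
    result = permutation_importance(model, X_test, y_test, n_repeats=10,
                                    random_state=0)
    for name, score in sorted(zip(feature_names, result.importances_mean),
                              key=lambda pair: -pair[1]):
        print(f"{name:>18}: {score:.3f}")

Scores like these quantify how strongly predictions depend on each feature, but, as noted above, they do not by themselves capture causal relationships or yield interventions that are actionable in the real world.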

This project aims to address the limitations of existing approaches in XAI by developing novel techniques, grounded in concepts from data management and machine learning, to generate explanations for AI-based decision-making algorithms. We propose a two-pronged approach that reconciles the aforementioned objectives of XAI: first, we develop a framework based on probabilistic contrastive counterfactuals that provides insight into what causes the decisions of a black-box AI system, and second, we generate interpretable diagnostic explanations for unexpected or discriminatory decisions made by that system.
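
To make the notion of a contrastive counterfactual concrete, the sketch below performs a naive brute-force search for the smallest single-feature change that flips a model's decision for one individual. This is only an illustration of the underlying idea; the trained model, candidate value grid, and instance are hypothetical, and the sketch does not implement the probabilistic contrastive counterfactual framework developed in this project.

    # Illustrative sketch: a naive search for a contrastive counterfactual,
    # i.e., a minimal change to one individual's features that flips the decision.
    # The model, data, and search grid are assumptions made for this example.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(n_samples=500, n_features=5, n_informative=3,
                               random_state=1)
    model = LogisticRegression().fit(X, y)

    instance = X[0]                          # individual with an undesirable decision
    original = model.predict([instance])[0]

    best = None
    # Perturb one feature at a time over a small grid of candidate values and keep
    # the cheapest change that flips the model's prediction.
    for j in range(X.shape[1]):
        for value in np.linspace(X[:, j].min(), X[:, j].max(), num=25):
            candidate = instance.copy()
            candidate[j] = value
            if model.predict([candidate])[0] != original:
                cost = abs(value - instance[j])
                if best is None or cost < best[2]:
                    best = (j, value, cost)

    if best is not None:
        j, value, cost = best
        print(f"Set feature {j} from {instance[j]:.2f} to {value:.2f} "
              f"(change of {cost:.2f}) to flip the decision.")
    else:
        print("No single-feature change in the search grid flips the decision.")

Even this toy search exposes the limitations discussed earlier: a change that flips the model's output need not respect causal relationships between features and may not be actionable in the real world, which motivates the probabilistic contrastive counterfactual framework proposed in this project.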

Personnel

Other PIs: Babak Salimi (University of California San Diego), Boris Glavic (Illinois Institute of Technology, Chicago)

Representative Publications

Keywords: Causal explanations, Counterfactual explanations, Explainable AI, Recourse