Attention-based and Causal Explanations in Computer Vision
UC San Diego Electronic Theses and Dissertations

Abstract

As AI systems take over critical tasks across different fields, despite their potentially unknown deficiencies and biases, there is a growing demand for transparency alongside accuracy. Explainable AI (XAI) approaches address this demand by mitigating the lack of transparency and trust in AI and making these systems more interpretable to lay users. This dissertation investigates the role of explanations for deep learning models in computer vision. It explores new methods for producing more effective explanations for such models, along with techniques for evaluating the efficacy of those explanations. The evaluation methods rely on extensive user studies as well as automated approaches. Throughout the study, we implement such XAI systems for complex tasks with potential for bias, such as Visual Question Answering (VQA) and face image classification.

We present explainable VQA systems that generate interpretable explanations using spatial and object features driven by attentional processes such as transformers. Our user studies show that exposure to multimodal explanations improves lay users' mental models, particularly when the AI is erroneous. In these studies, we demonstrate the role of object features in enhancing the explainability and interpretability of such models. Furthermore, we examine automated techniques that provide controlled counterfactual explanations more successfully than merely displaying random examples. To provide counterfactual examples, we compare an automated generative method with a retrieval-based approach. Results indicate an overall improvement in users' accuracy in predicting answer changes when counterfactual explanations are shown. While realistic retrieved counterfactuals are the most effective at improving the mental model, this study shows that a generative approach can be equally effective.

For the task of face image classification, modern models, despite their high accuracy, are prone to biases that can cause ethical issues in different applications. We introduce a novel method to search for causal yet interpretable counterfactual explanations using pretrained generative models. The proposed explanations show how different attributes influence the classifier output, with contrastive counterfactual images as local explanations and causal sufficiency/necessity scores as global explanations.
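
As an illustrative sketch only (not necessarily the dissertation's exact formulation), sufficiency and necessity scores of this kind are commonly defined in the spirit of Pearl's probabilities of sufficiency and necessity. For a binary attribute A and a binary classifier output Y, with Y_{A=a} denoting the counterfactual output when A is set to a:

% Assumed, Pearl-style definitions; the dissertation's scores may differ.
\[
\text{Sufficiency: } PS = P\big(Y_{A=1} = 1 \mid A = 0,\; Y = 0\big)
\]
\[
\text{Necessity: } PN = P\big(Y_{A=0} = 0 \mid A = 1,\; Y = 1\big)
\]

Intuitively, PS asks how often flipping the attribute on would flip the classifier output on, and PN asks how often removing the attribute would remove the output, aggregated over the data as a global explanation.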
