Advancing Explainability in Black-Box Models
Abstract
In recent years, the need for explainable artificial intelligence (XAI) has grown as complex black-box models are increasingly deployed in critical applications. While many methods have been developed to interpret these models, there is also potential in enhancing the models themselves to improve their inherent explainability. This paper investigates techniques aimed at improving the explainability of black-box models. Through a systematic literature review, these techniques are categorized, and their impact on predictive uncertainty, adversarial robustness, and generative capacity is analyzed to understand how these factors contribute to overall explainability. The review follows the snowballing methodology, starting from an initial set of papers retrieved from four databases: IEEE Xplore, Scopus, arXiv, and the ACM Digital Library. The process continues with backward and forward snowballing over four iterations, resulting in a total of 50 reviewed papers. Only papers focused on improving model explainability are included in the review. Due to time limitations, additional search constraints are applied for feasibility: the initial set of papers is filtered to those published since 2013. These constraints and their possible impact are considered when interpreting the results. The findings reveal that Bayesian approaches and variational inference, adversarial robustness, model compression and distillation, uncertainty estimation and ensembles, regularization, self-explaining models, and hybrid techniques are used to advance model explainability. The paper concludes with a discussion of the implications of these techniques for future research.