Maxplain – Value-based Evaluation of Explainable AI Techniques
Abstract
A 2022 Harvard Business Review report critically examines the readiness of AI for real-world decision-making. It cites several incidents, such as an experimental healthcare chatbot suggesting that a mock patient commit suicide in response to their distress, and a self-driving car trial being called off after it resulted in the death of a pedestrian.
These incidents, and the media frenzies and public outcries they provoked, underscore a pressing question: how do these AI systems reach their conclusions? They have created an urgent demand for transparency and clarity in AI decision-making, and this urge to understand has translated into a significant uptick in work on Explainable AI (XAI). Consistent evaluation standards are therefore crucial for the field's streamlined growth.
However, XAI is a multidisciplinary field and faces the challenge of a lack of consensus on what constitutes a "good" explanation. Stakeholders with diverse backgrounds and needs can have diverging expectations of XAI: some might prioritize simple, concise explanations, while others want detailed information about AI predictions, depending on their end goal.
This thesis addresses the standardization of an evaluation framework for XAI methods that accounts for stakeholders' needs in different usage contexts. It presents a prototype that can be customized and extended to suit various XAI methods and tasks. Findings affirm the framework's ability to yield insightful comparisons between XAI methods and highlight issues with human perception of specific XAI features in those methods. This work contributes to the integration of XAI techniques into real-world applications by enabling more reliable and consistent performance assessment.