A counterfactual-based evaluation framework for machine learning models that use gene expression data


Abstract

The evaluation metrics commonly used for machine learning models often fail to adequately reveal the inner workings of those models, which is particularly necessary in critical fields such as healthcare. Explainable AI techniques, such as counterfactual explanations, offer a way to uncover a model’s internal process. In the literature, however, these explanations are mostly used for recourse actions rather than for testing a model’s internal mechanism. In this paper, we propose a proof of concept for a framework that uses counterfactual explanations to evaluate the inner workings of biological machine learning models that use gene expression data. Our approach compares the change in gene expression observed in the original data to the change in gene expression observed between the factual and counterfactual data; both changes are quantified using the log fold change. Additionally, we expand the definition of faithfulness and introduce a new metric that measures how faithfully the generated counterfactual explanations represent the model. This metric should ensure that the explanations accurately reflect the model’s true internal process.
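The comparison described above can be sketched in a few lines. The snippet below is an illustrative reading of the abstract, not the thesis implementation: all expression values, the pseudocount, and the choice of Pearson correlation as an agreement score are assumptions made for the example.

```python
import numpy as np

def log_fold_change(expr_a, expr_b, pseudocount=1.0):
    """Per-gene log2 fold change between two expression vectors.

    A pseudocount avoids division by zero; its value (1.0) is an
    assumption for this sketch, not taken from the thesis.
    """
    expr_a = np.asarray(expr_a, dtype=float)
    expr_b = np.asarray(expr_b, dtype=float)
    return np.log2((expr_b + pseudocount) / (expr_a + pseudocount))

# Hypothetical mean expression values for three genes in the original data.
original_class_a = [10.0, 200.0, 5.0]   # mean expression, class A
original_class_b = [80.0, 100.0, 5.0]   # mean expression, class B

# A factual sample (predicted class A) and its generated counterfactual
# (predicted class B); values are invented for illustration.
factual        = [12.0, 190.0, 6.0]
counterfactual = [85.0, 95.0, 6.0]

# Change observed in the original data vs. change induced by the
# counterfactual, per gene.
lfc_data = log_fold_change(original_class_a, original_class_b)
lfc_cf   = log_fold_change(factual, counterfactual)

# One possible agreement score between the two change profiles
# (Pearson correlation; the thesis may use a different measure).
agreement = np.corrcoef(lfc_data, lfc_cf)[0, 1]
```

A model whose counterfactuals change genes in the same direction and relative magnitude as the real between-class differences would yield a high agreement score; a model relying on spurious features would not.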

Files

Thesis_Marit_Radder.pdf
(pdf | 4.14 Mb)
Unknown license