A counterfactual-based evaluation framework for machine learning models that use gene expression data
Abstract
The evaluation metrics commonly used for machine learning models often fail to reveal the models' inner workings, insight that is particularly necessary in critical fields such as healthcare. Explainable AI techniques, such as counterfactual explanations, offer a way to uncover a model's internal process. In the literature, however, these explanations are typically used to suggest recourse actions rather than to test a model's internal mechanism. In this paper, we propose a proof of concept for a framework that uses counterfactual explanations to evaluate the inner workings of biological machine learning models trained on gene expression data. Our approach compares the change in gene expression observed in the original data to the change in gene expression observed between the factual and counterfactual data, quantifying both changes with the log fold change. Additionally, we broaden the definition of faithfulness and introduce a new metric that measures how faithfully the generated counterfactual explanations represent the model. This metric should ensure that the explanations accurately reflect the model's true internal process.
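The comparison described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a log2 fold change over per-gene group means, an illustrative pseudocount to avoid division by zero, and toy expression values, none of which are specified in the abstract.

```python
import numpy as np

def log_fold_change(group_a, group_b, pseudocount=1.0):
    """Per-gene log2 fold change between the mean expression of two groups.

    The pseudocount and the use of log base 2 are assumptions made for
    this sketch; the paper may define the log fold change differently.
    """
    mean_a = np.asarray(group_a).mean(axis=0) + pseudocount
    mean_b = np.asarray(group_b).mean(axis=0) + pseudocount
    return np.log2(mean_b / mean_a)

# Toy data: rows are samples, columns are genes (values are illustrative).
class_a = np.array([[10.0, 50.0], [12.0, 48.0]])    # original data, class A
class_b = np.array([[40.0, 52.0], [38.0, 50.0]])    # original data, class B
factual = np.array([[11.0, 49.0]])                  # sample being explained
counterfactual = np.array([[39.0, 50.0]])           # generated counterfactual

# The framework compares these two per-gene change vectors: counterfactuals
# that mirror the biology should show agreement between them.
lfc_original = log_fold_change(class_a, class_b)
lfc_counterfactual = log_fold_change(factual, counterfactual)
```

A per-gene discrepancy between `lfc_original` and `lfc_counterfactual` would then indicate genes where the model's learned mechanism diverges from the change observed in the data.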