Influence of molecular structures on graph neural network explainers' performance

Stols, T.N.

Bachelor thesis (2024)

Authors

T.N. Stols Electrical Engineering, Mathematics and Computer Science

Contributors

M. Khosla (mentor)

J.M. Weber Pattern Recognition and Bioinformatics (mentor)

Thomas Abeel Pattern Recognition and Bioinformatics (coach)

Faculty

Electrical Engineering, Mathematics and Computer Science

Chemistry, Graph neural network, Explainability

To reference this document use:

http://resolver.tudelft.nl/uuid:bb68c3c6-11ab-435d-8f4b-763cb162590e

Published Date

23-06-2024

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This study evaluates how the explainer for a Graph Neural Network creates explanations for chemical property prediction tasks. Explanations are masks over input molecules that indicate the importance of atoms and bonds toward the model output. Although these explainers have been evaluated for accuracy, no information exists on how faithful they are to the model (faithfulness) or how closely they correspond to human rationale (plausibility). Using explainability metrics to measure both, the performance of the explainer is evaluated on subsets of the data defined by the presence of benzene rings, the presence of halogens, and molecular weight. The study reveals that benzene rings influence the plausibility performance of the explainer: performance is better at higher thresholds but worse at lower thresholds. Molecular weight and the presence of halogens have no impact on plausibility. The ratio of positive samples in a set is shown to influence the metrics used for faithfulness. To accurately evaluate the faithfulness of different subsets, the subsets should be adjusted to have equal positive rates, or different metrics should be used. This research can serve as a starting point for investigating the influence of dataset properties on explainer performance. Such insight is useful for creating better explainers, leading to better acceptance of these models.

Files

BEP_16_.pdf

(pdf | 2.08 Mb)

Unknown license