Joint Embedding Predictive Architecture for Self-supervised Pretraining on Polymer Molecular Graphs


Abstract

Recent advances in machine learning (ML) have shown promise in accelerating polymer discovery, for example through virtual screening via property prediction and the design of new polymer materials with desired chemical properties. However, progress in polymer ML is hampered by the scarcity of high-quality labelled datasets, which are necessary for training supervised ML models. In this work, we study the recently proposed Joint Embedding Predictive Architecture (JEPA) for self-supervised learning (SSL) on polymer molecular graphs, to understand whether pretraining with the proposed SSL strategy improves downstream performance when labelled data is scarce. In doing so, this study aims to shed light on this new family of architectures in the molecular graph domain and to provide insights and directions for future research on JEPAs. Our experimental results indicate that JEPA self-supervised pretraining enhances downstream performance, particularly when labelled data is very scarce, achieving improvements across all tested datasets.
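The core idea behind a JEPA, as referenced above, can be sketched in a few lines: mask part of the input graph, encode the visible context and the masked target separately, and train a predictor to regress the target's *embedding* (not its raw features) from the context embedding, while the target encoder is updated as an exponential moving average (EMA) of the context encoder. The snippet below is a minimal, hypothetical illustration with linear encoders standing in for graph neural networks; the shapes, momentum value, and pooling choice are assumptions for demonstration, not details from this thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "molecular graph": a node feature matrix. A real JEPA for
# polymer graphs would use a GNN encoder instead of a linear map.
num_nodes, feat_dim, embed_dim = 8, 4, 6
X = rng.normal(size=(num_nodes, feat_dim))

# Hypothetical linear encoders (assumption): context encoder is trained,
# target encoder starts as a copy and is only updated via EMA.
W_context = rng.normal(size=(feat_dim, embed_dim))
W_target = W_context.copy()
W_pred = np.eye(embed_dim)  # predictor operating in latent space

# Mask a target subgraph; the context is the remaining nodes.
target_idx = np.array([5, 6, 7])
context_idx = np.setdiff1d(np.arange(num_nodes), target_idx)

# Encode both views and pool node embeddings into graph-level summaries.
s_context = (X[context_idx] @ W_context).mean(axis=0)
z_target = (X[target_idx] @ W_target).mean(axis=0)  # no gradient in practice

# Predict the target embedding from the context summary.
z_hat = s_context @ W_pred

# JEPA objective: regression in embedding space, not input space.
loss = float(np.mean((z_hat - z_target) ** 2))

# EMA ("momentum") update of the target encoder.
momentum = 0.99
W_target = momentum * W_target + (1 - momentum) * W_context
```

In a full training loop the loss would be backpropagated through the context encoder and predictor only, with the stop-gradient on the target branch and the EMA update preventing representational collapse.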

Files

Thesis_piccoli.pdf
(PDF | 19.4 MB)
Unknown license