Generative Adversarial Nets for generating synthetic Imaging Mass Spectrometry data

van der Linden, W.

Master thesis (2024)

Authors

W. van der Linden (Mechanical Engineering)

Contributors

Raf Van De Plas (mentor), Team Raf Van de Plas, Mechanical, Maritime and Materials Engineering

G. Gleizer (graduation committee member), Team Sander Wahls, Mechanical, Maritime and Materials Engineering

R.A.R. Moens (coach), Team Raf Van de Plas, Mechanical, Maritime and Materials Engineering

Faculty

Mechanical Engineering

Keywords

Synthetic Data Generation, Unbalanced Data, Generative Adversarial Nets (GANs), Imaging Mass Spectrometry (IMS)

To reference this document use:

http://resolver.tudelft.nl/uuid:9f3b18e7-dbb6-47bc-974f-fb9769e6410e

Published Date

09-07-2024

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This report investigates the use of Generative Adversarial Nets (GANs) for oversampling Imaging Mass Spectrometry (IMS) spectra. IMS is a technique for measuring the spatial distribution of molecules, which is valuable in fields such as oncology and biomarker discovery. GANs are a class of machine learning frameworks in which two neural networks, the generator and the discriminator, are trained simultaneously in an adversarial process. The generator creates synthetic data, while the discriminator tries to distinguish between real and synthetic data.
GAN-based oversampling aims to increase classifier performance by adding synthetic data to classes that are underrepresented in the original data. Synthetic oversampling is especially relevant for IMS data because the measurement technique is destructive, which makes acquiring more real samples impossible. GANs have been shown to outperform other oversampling techniques such as SMOTE on various datasets. In this oversampling task, however, applying GANs directly to the dataset proved unsuccessful.
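As a rough illustration of the balancing step behind oversampling, the following minimal sketch computes how many synthetic spectra each class needs to match the largest class and then appends them. All names here (`synthetic_counts`, `oversample`, `placeholder_sampler`) and the toy data are hypothetical and not taken from the thesis. In particular, `placeholder_sampler` is only a noise-jitter stand-in for a trained generator, not a GAN.

```python
import random
from collections import Counter


def synthetic_counts(labels):
    """Map each class to the number of synthetic samples needed to match the largest class."""
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


def placeholder_sampler(spectra, n, rng):
    """Hypothetical stand-in for a trained generator: jitter real spectra with small noise."""
    out = []
    for _ in range(n):
        base = rng.choice(spectra)
        out.append([v + rng.gauss(0.0, 0.01) for v in base])
    return out


def oversample(X, y, sample_fn, rng):
    """Append synthetic samples per class until every class has the same size."""
    X_out, y_out = list(X), list(y)
    for cls, n_missing in synthetic_counts(y).items():
        if n_missing == 0:
            continue  # the majority class needs no synthetic data
        real = [x for x, label in zip(X, y) if label == cls]
        X_out.extend(sample_fn(real, n_missing, rng))
        y_out.extend([cls] * n_missing)
    return X_out, y_out


# Toy imbalanced data: 8, 3 and 1 "spectra" of 5 intensity values each.
rng = random.Random(0)
y = [0] * 8 + [1] * 3 + [2] * 1
X = [[rng.random() for _ in range(5)] for _ in y]

X_bal, y_bal = oversample(X, y, placeholder_sampler, rng)
print(Counter(y_bal))  # every class now has as many samples as the largest class
```

In a GAN-based setup, the sampler would instead draw from a generator trained on the minority class, and the classifier would be trained on the balanced set and evaluated on real data only.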
Several possible causes of the limited performance of the GANs are studied, leading to improved experimental results when using dimension-reduced spectra and Wasserstein GANs with gradient penalty. Although the GANs appear to generate more realistic data with these changes, using this data for oversampling does not increase overall classifier performance. Rather, it steers the classifier towards overfitting on the minority classes.
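For reference, the Wasserstein GAN with gradient penalty is commonly written with the critic and generator losses below. This is the standard textbook formulation, given here as an assumption; the exact variant and hyperparameters used in the thesis are not stated in this abstract. Here $D$ is the critic, $G$ the generator, $P_r$ the real data distribution, $P_g$ the generator distribution, $p(z)$ the noise prior, and $\lambda$ the penalty weight.

```latex
\begin{aligned}
L_D &= \mathbb{E}_{\tilde{x} \sim P_g}\big[D(\tilde{x})\big]
     - \mathbb{E}_{x \sim P_r}\big[D(x)\big]
     + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}
       \Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\Big], \\
L_G &= -\,\mathbb{E}_{z \sim p(z)}\big[D(G(z))\big], \\
\hat{x} &= \epsilon\, x + (1 - \epsilon)\, \tilde{x}, \qquad \epsilon \sim U[0, 1].
\end{aligned}
```

The penalty term pushes the norm of the critic's gradient towards 1 at points interpolated between real and generated samples, which is a soft way of enforcing the Lipschitz constraint that the Wasserstein formulation requires.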
This report demonstrates that applying the designed GANs for oversampling minority classes on this dataset does not increase classifier performance. However, it also shows that GANs can be trained on IMS data and that GANs might be useful for applications with IMS data other than oversampling.

Files

Thesis_WillemvanderLinden.pdf

(pdf | 4.03 Mb)

Unknown license