Towards Robust Deep Learning

Deep Latent Variable Modeling against Out-of-Distribution and Adversarial Inputs

Abstract

As Deep Neural Networks (DNNs) continue to be deployed in safety-critical domains, two specific concerns — adversarial examples and Out-of-Distribution (OoD) data — pose significant threats to their reliability. This thesis proposes novel methods to enhance the robustness of deep learning by detecting such inputs and mitigating their impact.

A central insight of this work is that algorithmic stability plays a crucial role in generalizing to in-distribution data. Motivated by this, the thesis formulates a dual perspective on stability with respect to the hypotheses and explores whether this perspective facilitates the separation of problematic inputs through two main lenses: epistemic uncertainty estimation and the choice of an appropriate inductive bias.

By grounding our approach in latent-variable generative modeling based on an information bottleneck, and specifically in Variational Autoencoders (VAEs), we first leverage Bayesian inference over model parameters to estimate the model's uncertainty with respect to a particular input. Second, we investigate the properties required of both the VAE maps and the latent representations from a topological perspective, revealing that OoD inputs predominantly map onto empty regions, or "holes", in the latent manifold. Finally, we discover that adversarial examples exhibit similar behavior. These findings are then used to craft new scoring functions that reliably distinguish between inliers, outliers, and adversarial examples.
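
To make the last idea concrete, the sketch below is a hypothetical illustration, not the scoring functions developed in the thesis. It combines two stand-in signals from a small PyTorch VAE: the per-sample negative ELBO and the distance from an input's latent code to the nearest training code, loosely mirroring the intuition that suspicious inputs land in sparsely covered, "hole"-like regions of the latent space. All names here (TinyVAE, negative_elbo, latent_gap_score) are assumptions made for illustration.

```python
# Illustrative sketch only: a tiny VAE whose negative ELBO and latent-space gap
# to the training codes act as stand-in anomaly scores. Not the thesis's method.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyVAE(nn.Module):
    def __init__(self, x_dim: int = 784, z_dim: int = 8):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, 128), nn.ReLU())
        self.mu = nn.Linear(128, z_dim)
        self.logvar = nn.Linear(128, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim, 128), nn.ReLU(), nn.Linear(128, x_dim))

    def encode(self, x):
        h = self.enc(x)
        return self.mu(h), self.logvar(h)

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return self.dec(z), mu, logvar


def negative_elbo(model, x):
    """Per-sample negative ELBO: reconstruction error plus KL(q(z|x) || N(0, I))."""
    x_hat, mu, logvar = model(x)
    rec = F.mse_loss(x_hat, x, reduction="none").sum(dim=1)
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1)
    return rec + kl


@torch.no_grad()
def latent_gap_score(model, x, train_mu):
    """Distance from x's latent mean to the nearest training code: large values
    suggest the input fell into an empty region of the learned latent manifold."""
    mu, _ = model.encode(x)
    return torch.cdist(mu, train_mu).min(dim=1).values


if __name__ == "__main__":
    # Usage sketch with placeholder data: higher scores flag more suspicious inputs.
    model = TinyVAE()
    x_train = torch.rand(256, 784)   # stand-in for in-distribution training data
    x_test = torch.rand(16, 784)     # stand-in for inputs to screen
    with torch.no_grad():
        train_mu, _ = model.encode(x_train)
        score = negative_elbo(model, x_test) + latent_gap_score(model, x_test, train_mu)
    print(score)
```

In this toy setup the two terms are simply summed; the thesis itself develops and evaluates its own scoring functions, for which this snippet is only a rough analogue.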