Generalization by Visual Attention


Abstract

Most deep learning models fail to generalize in production, because the data used during training often does not completely reflect the deployment environment; the test data is then out-of-distribution with respect to the training data. In this paper, we focus on out-of-distribution performance for image classification. Transformers, a more recent neural network architecture than the traditionally used convolutional neural networks (CNNs), have been shown to perform well on image classification. We therefore first compare the out-of-distribution capabilities of both models, and then conduct an in-depth investigation of the individual architectural components of the transformer and their impact on the generalization capability of the model.
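As a rough illustration of the first comparison (not taken from the paper), the sketch below evaluates an ImageNet-pretrained CNN and vision transformer on a distribution-shifted test set with torchvision; the `ood_loader` data loader is a hypothetical placeholder for whichever out-of-distribution benchmark is used.

```python
# Minimal sketch: comparing a pretrained CNN and ViT on an out-of-distribution
# test set. `ood_loader` is an assumed DataLoader over a distribution-shifted,
# ImageNet-label-compatible dataset (e.g. stylized or corrupted images).
import torch
from torch.utils.data import DataLoader
from torchvision import models


def top1_accuracy(model: torch.nn.Module, loader: DataLoader, device: str = "cpu") -> float:
    """Top-1 accuracy of `model` over the given (possibly OOD) test loader."""
    model.eval().to(device)
    correct, total = 0, 0
    with torch.no_grad():
        for images, labels in loader:
            logits = model(images.to(device))
            correct += (logits.argmax(dim=1).cpu() == labels).sum().item()
            total += labels.numel()
    return correct / max(total, 1)


if __name__ == "__main__":
    # ImageNet-pretrained backbones: a convolutional model and a vision transformer.
    cnn = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
    vit = models.vit_b_16(weights=models.ViT_B_16_Weights.IMAGENET1K_V1)
    # With an actual OOD loader in hand, the comparison would simply be:
    # for name, m in [("ResNet-50", cnn), ("ViT-B/16", vit)]:
    #     print(name, top1_accuracy(m, ood_loader))
```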

Files