On the Sequential Data Models in Side-Channel Analysis

RNN, LSTM, GRU Hyperparameters, Autoencoder and Embedding Layer


Abstract

A side-channel attack analyzes unwanted physical leakage to mount a more effective attack on the cryptographic key. An attacker performs a profiled attack when they possess an identical physical copy of the target device, meaning the attacker has full control over it. Profiled attacks are therefore known as the most powerful attacks in side-channel analysis. The physical leakage is analyzed with machine learning and, in recent years, mostly deep learning; both are used as profiling tools to perform a side-channel attack. The best-known deep learning technique for side-channel analysis at this moment is the convolutional neural network (CNN). This thesis, however, investigates well-known deep learning models that have not been used before in side-channel analysis: the sequential data models RNN, LSTM, and GRU are tested and evaluated to find the best hyperparameters. We show the influence of the model type, number of layers, dropout, activation function, number of units, recurrent dropout, and batch size in the experiments. We also show that using different sequence lengths speeds up training; to reduce the sequence length, we use a linear regression technique. We then show that sequential data models are a suitable alternative for side-channel analysis, although their results do not surpass those of CNNs. Next, we experiment with an autoencoder as a preprocessing algorithm to "clean" noisy traces. We show that the LSTM autoencoder easily removes a noise-based hiding countermeasure; a delay-based hiding countermeasure is more challenging for the LSTM autoencoder, and the combination of both countermeasures appears infeasible for it. The quality of the cleaned traces is also reflected in the guessing entropy. Lastly, we use an embedding layer as the first layer of an MLP, a CNN, and a sequential data model in side-channel analysis. We experiment with different output dimensions and conclude that an embedding layer is a valid alternative for changing the data dimension when using an MLP or a sequential data model.
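The embedding-layer idea can be illustrated with a minimal sketch in plain numpy (a hypothetical stand-in for a trainable framework layer, not the thesis implementation): each discrete trace sample, here assumed to be quantized to an 8-bit integer, is mapped through a lookup table to a dense vector whose size is the chosen output dimension.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size = 256   # assumed: trace samples quantized to 8-bit values 0..255
output_dim = 16    # example output dimension; one of the values to tune

# Lookup table with one dense vector per possible input value.
# In a real model this table is a trainable weight matrix.
embedding_table = rng.normal(size=(vocab_size, output_dim))

def embed(traces):
    """Replace each integer sample with its embedding vector."""
    return embedding_table[traces]

# A batch of 4 traces, each 700 samples long (shapes chosen for illustration).
traces = rng.integers(0, vocab_size, size=(4, 700))
embedded = embed(traces)
print(embedded.shape)  # (4, 700, 16)
```

The output tensor has one extra axis of size `output_dim`, which is what makes the embedding layer a way to change the data dimension before an MLP, CNN, or sequential data model.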

Files