Data Augmentation Techniques using Generative Adversarial Neural Networks on Side Channel Analysis

Abstract

Side-channel attacks can be performed in various ways: by measuring the power consumption or the electromagnetic emission of the targeted device, or even by measuring an algorithm's execution time on it. Methods of varying sophistication can exploit this information, either by training Machine Learning models on traces acquired from a profiling target or by directly applying statistical analysis to a black-box target. A successful attack in a black-box scenario requires a large number of traces, even for unprotected implementations. With state-of-the-art Machine Learning methods, on the other hand, a successful Side-channel attack can be performed with fewer traces. In either case, traces must first be acquired from the target. This acquisition phase is a time-consuming and computationally expensive procedure that requires hardware tools and human supervision. For this reason, in this thesis, we investigate whether synthetic Side-channel traces can be generated when the Hamming Weight (HW) model is used as the power model. To generate synthetic traces, we use Generative Adversarial Neural Networks (GANs), a Deep Learning approach that is a state-of-the-art augmentation method in many other fields, especially Computer Vision. We introduce a pipeline to define the GAN architecture and also to improve and evaluate it. In other words, we investigate new software tools that reduce the time and processing power needed to perform a successful Side-channel attack. Because the HW model is used, the Side-channel datasets follow a specific binomial class distribution, and with GANs we can augment a dataset so that this distribution is altered as desired.
For this reason, before performing data augmentation, we investigate how the profiling models' performance is affected by the class distribution of the traces. Next, we use our pipeline to define two GAN architectures: a shallow one that generates high-quality synthetic traces for unprotected implementations, and an enhanced one that is more stable but also more computationally expensive and can generate synthetic traces even when mild countermeasures such as jittering are in place. We apply the pipeline's methodology to unprotected datasets and to datasets with mild countermeasures such as jittering, taking first steps toward using GANs to augment datasets that consist of Side-channel traces. This allowed us to compare the GANs with the well-established Synthetic Minority Oversampling Technique (SMOTE); we observed that GANs achieved performance similar to SMOTE and, in some cases, better. This thesis showed that GANs can generate high-quality synthetic traces for unprotected implementations and for implementations with mild countermeasures, which suggests there is room for future investigation in this direction that might lead to more sophisticated architectures for data augmentation in the Side-channel Analysis field.
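
The binomial class distribution mentioned above can be illustrated with a short sketch. It assumes an 8-bit, uniformly distributed intermediate value (for example, a cipher S-box output byte); the abstract does not state the bit width, so that choice, the function names, and the "balance every class up to the majority class" target are all illustrative assumptions and not the thesis's stated method.

```python
from math import comb


def hamming_weight(value: int) -> int:
    """Return the number of set bits in a non-negative integer."""
    return bin(value).count("1")


def hw_class_probabilities(n_bits: int = 8) -> list:
    """Probability of each HW class (0..n_bits) for a uniform n_bits-bit value.

    The HW of a uniform value is Binomial(n_bits, 1/2), so
    P(HW = k) = C(n_bits, k) / 2**n_bits.
    """
    total = 2 ** n_bits
    return [comb(n_bits, k) / total for k in range(n_bits + 1)]


def hw_class_counts(n_bits: int = 8) -> list:
    """Count how many n_bits-bit values fall into each HW class (brute force)."""
    counts = [0] * (n_bits + 1)
    for value in range(2 ** n_bits):
        counts[hamming_weight(value)] += 1
    return counts


def synthetic_traces_needed(n_traces: int, n_bits: int = 8) -> list:
    """Expected number of extra traces per class to match the majority class.

    Hypothetical balancing target chosen for illustration only.
    """
    expected = [n_traces * p for p in hw_class_probabilities(n_bits)]
    majority = max(expected)
    return [round(majority - e) for e in expected]


if __name__ == "__main__":
    probs = hw_class_probabilities(8)
    needed = synthetic_traces_needed(25600, 8)
    for k, (p, extra) in enumerate(zip(probs, needed)):
        print(f"HW={k}: P={p:.4f}, extra traces to balance={extra}")
```

Under these assumptions the extreme classes (HW 0 and HW 8) each cover only 1 of 256 possible values, while HW 4 covers 70 of 256, which is why a dataset labelled with the HW model is heavily imbalanced before any augmentation.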