Improving Defenses against Backdoors in Federated Learning using Data Generation
Abstract
In this work, we propose a general solution to the non-IID data challenge that undermines many defense methods against backdoor attacks in federated learning. In a backdoor attack, malicious clients attempt to poison the global model. Many defenses filter out these malicious clients effectively using clustering techniques, but their effectiveness diminishes when the federated learning process involves non-IID datasets: clustering then struggles to distinguish benign from malicious clients because of the inherent variability in the clients' data distributions.
Our proposed solution leverages data generation to mitigate the non-IID nature of clients' local datasets. By augmenting each client's dataset with synthetic samples, the local data distributions become more IID, enabling the defense methods to counter backdoor attacks effectively once more. We evaluate the approach on standard image-classification datasets, MNIST and CIFAR-10. The results show that data generation substantially improves the performance of the defense methods and restores their ability to filter out malicious clients. Although the generated samples may suffer from low quality and limited diversity due to the constraints of training generative adversarial networks (GANs), our approach still yields significant improvements in defending against backdoors.
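The core mechanism can be illustrated with a minimal sketch: each client tops up its under-represented classes with synthetic samples so its label histogram approaches uniformity, which is what lets clustering-based defenses separate benign from malicious updates again. This is a hypothetical simplification, not the paper's implementation; in particular, `fake_generator` is a stand-in for a trained GAN, and the function and variable names are illustrative only.

```python
def balance_with_synthetic(label_counts, generate):
    """Top up each under-represented class with synthetic samples
    so the client's label histogram becomes (near-)uniform.

    label_counts: dict mapping class label -> number of real samples.
    generate: callable(label, n) returning n synthetic samples;
              in the actual method this would be a trained GAN.
    """
    target = max(label_counts.values())
    synthetic = {}
    for label, count in label_counts.items():
        need = target - count
        if need > 0:
            synthetic[label] = generate(label, need)
    return synthetic

# Stand-in "generator": returns placeholder samples for a class.
def fake_generator(label, n):
    return [f"synthetic_{label}"] * n

# A skewed (non-IID) client: mostly class 0, few samples of classes 1 and 2.
client_counts = {0: 90, 1: 10, 2: 5}
extra = balance_with_synthetic(client_counts, fake_generator)
balanced = {k: client_counts[k] + len(extra.get(k, [])) for k in client_counts}
print(balanced)  # every class now holds 90 samples
```

After augmentation the clients' label distributions are much closer to one another, so the variability a clustering defense observes across benign updates shrinks, and malicious updates stand out again.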