Tackling Data Non-IIDness in Multi-server Asynchronous FL

Abstract

Federated Learning (FL) is a distributed machine learning approach that enhances data privacy by training models across multiple devices or servers without centralizing raw data. Traditional FL frameworks, which rely on synchronous updates and homogeneous resources, face significant challenges in real-world deployments, including heterogeneous (non-IID) data distributions and varying computational resources across clients, both of which degrade model performance. To address these issues, asynchronous FL frameworks have been proposed, but they introduce new complexities such as communication latency and workload management across geo-distributed servers.

This paper focuses on the impact of network latency and non-IID data distributions on asynchronous multi-server FL systems. We propose and evaluate three methods to mitigate the adverse effects of data heterogeneity on model accuracy: (1) transferring clients between servers; (2) sharing clients among servers so that they alternately train their models; and (3) sharing clients among servers with model averaging. Our contributions include a reproducible experimental framework for multi-server FL, strategies for optimizing client-server interactions, and an analysis of the effectiveness of these strategies in reducing the impact of non-IID data distributions. Experimental results demonstrate that our methods can reduce training time by up to 85.67%, decrease the number of updates required by 85.82%, and improve accuracy by 4.815% in heterogeneous environments, compared to Spyker, a state-of-the-art multi-server FL algorithm.
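As a rough illustration of strategy (3), the sketch below shows how two servers that share clients might periodically average their model parameters, weighting each server by the number of client updates it has aggregated. The function name, the PyTorch state-dict representation, and the weighting scheme are assumptions for illustration, not the paper's actual implementation.

```python
# Illustrative sketch of cross-server model averaging (strategy 3).
# Assumes PyTorch-style state dicts; names and weighting are hypothetical.
from typing import Dict, List
import torch


def average_state_dicts(state_dicts: List[Dict[str, torch.Tensor]],
                        weights: List[float]) -> Dict[str, torch.Tensor]:
    """Return a weighted average of model parameters from several servers.

    `weights` could, for example, be the number of client updates each
    server has aggregated since the last averaging round.
    """
    total = sum(weights)
    averaged: Dict[str, torch.Tensor] = {}
    for key in state_dicts[0]:
        averaged[key] = sum((w / total) * sd[key]
                            for w, sd in zip(weights, state_dicts))
    return averaged


# Usage sketch: two servers exchange state dicts and adopt the average.
# merged = average_state_dicts([server_a.state_dict(), server_b.state_dict()],
#                              weights=[updates_a, updates_b])
# server_a.load_state_dict(merged); server_b.load_state_dict(merged)
```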
