Privacy concerns in federated learning have attracted considerable attention recently. In centralized networks, it has been observed that even without directly exchanging raw training data, the exchange of so-called intermediate parameters such as model weights or gradients can still reveal private information. However, relatively little research has addressed privacy concerns in decentralized networks.
In this report, we analyze privacy leakage in optimization-based decentralized federated learning, which applies general distributed optimization schemes such as ADMM or PDMM to federated learning. By combining local updates with global aggregations, such optimization-based approaches have been shown to outperform traditional average consensus-based approaches, especially in scenarios where the data at the nodes are not independent and identically distributed (non-IID). A rough sketch of one local update of this kind is given after this paragraph.
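To make the flavor of such optimization-based schemes concrete, the following minimal sketch shows an ADMM-style local step in which a node minimizes its own loss plus a quadratic consensus penalty towards its neighbours' current estimates. The least-squares loss, the variable names (A_i, b_i, z_i) and the penalty parameter rho are illustrative assumptions only, not the exact formulation analyzed in this report.

import numpy as np

# Toy ADMM-style consensus step (illustrative only): node i holds private
# least-squares data (A_i, b_i) and is coupled to its neighbours through a
# quadratic consensus penalty with parameter rho.
def local_admm_step(A_i, b_i, neighbour_models, z_i, rho=1.0):
    x_bar = np.mean(neighbour_models, axis=0)   # aggregate neighbours' estimates
    d = A_i.shape[1]
    # Closed-form minimiser of 0.5*||A_i x - b_i||^2 + 0.5*rho*||x - (x_bar - z_i)||^2
    x_i = np.linalg.solve(A_i.T @ A_i + rho * np.eye(d),
                          A_i.T @ b_i + rho * (x_bar - z_i))
    z_i = z_i + x_i - x_bar                     # dual (correction) update
    return x_i, z_i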
We further extend the privacy bound in distributed optimization to the decentralized learning framework. Unlike the centralized setting, where the leaked information is each participant's local gradient at every round, we find that in the decentralized case the leaked information is the difference of the local gradients over a certain time interval. Motivated by gradient inversion attacks in centralized networks, we then design a homogeneous attack that iteratively optimizes dummy data whose gradient differences are close to the true revealed gradient differences; a sketch of this idea follows below. Although gradient differences still raise privacy concerns, we show that reconstructing private data from gradient differences is more challenging for an adversary than reconstructing it from the gradients themselves, as in the centralized case.
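The sketch below illustrates the gradient-difference inversion idea in PyTorch, following the DLG-style recipe rather than the exact attack used in this report. The names model_t, model_s (two snapshots of the shared model at rounds t and s) and observed_diff (the leaked gradient difference between those rounds) are illustrative assumptions.

import torch

def soft_cross_entropy(logits, soft_targets):
    # Cross-entropy with soft (learnable) labels, as in DLG-style attacks.
    return -(soft_targets * torch.log_softmax(logits, dim=-1)).sum()

def invert_from_gradient_difference(model_t, model_s, observed_diff,
                                    data_shape, num_classes,
                                    steps=300, lr=0.1):
    # Dummy input and soft label, optimized so that the *difference* of the
    # dummy gradients under the two model snapshots matches the leaked difference.
    dummy_x = torch.randn(data_shape, requires_grad=True)
    dummy_y = torch.randn(1, num_classes, requires_grad=True)
    opt = torch.optim.Adam([dummy_x, dummy_y], lr=lr)

    for _ in range(steps):
        opt.zero_grad()
        g_t = torch.autograd.grad(
            soft_cross_entropy(model_t(dummy_x), dummy_y.softmax(dim=-1)),
            model_t.parameters(), create_graph=True)
        g_s = torch.autograd.grad(
            soft_cross_entropy(model_s(dummy_x), dummy_y.softmax(dim=-1)),
            model_s.parameters(), create_graph=True)
        # Match the dummy gradient difference to the observed one.
        match_loss = sum(((gt - gs) - d).pow(2).sum()
                         for gt, gs, d in zip(g_t, g_s, observed_diff))
        match_loss.backward()
        opt.step()
    return dummy_x.detach(), dummy_y.detach()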
To mitigate this privacy attack, we propose several potential defense strategies, such as early stopping, inexact updates, and quantization. The main advantage of these approaches is that they introduce error, noise, or distortion into decentralized federated learning to protect private information from being revealed to others, without affecting training accuracy; a sketch of the quantization defense is given below. In addition, we show that the larger the batch size, the more difficult it is for the adversary to reconstruct the private information.
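As a minimal sketch of the quantization defense, the functions below quantize a model update before it is shared with neighbours, so an adversary only observes a distorted version of the true gradient difference. The step size delta and the subtractively dithered variant are illustrative choices, not necessarily the exact scheme evaluated in this report.

import numpy as np

def quantize_update(update, delta=1e-2):
    # Uniform (deterministic) quantization of a model update to a grid of step delta.
    return delta * np.round(update / delta)

def dithered_quantize_update(update, delta=1e-2, rng=None):
    # Subtractively dithered variant: uniform dither is added before rounding and
    # subtracted afterwards, making the quantization error independent of the update.
    rng = np.random.default_rng() if rng is None else rng
    dither = rng.uniform(-delta / 2, delta / 2, size=update.shape)
    return delta * np.round((update + dither) / delta) - dither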