Abstract
Federated learning (FL) is a popular privacy-preserving distributed training scheme in which multiple devices collaborate to train machine learning models by uploading local model updates. To improve communication efficiency, over-the-air computation (AirComp) has been applied to FL. AirComp uses analog modulation to harness the superposition property of radio waves, so that numerous devices can upload their model updates concurrently for aggregation. However, uplink channel noise introduces considerable distortion into the aggregated model; this distortion depends critically on the device scheduling and degrades the performance of the learned model. In this paper, we propose a probabilistic device scheduling framework for over-the-air FL, named PO-FL, to mitigate the negative impact of channel noise. In PO-FL, each device is scheduled with a certain probability, and its model update is reweighted by this probability during aggregation. We prove that this aggregation scheme is unbiased and establish the convergence of PO-FL for both convex and non-convex loss functions. Our convergence bounds reveal that device scheduling affects the learning performance through the communication distortion and the global update variance. Based on the convergence analysis, we further develop a channel and gradient-importance aware algorithm to optimize the device scheduling probabilities in PO-FL. Extensive simulation results show that the proposed PO-FL framework with channel and gradient-importance awareness converges faster and produces better models than baseline methods.
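The reweighting idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes each device is scheduled independently (Bernoulli sampling), uses inverse-probability weights, and omits channel noise and the optimization of the probabilities. The function names `aggregate`, `full_average`, and `mean_over_rounds` are hypothetical. Under these assumptions, the average of many reweighted aggregates approaches the full-participation average, which is the sense in which such a scheme is unbiased.

```python
import random


def aggregate(updates, probs, rng):
    """One round: schedule each device independently with its probability
    (an assumption of this sketch) and average the scheduled updates,
    each divided by its scheduling probability."""
    n = len(updates)
    dim = len(updates[0])
    total = [0.0] * dim
    for update, p in zip(updates, probs):
        if rng.random() < p:  # device is scheduled in this round
            for k in range(dim):
                total[k] += update[k] / p  # inverse-probability reweighting
    return [t / n for t in total]


def full_average(updates):
    """Reference value: plain average over all devices."""
    n = len(updates)
    dim = len(updates[0])
    return [sum(u[k] for u in updates) / n for k in range(dim)]


def mean_over_rounds(updates, probs, rounds, seed=0):
    """Empirical mean of the reweighted aggregate over many rounds."""
    rng = random.Random(seed)
    dim = len(updates[0])
    acc = [0.0] * dim
    for _ in range(rounds):
        estimate = aggregate(updates, probs, rng)
        for k in range(dim):
            acc[k] += estimate[k]
    return [a / rounds for a in acc]


if __name__ == "__main__":
    # Toy local updates and scheduling probabilities (illustrative values).
    updates = [[1.0, 2.0], [3.0, -1.0], [5.0, 0.5], [-2.0, 4.0]]
    probs = [0.9, 0.5, 0.3, 0.7]
    print("full average:     ", full_average(updates))
    print("empirical average:", mean_over_rounds(updates, probs, 20000))
```

Devices with a low scheduling probability contribute rarely but with a large weight, which keeps the estimate unbiased at the cost of higher variance. This is consistent with the abstract's remark that scheduling affects learning through the global update variance.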
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 6905-6920 |
| Number of pages | 16 |
| Journal | IEEE Transactions on Wireless Communications |
| Volume | 23 |
| Issue number | 7 |
| Early online date | 4 Dec 2023 |
| Publication status | Published - Jul 2024 |
| Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2002-2012 IEEE.
Funding
This work was supported in part by the Hong Kong Research Grants Council under the Areas of Excellence Scheme under Grant AoE/E-601/22-R and in part by the NSFC/Research Grants Council (RGC) Collaborative Research Scheme under Grant CRS_HKUST603/22. An earlier version of this paper was presented in part at the 2023 IEEE 23rd International Conference on Communication Technology (ICCT) [1]. The associate editor coordinating the review of this article and approving it for publication was X. Gong.
Keywords
- channel awareness
- device scheduling
- federated learning (FL)
- gradient importance
- over-the-air computation (AirComp)