Abstract
Federated learning (FL) is a distributed learning paradigm that unlocks the potential of data-driven models for edge devices without requiring them to share their raw data. However, devices often hold non-independent and identically distributed (non-IID) data, meaning their local data distributions can vary significantly. This heterogeneity in input data distributions across devices, commonly referred to as the feature shift problem, can adversely affect the convergence and accuracy of the global model. To analyze the intrinsic causes of the feature shift problem, we derive a generalization error bound for FL, which motivates FedCiR, a client-invariant representation learning framework that enables clients to extract informative, client-invariant features. Specifically, we increase the mutual information between representations and labels to encourage representations to carry essential classification knowledge, and reduce the mutual information between the client set and representations conditioned on labels to make clients' representations client-invariant. We further incorporate two regularizers into the FL framework that bound these mutual information terms using an approximate global representation distribution, compensating for the absence of the ground-truth global representation distribution and thus achieving informative, client-invariant feature extraction. To approximate the global representation distribution, we propose a data-free mechanism performed by the server without compromising privacy. Extensive experiments demonstrate the effectiveness of our approach in achieving client-invariant representation learning and mitigating data heterogeneity.
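The two mutual information terms described in the abstract can be sketched as a single objective. This is a paraphrase for illustration only, not the paper's exact formulation: the encoder parameters $\theta$, trade-off weight $\lambda$, and the symbols $Z$ (representation), $Y$ (label), and $C$ (client identity) are assumed notation.

```latex
% Sketch of the FedCiR-style objective (assumed notation):
% maximize label-relevant information in representations Z,
% while penalizing client-identifying information given the label.
\max_{\theta}\; I(Z; Y) \;-\; \lambda\, I(C; Z \mid Y)
```

In this reading, the first term encourages representations to carry essential classification knowledge, while the second (conditioned on labels so class-relevant differences are not penalized) pushes representations from different clients toward a shared, client-invariant distribution.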
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 10509-10522 |
| Number of pages | 14 |
| Journal | IEEE Transactions on Mobile Computing |
| Volume | 23 |
| Issue number | 11 |
| Early online date | 18 Mar 2024 |
| DOIs | |
| Publication status | Published - Nov 2024 |
| Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2024 IEEE.
Funding
This work was supported in part by the Hong Kong Research Grants Council under the Areas of Excellence scheme under Grant AoE/E-601/22-R and in part by NSFC/RGC Collaborative Research Scheme under Grant CRS_HKUST603/22.
Keywords
- edge intelligence
- federated learning (FL)
- non-independent and identically distributed (non-IID) data
- representation learning