Abstract
We study label privacy protection in vertical federated learning (VFL). VFL enables an active party that possesses labeled data to improve model performance (utility) by collaborating with passive parties that hold auxiliary features. Recently, there has been growing concern about protecting label privacy against passive parties who may surreptitiously deduce private labels from the output of their bottom models. In contrast to existing defense methods that focus on training-phase perturbation, we propose a novel offline-phase cleansing approach that protects label privacy while barely compromising utility. Specifically, we first formulate a Label Privacy Source Coding (LPSC) problem to remove from the labels the redundant label information already contained in the active party’s features, by assigning each sample a new weight and label (i.e., residual) for federated training. We theoretically demonstrate that LPSC 1) satisfies ϵ-mutual information privacy (ϵ-MIP) and 2) can be reduced to the gradient boosting objective and thereby optimized efficiently. Accordingly, we propose a gradient boosting-based LPSC method to protect label privacy. Moreover, given that LPSC provides only bounded privacy enhancement, we further introduce the two-phase LPSC+ framework, which enables a flexible privacy-utility trade-off by incorporating training-phase perturbation methods, such as adversarial training. Experimental results on four real-world datasets substantiate the efficacy of LPSC and the superiority of our LPSC+ framework.
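The idea of replacing each label with a per-sample weight and residual can be illustrated with a rough sketch. The code below is not the paper's LPSC algorithm, whose exact formulation is not given in this abstract. It assumes a binary task and uses the textbook second-order (Newton) gradient boosting step for logistic loss: the active party fits a model on its own features only, then derives a weight `p * (1 - p)` and a residual `(y - p) / (p * (1 - p))` for each sample. The function names, the synthetic data, and the plain logistic-regression local model are all hypothetical choices made for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def add_bias(X):
    # Append a constant column so the linear model has an intercept.
    return np.hstack([X, np.ones((X.shape[0], 1))])


def fit_local_logistic(X, y, lr=0.5, steps=500):
    """Hypothetical local model: logistic regression trained by gradient
    descent on the active party's own features only."""
    Xb = add_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = sigmoid(Xb @ w)
        w -= lr * Xb.T @ (p - y) / len(y)
    return w


def newton_residuals(X, y, w, eps=1e-6):
    """Standard Newton boosting targets for logistic loss (an assumption,
    not the paper's definition): weight = p(1-p), residual = (y-p)/weight."""
    p = np.clip(sigmoid(add_bias(X) @ w), eps, 1.0 - eps)
    weight = p * (1.0 - p)
    residual = (y - p) / weight
    return residual, weight, p


# Synthetic stand-in for the active party's features and private labels.
rng = np.random.default_rng(0)
X_active = rng.normal(size=(200, 3))
y = (X_active[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(float)

w = fit_local_logistic(X_active, y)
residual, weight, p = newton_residuals(X_active, y, w)
```

In this sketch, samples that the local model already predicts confidently and correctly receive small weights, so the quantities exposed to federated training carry less of the label information that the active party's features already explain.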
Original language | English |
---|---|
Title of host publication | Machine Learning and Knowledge Discovery in Databases. Research Track - European Conference, ECML PKDD 2024, Proceedings |
Editors | Albert Bifet, Jesse Davis, Tomas Krilavičius, Meelis Kull, Eirini Ntoutsi, Indrė Žliobaitė |
Publisher | Springer Science and Business Media Deutschland GmbH |
Pages | 313-331 |
Number of pages | 19 |
ISBN (Print) | 9783031703409 |
Publication status | Published - 22 Aug 2024 |
Event | European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2024 - Vilnius, Lithuania. Duration: 9 Sept 2024 → 13 Sept 2024 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 14941 LNAI |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2024 |
---|---|
Country/Territory | Lithuania |
City | Vilnius |
Period | 9/09/24 → 13/09/24 |
Bibliographical note
Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
Keywords
- Mutual information privacy
- Vertical federated learning