Label Privacy Source Coding in Vertical Federated Learning

Dashan GAO*, Sheng WAN, Hanlin GU, Lixin FAN, Xin YAO, Qiang YANG

*Corresponding author for this work

Research output: Book Chapters | Papers in Conference Proceedings, Conference paper (refereed), Refereed Conference Paper, peer-review

Abstract

We study label privacy protection in vertical federated learning (VFL). VFL enables an active party who possesses labeled data to improve model performance (utility) by collaborating with passive parties who have auxiliary features. Recently, there has been growing concern about protecting label privacy against passive parties who may surreptitiously deduce private labels from the output of their bottom models. In contrast to existing defense methods that focus on training-phase perturbation, we propose a novel offline-phase cleansing approach that protects label privacy while barely compromising utility. Specifically, we first formulate a Label Privacy Source Coding (LPSC) problem to remove from the labels the redundant label information already contained in the active party’s features, by assigning each sample a new weight and label (i.e., residual) for federated training. We theoretically demonstrate that LPSC 1) satisfies ϵ-mutual information privacy (ϵ-MIP) and 2) can be reduced to gradient boosting’s objective and can thereby be optimized efficiently. We therefore propose a gradient boosting-based LPSC method to protect label privacy. Moreover, given that LPSC only provides bounded privacy enhancement, we further introduce the two-phase LPSC+ framework, which enables a flexible privacy-utility trade-off by incorporating training-phase perturbation methods, such as adversarial training. Experimental results on four real-world datasets substantiate the efficacy of LPSC and the superiority of our LPSC+ framework.
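To make the "new weight and label (i.e., residual)" idea concrete, the following is a minimal sketch of a generic second-order (Newton) gradient boosting step for binary log-loss. It is not the authors' LPSC algorithm, whose exact form the abstract does not give. The assumption here is that the active party first fits a local model on its own features and then derives a per-sample residual and weight from that model's predictions. All names (`fit_local_logistic`, `newton_boosting_targets`, `X_active`) and the synthetic data are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_local_logistic(X, y, lr=0.1, steps=500):
    """Gradient-descent logistic regression on the active party's own features.
    (Assumed stand-in for whatever local model the active party would train.)"""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = sigmoid(X @ w + b)
        grad = p - y                      # d(log-loss)/d(logit) per sample
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

def newton_boosting_targets(y, p, eps=1e-6):
    """Standard Newton boosting targets for log-loss:
    weight = hessian = p(1-p), residual = -gradient / hessian = (y-p) / (p(1-p))."""
    p = np.clip(p, eps, 1.0 - eps)        # avoid division by zero
    hess = p * (1.0 - p)
    resid = (y - p) / hess
    return resid, hess

# Synthetic data (hypothetical): labels partly explained by the active party's features.
rng = np.random.default_rng(0)
X_active = rng.normal(size=(200, 3))
y = (X_active[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(float)

w, b = fit_local_logistic(X_active, y)
p_local = sigmoid(X_active @ w + b)
residual, weight = newton_boosting_targets(y, p_local)
```

Under this assumed setup, federated training would fit `residual` with per-sample `weight` in place of the raw labels, so the passive parties' bottom models are trained only on what the active party's own features fail to explain.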

Original language: English
Title of host publication: Machine Learning and Knowledge Discovery in Databases. Research Track - European Conference, ECML PKDD 2024, Proceedings
Editors: Albert Bifet, Jesse Davis, Tomas Krilavičius, Meelis Kull, Eirini Ntoutsi, Indrė Žliobaitė
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 313-331
Number of pages: 19
ISBN (Print): 9783031703409
DOIs
Publication status: Published - 22 Aug 2024
Event: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2024 - Vilnius, Lithuania
Duration: 9 Sept 2024 to 13 Sept 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14941 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2024
Country/Territory: Lithuania
City: Vilnius
Period: 9/09/24 to 13/09/24

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.

Keywords

  • Mutual information privacy
  • Vertical federated learning
