Optimizing Federated Incremental Learning: Efficient Malicious Data Removal for Big Data Analytics

Kongyang CHEN, Wengao LI, Jiannong CAO, Bing MI*, Jiaxing SHEN

*Corresponding author for this work

Research output: Journal Publications › Journal Article (refereed) › peer-review

Abstract

Federated incremental learning facilitates decentralized and continuous model updates across multiple clients, presenting a promising framework for big data analytics in distributed environments. However, the presence of poisoned or malicious data introduces significant challenges, including compromised model performance and system reliability. To tackle these issues, this paper proposes an efficient and resource-aware machine unlearning method tailored for federated incremental learning. The approach utilizes a membership inference attack mechanism to accurately identify poisoned data based on prediction confidence levels. Once detected, a targeted forgetting mechanism is applied, leveraging fine-tuning techniques to erase the influence of the poisoned data while preserving the model’s incremental learning capabilities. By aligning the distributions of poisoned data and third-party datasets, the method achieves reliable unlearning without introducing excessive computational overhead. Extensive experiments conducted on diverse datasets validate the method’s effectiveness, demonstrating a significant reduction in forgetting time (up to 21.05× speedup compared to baseline approaches) while maintaining robust model performance in incremental learning tasks. This work offers a scalable and efficient solution to the data forgetting problem, advancing the reliability and practicality of federated incremental learning in distributed and resource-constrained scenarios.
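
The abstract's detection step, flagging data by prediction confidence, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`softmax`, `max_confidence`, `flag_members`), the 0.9 threshold, and the toy logits are all assumptions made for illustration. It shows only the general idea behind a confidence-based membership test, in which samples that the model predicts with unusually high confidence are treated as likely to have been in its training data.

```python
import math


def softmax(logits):
    """Convert raw class scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def max_confidence(logits):
    """Top-class probability, used here as the membership signal."""
    return max(softmax(logits))


def flag_members(batch_logits, threshold=0.9):
    """Return indices of samples whose top-class confidence >= threshold.

    The threshold is a hypothetical value; in practice it would be
    calibrated, for example against data the model has never seen.
    """
    return [
        i
        for i, logits in enumerate(batch_logits)
        if max_confidence(logits) >= threshold
    ]


# Toy model outputs (assumed values, not experimental data).
batch = [
    [8.0, 0.5, 0.1],  # one dominant class: high confidence
    [1.0, 0.9, 1.1],  # nearly uniform: low confidence
    [0.2, 6.0, 0.3],  # one dominant class: high confidence
]
flagged = flag_members(batch, threshold=0.9)
print(flagged)
```

Under this reading, running the same check after unlearning gives a rough success signal: if the forgotten samples are no longer flagged and their confidences resemble those of unseen third-party data, the model plausibly no longer retains them.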
Original language: English
Journal: Tsinghua Science and Technology
Publication status: E-pub ahead of print - 3 Jul 2025

Funding

This work was supported by the National Key R&D Program of China (No. 2023YFC3321300), the Guangdong Regional Joint Fund Project (No. 2022A1515110157), the Guangdong Basic and Applied Basic Research Project (No. 2025A1515012874), the Research Project of Pazhou Lab for Excellent Young Scholars (No. PZL2021KF0024), the Foundation of Yunnan Key Laboratory of Service Computing (No. YNSC24115), the Guangdong Undergraduate Teaching Quality and Teaching Reform Project, the University Research Project of Guangzhou Education Bureau (No. 2024312189), and the Guangzhou Basic and Applied Basic Research Project (No. SL2024A03J00397).

Keywords

  • Big Data Analytics
  • Federated Learning
  • Incremental Learning
  • Machine Unlearning
  • Data Privacy
