M3C: Resist Agnostic Attacks by Mitigating Consistent Class Confusion Prior

Xiaowei FU, Fuxiang HUANG, Guoyin WANG, Xinbo GAO, Lei ZHANG*

*Corresponding author for this work

Research output: Journal Publications > Journal Article (refereed) > peer-review

Abstract

Adversarial attacks are a major obstacle to the deployment of deep neural networks (DNNs) in security-sensitive applications. To counter such perturbations, various adversarial defense strategies have been developed, with Adversarial Training (AT) being one of the most effective methods for protecting neural networks. However, existing AT methods struggle against training-agnostic attacks due to their limited generalizability, suggesting that AT models lack a unified perspective on diverse attacks from which to mount a universal defense. This paper sheds light on a generalizable prior shared across various attacks: consistent class confusion (3C), i.e., an AT classifier often confuses the predictions between correct and ambiguous classes in a highly similar pattern across diverse attacks. Relying on this latent prior as a bridge between seen and agnostic attacks, we propose a more generalized AT model that resists training-agnostic attacks by mitigating consistent class confusion (M3C). Specifically, we optimize an Adversarial Confusion Loss (ACL), weighted by uncertainty, to distinguish the most confused classes and encourage the AT model to focus on these confused samples. To suppress malignant features that affect correct predictions and produce significant class confusion, we propose a Gradient-Aware Attention (GAA) mechanism that enhances the classification confidence of correct classes and eliminates class confusion. Experiments on multiple benchmarks and network frameworks demonstrate that our M3C model significantly improves the generalization of AT robustness against agnostic attacks. The discovery of the 3C prior reveals the potential for defending against a wide range of attacks and provides a new perspective for overcoming this challenge in the field.
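To make the abstract's core idea concrete, the following is a minimal sketch of an uncertainty-weighted confusion loss in the spirit of the described ACL. The exact form of the paper's loss is not given in this record, so the function name, the entropy-based uncertainty weight, and the margin penalty below are all illustrative assumptions, not the authors' formulation.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def confusion_weighted_loss(logits, labels):
    """Toy uncertainty-weighted confusion loss (hypothetical form).

    For each sample, the most confused class is the highest-probability
    class other than the true one. Its margin over the true class is
    penalized, weighted by the prediction entropy so that more uncertain
    (more confused) samples receive more attention during training.
    """
    p = softmax(logits)                        # (N, C) class probabilities
    n = np.arange(len(labels))
    p_true = p[n, labels]                      # probability of the correct class
    masked = p.copy()
    masked[n, labels] = -np.inf                # exclude the true class
    confused = masked.argmax(axis=1)           # most confused class per sample
    p_conf = p[n, confused]
    entropy = -(p * np.log(p + 1e-12)).sum(axis=1)   # uncertainty weight
    # Cross-entropy on the true class plus an entropy-weighted margin penalty
    per_sample = -np.log(p_true + 1e-12) \
        + entropy * np.maximum(0.0, p_conf - p_true + 0.1)
    return per_sample.mean(), confused

# Illustrative use: sample 0 is confidently correct, sample 1 is confused
logits = np.array([[5.0, 0.0, 0.0],
                   [1.0, 0.9, 0.0]])
labels = np.array([0, 0])
loss, confused = confusion_weighted_loss(logits, labels)
```

In this sketch, a confident sample contributes almost nothing beyond its cross-entropy term, while a confused sample (high entropy, small true-vs-confused margin) is penalized more heavily, mirroring the abstract's goal of focusing training on the most confused samples.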
Original language: English
Number of pages: 18
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Early online date: 25 Sept 2025
DOIs
Publication status: E-pub ahead of print - 25 Sept 2025
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 1979-2012 IEEE.

Funding

This work was partially supported by National Natural Science Fund of China (62271090, 62221005), National Key R&D Program of China (2021YFB3100800), Chongqing Natural Science Fund (cstc2021jcyjjqX0023), and National Youth Talent Project.

Keywords

  • Adversarial defense
  • Adversarial training
  • Universal robustness
  • Generalizable prior
  • Attack consistency
