TY - GEN
T1 - Robust Deep Learning Models against Semantic-Preserving Adversarial Attack
AU - ZHAO, Yunce
AU - GAO, Dashan
AU - YAO, Yinghua
AU - ZHANG, Zeqi
AU - MAO, Bifei
AU - YAO, Xin
N1 - This research was supported by Huawei Technologies Co., Ltd., Research Institute of Trustworthy Autonomous Systems (RITAS), the Guangdong Provincial Key Laboratory (Grant No. 2020B121201001), and National Natural Science Foundation of China (Grant No. 62250710682).
PY - 2023/6/18
Y1 - 2023/6/18
N2 - Deep learning models can be fooled by small ℓp-norm adversarial perturbations and natural perturbations in terms of attributes. Although the robustness against each perturbation has been explored, it remains a challenge to address the robustness against joint perturbations effectively. In this paper, we study the robustness of deep learning models against joint perturbations by proposing a novel attack mechanism named Semantic-Preserving Adversarial (SPA) attack, which can then be used to enhance adversarial training. Specifically, we introduce an attribute manipulator to generate natural and human-comprehensible perturbations and a noise generator to generate diverse adversarial noises. Based on such combined noises, we optimize both the attribute value and the diversity variable to generate jointly-perturbed samples. For robust training, we adversarially train the deep learning model against the generated joint perturbations. Empirical results on four benchmarks show that the SPA attack causes a larger performance decline with small ℓ∞ norm-ball constraints compared to existing approaches. Furthermore, our SPA-enhanced training outperforms existing defense methods against such joint perturbations. © 2023 IEEE.
KW - Adversarial Examples
KW - Adversarial Perturbation
KW - Natural Perturbation
KW - Robustness
UR - http://www.scopus.com/inward/record.url?scp=85169592337&partnerID=8YFLogxK
U2 - 10.1109/IJCNN54540.2023.10191198
DO - 10.1109/IJCNN54540.2023.10191198
M3 - Conference paper (refereed)
SN - 9781665488686
T3 - Proceedings of ... International Joint Conference on Neural Networks
BT - 2023 International Joint Conference on Neural Networks (IJCNN) Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - International Joint Conference on Neural Networks 2023
Y2 - 18 June 2023 through 23 June 2023
ER -