Feature Attribution Explanation to Detect Harmful Dataset Shift

Ziming WANG, Changwu HUANG, Xin YAO

Research output: Book Chapters | Papers in Conference Proceedings, Conference paper (refereed), peer-reviewed

1 Citation (Scopus)

Abstract

Detecting whether a distribution shift has occurred in a dataset is critical when deploying machine learning models, as even a small shift in the data distribution may substantially degrade a model's performance and thus cause the deployed model to fail. In this work, we focus on detecting harmful dataset shifts, i.e., shifts that are detrimental to the performance of the machine learning model. Existing methods usually detect whether there is a shift between two datasets using the following framework: first, dimensionality reduction is carried out on the datasets; then, two-sample statistical test(s) on the reduced datasets determine whether a dataset shift exists. The knowledge contained in the model trained on the dataset is not utilized in this framework. To address this, this paper proposes to take advantage of explainable artificial intelligence (XAI) techniques to exploit the knowledge in trained models when detecting harmful dataset shifts. Specifically, we employ a feature attribution explanation (FAE) method to capture the knowledge in the model and combine it with a widely used two-sample test, maximum mean discrepancy (MMD), to detect harmful dataset shifts. Experimental results on more than twenty different shifts across three widely used image datasets demonstrate that the proposed method identifies harmful dataset shifts more effectively than existing methods. Moreover, experiments with several different models show that the method is robust and effective across models, i.e., its detection performance is not sensitive to the model used. © 2023 IEEE.
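The abstract describes combining feature attributions with an MMD two-sample test. The sketch below illustrates only the generic MMD permutation-test step, applied to two matrices of per-sample feature vectors; it is not the authors' implementation. The function names (`mmd_shift_test`, `mmd2_biased`) and the stand-in Gaussian data are assumptions for illustration — in the paper, the rows would be feature attributions produced by an FAE method for a trained model, and the RBF bandwidth here uses the common median heuristic.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    """RBF (Gaussian) kernel matrix between the rows of X and Y."""
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def mmd2_biased(X, Y, gamma):
    """Biased estimate of the squared MMD between samples X and Y."""
    n, m = len(X), len(Y)
    return (rbf_kernel(X, X, gamma).sum() / n**2
            + rbf_kernel(Y, Y, gamma).sum() / m**2
            - 2.0 * rbf_kernel(X, Y, gamma).sum() / (n * m))

def mmd_shift_test(feats_src, feats_tgt, n_perm=200, seed=0):
    """Permutation two-sample test on per-sample feature (attribution) matrices.

    Returns (MMD^2 statistic, permutation p-value); a small p-value
    indicates that the two samples come from different distributions.
    """
    rng = np.random.default_rng(seed)
    pooled = np.vstack([feats_src, feats_tgt])
    # Median heuristic for the kernel bandwidth gamma.
    sq = (np.sum(pooled**2, axis=1)[:, None]
          + np.sum(pooled**2, axis=1)[None, :]
          - 2.0 * pooled @ pooled.T)
    gamma = 1.0 / np.median(sq[sq > 0])
    stat = mmd2_biased(feats_src, feats_tgt, gamma)
    n = len(feats_src)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(pooled))
        if mmd2_biased(pooled[perm[:n]], pooled[perm[n:]], gamma) >= stat:
            count += 1
    p_value = (count + 1) / (n_perm + 1)  # add-one correction
    return stat, p_value

# Stand-in data: same distribution vs. a mean-shifted distribution.
rng = np.random.default_rng(1)
a = rng.normal(0.0, 1.0, (80, 10))
b = rng.normal(0.0, 1.0, (80, 10))   # no shift
c = rng.normal(1.0, 1.0, (80, 10))   # clear mean shift
_, p_same = mmd_shift_test(a, b)
_, p_shift = mmd_shift_test(a, c)
```

A shifted pair should yield a much smaller p-value than an unshifted pair; thresholding the p-value (e.g., at 0.05) then flags the shift.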
Original language: English
Title of host publication: 2023 International Joint Conference on Neural Networks (IJCNN)
Publisher: Institute of Electrical and Electronics Engineers Inc.
Volume: 2023-June
ISBN (Electronic): 9781665488679
ISBN (Print): 9781665488686
DOIs
Publication status: Published - 18 Jun 2023
Externally published: Yes

Bibliographical note

This work was supported by the National Natural Science Foundation of China (Grant No. 62250710682), the Guangdong Provincial Key Laboratory (Grant No. 2020B121201001), the Program for Guangdong Introducing Innovative and Entrepreneurial Teams (Grant No. 2017ZT07X386), the Shenzhen Science and Technology Program (Grant No. KQTD2016112514355531), and the Research Institute of Trustworthy Autonomous Systems. Corresponding author: Xin Yao ([email protected])

Keywords

  • Dataset Shift
  • Explainable Artificial Intelligence
  • Feature Attribution Explanation
  • Model Robustness
