Hierarchical Reduced-Space Drift Detection Framework for Multivariate Supervised Data Streams

Shuyi ZHANG, Peter TINO, Xin YAO

Research output: Journal Publications › Journal Article (refereed) › peer-review

1 Citation (Scopus)

Abstract

In a streaming environment, the characteristics of the data themselves and their relationship with the labels may change over time. Most drift detection methods for supervised data streams are performance-based, that is, they detect changes only after the classification accuracy deteriorates. This may not be sufficient in many application areas where the reason behind a drift is also important. Another category of drift detectors is data distribution-based. Although such detectors can identify some drifts within the input space, they cannot identify changes that affect only the labelling mechanism. Furthermore, little work is available on drift detection for high-dimensional data streams. In this paper, we propose an advanced Hierarchical Reduced-space Drift Detection (HRDD) framework for supervised data streams which captures drifts regardless of their effects on classification performance. This framework suggests monitoring both marginal and class-conditional distributions within a lower-dimensional space specifically relevant to the assigned classification task. Experimental comparisons have demonstrated that HRDD not only achieves high-quality performance on high-dimensional data streams, but also outperforms its competitors in terms of detection recall, precision and F-measure across a wide range of concept drift types, including subtle drifts. © 1989-2012 IEEE.
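The idea of monitoring both marginal and class-conditional distributions in a task-relevant reduced space can be illustrated with a small sketch. This is not the HRDD algorithm itself, whose details are not given in the abstract. Everything below is an assumption made for illustration: the reduced space is a single direction between the two class means, the comparison between a reference window and a current window uses a two-sample Kolmogorov-Smirnov statistic, and the names `class_mean_direction`, `detect`, `make_window` and the threshold of 0.3 are hypothetical.

```python
import random


def mean_vector(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def class_mean_direction(X, y):
    """Assumed reduced space: unit vector from the class-0 mean to the class-1 mean.

    Requires both classes to be present and their means to differ.
    """
    m0 = mean_vector([x for x, c in zip(X, y) if c == 0])
    m1 = mean_vector([x for x, c in zip(X, y) if c == 1])
    d = [b - a for a, b in zip(m0, m1)]
    norm = sum(v * v for v in d) ** 0.5
    return [v / norm for v in d]


def project(X, w):
    """Project every sample onto the direction w."""
    return [sum(a * b for a, b in zip(x, w)) for x in X]


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    gap = 0.0
    while i < len(a) and j < len(b):
        v = min(a[i], b[j])
        while i < len(a) and a[i] <= v:
            i += 1
        while j < len(b) and b[j] <= v:
            j += 1
        gap = max(gap, abs(i / len(a) - j / len(b)))
    return gap


def detect(ref_X, ref_y, cur_X, cur_y, threshold=0.3):
    """Compare reference and current windows in the reduced space.

    Returns {name: (statistic, flagged)} for the marginal distribution and
    for each class-conditional distribution. The threshold is an arbitrary
    illustrative value, not a calibrated significance level.
    """
    w = class_mean_direction(ref_X, ref_y)
    ref_z, cur_z = project(ref_X, w), project(cur_X, w)
    stats = {"marginal": ks_statistic(ref_z, cur_z)}
    for c in (0, 1):
        ref_c = [z for z, lab in zip(ref_z, ref_y) if lab == c]
        cur_c = [z for z, lab in zip(cur_z, cur_y) if lab == c]
        stats[f"class_{c}"] = ks_statistic(ref_c, cur_c)
    return {name: (s, s > threshold) for name, s in stats.items()}


def make_window(rng, n, flip=False):
    """Synthetic 5-dimensional window; flip=True swaps the labels only."""
    X, y = [], []
    for _ in range(n):
        c = rng.randrange(2)
        # Only the first feature carries class information.
        X.append([rng.gauss(3.0 * c, 1.0)] + [rng.gauss(0.0, 1.0) for _ in range(4)])
        y.append(1 - c if flip else c)
    return X, y


rng = random.Random(0)
ref_X, ref_y = make_window(rng, 500)
cur_X, cur_y = make_window(rng, 500, flip=True)
for name, (stat, flagged) in detect(ref_X, ref_y, cur_X, cur_y).items():
    print(name, round(stat, 3), flagged)
```

In this synthetic case the inputs follow the same distribution in both windows and only the labelling is swapped, so a check on the marginal distribution alone would be expected to stay quiet while the class-conditional checks react. This is the kind of labelling-only drift that the abstract says purely distribution-based detectors miss.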
Original language: English
Pages (from-to): 2628-2640
Number of pages: 13
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 35
Issue number: 3
DOIs
Publication status: Published - 2021
Externally published: Yes

Keywords

  • Concept drift
  • data stream mining
  • drift detection
  • online learning
