Concept drift detection in histogram-based straightforward data stream prediction

Jorge CASILLAS, Shuo WANG, Xin YAO

Research output: Book Chapters | Papers in Conference Proceedings › Conference paper (refereed) › Research › peer-review

2 Citations (Scopus)

Abstract

Mainstream research in concept drift detection for on-line classification focuses on monitoring a measure of the learner's performance, thus assuming that a reduction in its predictive ability implies a change in the relation between the input variables and the class labels. This approach makes the detector highly dependent on the learner, which can be problematic in some situations, for example, when the learner includes adaptability mechanisms, when it is unable to converge, or when it overfits heavily. Ultimately, concept drift is something that happens in the data, not in the learner, so detecting drifts indirectly through a learner's performance adds bias to the result. Besides, it makes the process highly inefficient. This paper proposes a new mechanism to detect concept drifts without supervising the learning process. The data distribution is summarized in univariate histograms, based on which an on-line straightforward prediction is made. This provides a measure that is monitored over time to detect changes in the data. The method is extremely efficient, as the time and space complexity to process each sample is linear in the number of input variables multiplied by the number of classes. A thorough analysis on synthetic and real-world data streams shows that the proposed method very efficiently performs reliable (low false alarm ratio) and effective (high true detection ratio) drift detection that outperforms other well-known methods. © 2018 IEEE.
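
The idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm. It assumes equal-width bins over a known value range, a naive-Bayes-style combination of the univariate histograms, test-then-train accuracy as the monitored measure, and a plain windowed accuracy-drop rule as the detector; the names HistogramPredictor and detect_drifts and all parameter values are hypothetical.

```python
import math
from collections import deque


class HistogramPredictor:
    """One histogram per (class, feature) pair, combined naive-Bayes style.

    Assumed for illustration only: equal-width bins over [lo, hi] and
    Laplace smoothing. None of these details come from the paper.
    """

    def __init__(self, n_features, n_classes, n_bins=10, lo=0.0, hi=1.0):
        self.n_classes = n_classes
        self.n_bins = n_bins
        self.lo, self.hi = lo, hi
        # counts[c][j][b]: samples of class c whose feature j fell in bin b
        self.counts = [[[0] * n_bins for _ in range(n_features)]
                       for _ in range(n_classes)]
        self.class_counts = [0] * n_classes

    def _bin(self, value):
        ratio = (value - self.lo) / (self.hi - self.lo)
        return min(self.n_bins - 1, max(0, int(ratio * self.n_bins)))

    def predict(self, x):
        # One histogram lookup per (class, feature) pair.
        total = sum(self.class_counts)
        best_class, best_score = 0, -math.inf
        for c in range(self.n_classes):
            n_c = self.class_counts[c]
            score = math.log((n_c + 1) / (total + self.n_classes))
            for j, value in enumerate(x):
                hits = self.counts[c][j][self._bin(value)]
                score += math.log((hits + 1) / (n_c + self.n_bins))
            if score > best_score:
                best_class, best_score = c, score
        return best_class

    def update(self, x, y):
        self.class_counts[y] += 1
        for j, value in enumerate(x):
            self.counts[y][j][self._bin(value)] += 1


def detect_drifts(stream, n_features, n_classes, window=200, drop=0.2):
    """Return the stream positions where a drift is flagged.

    Assumed rule (not the paper's test): flag a drift when the accuracy
    over the last `window` samples falls more than `drop` below the best
    windowed accuracy seen since the last reset.
    """
    model = HistogramPredictor(n_features, n_classes)
    recent = deque(maxlen=window)  # 1 = correct prediction, 0 = error
    best = 0.0
    drifts = []
    for t, (x, y) in enumerate(stream):
        recent.append(1 if model.predict(x) == y else 0)  # test ...
        model.update(x, y)                                # ... then train
        if len(recent) == window:
            acc = sum(recent) / window
            best = max(best, acc)
            if best - acc > drop:
                drifts.append(t)
                # Start over with empty histograms after a detection.
                model = HistogramPredictor(n_features, n_classes)
                recent.clear()
                best = 0.0
    return drifts
```

Each predict call touches one bin per class and feature, and the count table holds one histogram per class and feature with a fixed number of bins, which matches the per-sample cost the abstract states. The stream argument is any iterable of (feature list, class index) pairs with features scaled to the assumed range.
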
Original language: English
Title of host publication: IEEE International Conference on Data Mining Workshops, ICDMW
Publisher: IEEE Computer Society
Pages: 878-885
Number of pages: 8
Volume: 2018-November
ISBN (Print): 9781538692882
Publication status: Published - Nov 2018
Externally published: Yes
