Expert Systems with Applications, 60, pp. 39–50 (2016). DOI: 10.1016/j.eswa.2016.04.026

Using dynamical systems tools to detect concept drift in data streams

F. G. Da Costa, R. A. Rios, R. F. De Mello

Real-world data streams may change their behavior over time, a phenomenon referred to as concept drift. By detecting those changes, researchers obtain relevant information about the phenomena that produced such streams (e.g., temperatures in a region, bacteria population, disease occurrence). Many concept drift detection algorithms rely on supervised or semi-supervised approaches, which tend to be infeasible when data is collected at high frequencies, due to the difficulty of labeling. In addition, current studies usually assume data to be statistically independent and identically distributed, disregarding any temporal relationship among observations and, consequently, compromising the quality of data modeling. To tackle both aspects, we employ dynamical system modeling to represent the temporal relationships among data observations and how they change over time, in an attempt to detect concept drift. This approach uses Takens' immersion theorem to unfold consecutive windows of data observations into the phase space in order to represent and compare time dependencies. From this perspective, we propose four new concept drift detection algorithms based on the unsupervised machine learning paradigm. The first algorithm builds dendrograms of consecutive phase spaces (every phase space represents the time relationships for the observations contained in a particular data window) and compares them using the Gromov-Hausdorff distance, providing sufficient guarantees to detect concept drifts. The second algorithm employs the Cross Recurrence Plot and Recurrence Quantification Analysis to detect relevant changes in consecutive phase spaces and warn about relevant data modifications. We also preprocess data windows using the Empirical Mode Decomposition method and Mutual Information in an attempt to take only the deterministic stream behavior into account.
All algorithms were implemented as plugins for the Massive Online Analysis (MOA) software and then compared to well-known algorithms from the literature. Results confirm that the proposed algorithms were capable of detecting most of the behavior changes while raising few false alarms.
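
The core idea of the abstract, unfolding consecutive data windows into phase space and comparing them, can be illustrated with a minimal sketch. It is not the authors' MOA plugins: it skips the Empirical Mode Decomposition and Mutual Information preprocessing, does not compute dendrograms or the Gromov-Hausdorff distance, and uses only the density of a cross recurrence plot (the recurrence rate, one of several Recurrence Quantification Analysis measures). The function names and the parameters `dim`, `delay`, `radius` and `window` are assumptions chosen for illustration.

```python
import math


def delay_embed(series, dim, delay):
    """Unfold a scalar window into phase-space points
    (x[t], x[t + delay], ..., x[t + (dim - 1) * delay])."""
    n_points = len(series) - (dim - 1) * delay
    if n_points <= 0:
        raise ValueError("window too short for this dim/delay")
    return [tuple(series[t + k * delay] for k in range(dim))
            for t in range(n_points)]


def cross_recurrence_rate(points_a, points_b, radius):
    """Fraction of point pairs (a, b) within `radius` of each other,
    i.e. the density of the cross recurrence plot of the two sets."""
    hits = sum(1 for a in points_a for b in points_b
               if math.dist(a, b) <= radius)
    return hits / (len(points_a) * len(points_b))


def drift_scores(stream, window, dim=2, delay=1, radius=0.2):
    """Cross recurrence rate between each pair of consecutive,
    non-overlapping windows. A sharp drop hints at a behavior change."""
    embedded = [delay_embed(stream[i:i + window], dim, delay)
                for i in range(0, len(stream) - window + 1, window)]
    return [cross_recurrence_rate(a, b, radius)
            for a, b in zip(embedded, embedded[1:])]


# Synthetic stream (illustrative data): a sine wave whose level shifts halfway.
stream = ([math.sin(2 * math.pi * t / 20) for t in range(200)]
          + [5 + math.sin(2 * math.pi * t / 20) for t in range(200)])
scores = drift_scores(stream, window=100)
```

In this sketch a drift would be flagged when a score falls below a threshold picked by the user. The embedding dimension and delay are fixed by hand here, whereas in practice they are usually estimated from the data.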
