Zhang, Ji and Gao, Qigang and Wang, Hai (2008) SPOT: a system for detecting projected outliers from high-dimensional data streams. In: 24th IEEE International Conference on Data Engineering (ICDE 2008), 7-12 Apr 2008, Cancun, Mexico.
Official URL: http://www.icde2008.org/
Identification Number or DOI: 10.1109/ICDE.2008.4497638
Abstract
In this paper, we present a new technique, called Stream Projected Outlier deTector (SPOT), to address the problem of outlier detection in high-dimensional data streams. SPOT is unique in a number of aspects. First, SPOT employs a novel window-based time model and decaying cell summaries to capture statistics from the data stream. Second, a Sparse Subspace Template (SST), a set of top sparse subspaces obtained by unsupervised and/or supervised learning processes, is constructed in SPOT to detect projected outliers effectively. A Multi-Objective Genetic Algorithm (MOGA) is employed as an effective search method in unsupervised learning for finding outlying subspaces from training data. Finally, the SST is able to carry out online self-evolution to cope with the dynamics of data streams. This paper provides details on the motivation and technical challenges of detecting outliers from high-dimensional data streams, presents an overview of SPOT, and gives the plans for a system demonstration of SPOT.
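As a rough illustration of the "decaying cell summaries" mentioned in the abstract, the sketch below keeps an exponentially decayed point count for each grid cell of one projected subspace. It is not the SPOT implementation: the abstract does not specify the decay function, the cell layout, or the window model, so the class name `DecayedCellCounts`, the parameters `cell_width` and `half_life`, and the choice of exponential decay are all assumptions made for this example.

```python
import math


class DecayedCellCounts:
    """Exponentially decayed point counts per grid cell of one subspace.

    Hypothetical sketch: the decay form and grid layout are assumptions,
    not details taken from the SPOT paper.
    """

    def __init__(self, dims, cell_width, half_life):
        self.dims = tuple(dims)               # attribute indices defining the subspace
        self.cell_width = cell_width          # side length of each grid cell
        self.decay = math.log(2) / half_life  # decay rate per time unit
        self.cells = {}                       # cell -> (decayed count, last update time)

    def _cell(self, point):
        # Project the point onto the subspace and map it to a grid cell.
        return tuple(math.floor(point[d] / self.cell_width) for d in self.dims)

    def density(self, point, now):
        # Decayed count of the cell containing `point`, evaluated at time `now`.
        entry = self.cells.get(self._cell(point))
        if entry is None:
            return 0.0
        count, last = entry
        return count * math.exp(-self.decay * (now - last))

    def add(self, point, now):
        # Age the existing count to `now`, then add the new arrival.
        self.cells[self._cell(point)] = (self.density(point, now) + 1.0, now)


summary = DecayedCellCounts(dims=[0, 2], cell_width=1.0, half_life=10.0)
summary.add([0.5, 9.0, 0.5], now=0)
summary.add([0.2, -3.0, 0.9], now=10)
print(summary.density([0.1, 0.0, 0.1], now=20))
```

Under these assumptions, a detector could flag an arriving point as a projected outlier when the decayed density of its cell in some sparse subspace falls below a chosen threshold; that threshold rule is also illustrative only.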
| Item Type: | Conference or Workshop Item (Commonwealth Reporting Category E) (Paper) |
|---|---|
| Additional Information: | Accepted version deposited in accordance with the copyright policy of the publisher. © 2008 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purpose or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. |
| Uncontrolled Keywords: | Stream Projected Outlier deTector; SPOT; outlier detection; data engineering; data streaming; high-dimensional data; multi-objective genetic algorithm |
| Fields of Research (FOR2008): | 08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080109 Pattern Recognition and Data Mining; 08 Information and Computing Sciences > 0802 Computation Theory and Mathematics > 080201 Analysis of Algorithms and Complexity; 08 Information and Computing Sciences > 0804 Data Format > 080403 Data Structures |
| Subjects: | 280000 Information, Computing and Communication Sciences |
| Socio-Economic Objective (SEO2008): | E Expanding Knowledge > 97 Expanding Knowledge > 970108 Expanding Knowledge in the Information and Computing Sciences |
| ID Code: | 5624 |
| Deposited By: | |
| Deposited On: | 02 Sep 2009 09:31 |
| Last Modified: | 18 Jun 2012 14:47 |
