Detecting projected outliers in high-dimensional data streams

Zhang, Ji and Gao, Qigang and Wang, Hai and Liu, Qing and Xu, Kai (2009) Detecting projected outliers in high-dimensional data streams. In: DEXA 2009: 20th International Conference on Database and Expert Systems Applications, 31 Aug - 4 Sep 2009, Linz, Austria.

Full text available as: PDF (Accepted Version), 279 KB

Official URL: http://www.informatik.uni-trier.de/~ley/db/conf/dexa/index.html

Identification Number or DOI: 10.1007/978-3-642-03573-9_53

Abstract

In this paper, we study the problem of projected outlier detection in high-dimensional data streams and propose a new technique, called Stream Projected Outlier deTector (SPOT), to identify outliers embedded in subspaces. SPOT constructs a Sparse Subspace Template (SST), a set of subspaces obtained by unsupervised and/or supervised learning processes, to detect projected outliers effectively. A Multi-Objective Genetic Algorithm (MOGA) is employed as an effective search method for finding outlying subspaces in the training data from which the SST is built. The SST is able to carry out online self-evolution in the detection stage to cope with the dynamics of data streams. The experimental results demonstrate the efficiency and effectiveness of SPOT in detecting outliers in high-dimensional data streams.
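
To make the idea of a projected outlier concrete, here is a minimal Python sketch. It is not the SPOT algorithm from the paper: the class name SubspaceCellDetector, the equal-width grid, the parameters bins, min_count and warmup, and the fixed list of subspaces are all assumptions made for illustration. The fixed list stands in for the SST; the MOGA search and the online self-evolution described above are not modeled. A point is flagged when, in at least one chosen subspace, it falls into a grid cell that has seen very few points so far.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Subspace = Tuple[int, ...]  # indices of the dimensions that form a subspace


class SubspaceCellDetector:
    """Illustrative stream detector (hypothetical, not the paper's SPOT).

    Assumes every attribute value is already scaled to the range [0, 1).
    """

    def __init__(self, subspaces: Sequence[Subspace], bins: int = 4,
                 min_count: int = 2, warmup: int = 20) -> None:
        self.subspaces = list(subspaces)   # stand-in for a subspace template
        self.bins = bins                   # grid cells per dimension
        self.min_count = min_count         # cells below this count are sparse
        self.warmup = warmup               # points to observe before flagging
        self.seen = 0
        self.counts: Dict[Subspace, Dict[Tuple[int, ...], int]] = {
            s: defaultdict(int) for s in self.subspaces
        }

    def _cell(self, point: Sequence[float], subspace: Subspace) -> Tuple[int, ...]:
        # Project the point onto the subspace and map it to a grid cell.
        return tuple(min(int(point[d] * self.bins), self.bins - 1)
                     for d in subspace)

    def process(self, point: Sequence[float]) -> List[Subspace]:
        """Update the cell counts and return the subspaces in which
        the point lies in a sparse cell (empty list means not flagged)."""
        outlying = []
        for s in self.subspaces:
            cell = self._cell(point, s)
            self.counts[s][cell] += 1
            if self.seen >= self.warmup and self.counts[s][cell] < self.min_count:
                outlying.append(s)
        self.seen += 1
        return outlying


# Usage: dimensions 0 and 1 are tightly clustered, dimension 2 is spread out.
detector = SubspaceCellDetector([(0, 1), (2,)])
for i in range(30):
    detector.process((0.1, 0.1, (i % 10) / 10))

# This point looks normal in dimension 2 but is isolated in subspace (0, 1).
print(detector.process((0.9, 0.9, 0.5)))  # prints [(0, 1)]
```

The return value names the subspaces in which the point is outlying, which mirrors the notion that a projected outlier is defined relative to a subspace rather than the full space. A real stream detector would also need to decay old counts over time; that is omitted here.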

Item Type: Conference or Workshop Item (Commonwealth Reporting Category E) (Paper)
Additional Information: Author's version deposited in accordance with the copyright policy of the publisher. The original publication is available at www.springerlink.com
Uncontrolled Keywords: stream projected outlier deTector; SPOT; outlier detection; atmospheric temperature; clustering algorithms; data communication systems; database systems; detectors
Fields of Research (FOR2008): 08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080109 Pattern Recognition and Data Mining
08 Information and Computing Sciences > 0806 Information Systems > 080604 Database Management
08 Information and Computing Sciences > 0802 Computation Theory and Mathematics > 080201 Analysis of Algorithms and Complexity
Subjects: 280000 Information, Computing and Communication Sciences
Socio-Economic Objective (SEO2008): E Expanding Knowledge > 97 Expanding Knowledge > 970108 Expanding Knowledge in the Information and Computing Sciences
ID Code: 5620
Deposited On: 10 Sep 2009 09:53
Last Modified: 22 Feb 2012 13:07