Detecting outlying subspaces for high-dimensional data: a heuristic search approach

Zhang, Ji (2005) Detecting outlying subspaces for high-dimensional data: a heuristic search approach. In: 2005 SIAM International Workshop on Feature Selection for Data Mining: Interfacing Machine Learning and Statistics, 23 April 2005, Newport Beach, California, United States.


[Abstract]: In this paper, we identify a new task for studying the outlying degree of high-dimensional data, i.e. finding the subspaces (subsets of features) in which given points are outliers, and propose a novel detection algorithm, called High-D Outlying subspace Detection (HighDOD). We measure the outlying degree of a point using the sum of distances between this point and its k nearest neighbors. Heuristic pruning strategies are proposed to realize fast pruning in the subspace search, and an efficient dynamic subspace search method with a sample-based learning process has been implemented. Experimental results show that HighDOD is efficient and outperforms other searching alternatives such as the naive top-down, bottom-up and random search methods.

Points in these sparse subspaces are assumed to be the outliers. While knowing which data points are the outliers can be useful, in many applications it is more important to identify the subspaces in which a given point is an outlier, which motivates the proposal of a new technique in this paper to handle this new task.
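
The outlying degree described in the abstract (the sum of distances from a point to its k nearest neighbors, measured within a chosen subset of features) can be illustrated with a minimal Python sketch. This is not the HighDOD algorithm: it enumerates every subspace exhaustively, with none of the paper's pruning or sample-based learning. The Euclidean metric, the `threshold` parameter, and the function names `outlying_degree` and `outlying_subspaces` are assumptions made for illustration only.

```python
import math
from itertools import combinations


def outlying_degree(data, idx, subspace, k):
    """Sum of distances from data[idx] to its k nearest neighbors,
    using only the feature indices in `subspace`.

    Assumption: Euclidean distance; the abstract does not fix the metric.
    """
    p = data[idx]
    dists = sorted(
        math.sqrt(sum((p[d] - q[d]) ** 2 for d in subspace))
        for j, q in enumerate(data)
        if j != idx
    )
    return sum(dists[:k])


def outlying_subspaces(data, idx, k, threshold):
    """Exhaustively list every non-empty subspace in which data[idx]
    has an outlying degree of at least `threshold`.

    Assumption: `threshold` is a hypothetical user-chosen cutoff.
    The cost is exponential in the number of features, which is the
    reason a heuristic search is attractive for high-dimensional data.
    """
    n_dims = len(data[0])
    found = []
    for size in range(1, n_dims + 1):
        for subspace in combinations(range(n_dims), size):
            if outlying_degree(data, idx, subspace, k) >= threshold:
                found.append(subspace)
    return found
```

For example, with the points `(0, 0)`, `(1, 0)`, `(0, 1)`, `(1, 1)` and `(0.5, 10)`, the last point is unremarkable in feature 0 alone but far from its neighbors in any subspace that includes feature 1, so only those subspaces are reported for a suitable threshold.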

Item Type: Conference or Workshop Item (Commonwealth Reporting Category E) (Paper)
Refereed: Yes
Item Status: Live Archive
Additional Information: No evidence of copyright restrictions.
Faculty / Department / School: Historic - Faculty of Sciences - Department of Maths and Computing
Date Deposited: 08 Sep 2009 23:49
Last Modified: 02 Jul 2013 23:23
Uncontrolled Keywords: outlying subspaces, high-dimensional data, Heuristic search, sample-based learning
Fields of Research: 08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080109 Pattern Recognition and Data Mining
