Zhang, Ji (2005) Detecting outlying subspaces for high-dimensional data: a heuristic search approach. In: 2005 SIAM International Workshop on Feature Selection for Data Mining: Interfacing Machine Learning and Statistics, 23 April 2005, Newport Beach, California, United States.
[Abstract]: In this paper, we identify a new task for studying the outlying degree of high-dimensional data, i.e. finding the subspaces (subsets of features) in which given points are outliers, and propose a novel detection algorithm, called High-D Outlying subspace Detection (HighDOD). We measure the outlying degree of a point using the sum of distances between this point and its k nearest neighbors. Heuristic pruning strategies are proposed to realize fast pruning in the subspace search, and an efficient dynamic subspace search method with a sample-based learning process has been implemented. Experimental results show that HighDOD is efficient and outperforms other searching alternatives such as the naive top-down, bottom-up and random search methods. Points in these sparse subspaces are assumed to be the outliers. While knowing which data points are the outliers can be useful, in many applications, it is more important to identify the subspaces in which a given point is an outlier, which motivates the proposal of a new technique in this paper to handle this new task.
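The outlying-degree measure described in the abstract (the sum of distances from a point to its k nearest neighbors, evaluated in a chosen subspace) can be illustrated with a minimal Python sketch. This is not the HighDOD algorithm itself: the search below is a naive exhaustive enumeration with no pruning or sample-based learning. The function names, the Euclidean distance and the `threshold` parameter are assumptions made for illustration only.

```python
import math
from itertools import combinations


def outlying_degree(point, data, subspace, k):
    """Sum of distances from `point` to its k nearest neighbours,
    measured only on the feature indices listed in `subspace`.
    Euclidean distance is an assumption of this sketch."""
    dists = sorted(
        math.sqrt(sum((point[d] - other[d]) ** 2 for d in subspace))
        for other in data
        if other is not point  # skip the query point itself (identity check)
    )
    return sum(dists[:k])


def outlying_subspaces(point, data, k, threshold):
    """Naive exhaustive baseline: return every non-empty subspace in which
    the outlying degree of `point` is at least `threshold` (a hypothetical
    cut-off). Cost grows as 2^d, which is what a heuristic search avoids."""
    n_dims = len(point)
    found = []
    for size in range(1, n_dims + 1):
        for subspace in combinations(range(n_dims), size):
            if outlying_degree(point, data, subspace, k) >= threshold:
                found.append(subspace)
    return found


# Toy data: the last point is ordinary on feature 0 but far away on feature 1.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 5.0)]
point = data[4]
found = outlying_subspaces(point, data, k=2, threshold=5.0)
print(found)
```

In this toy data set the point is unremarkable on feature 0 alone but stands out in any subspace that includes feature 1, which is the kind of per-point answer the task asks for.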
|Item Type:||Conference or Workshop Item (Commonwealth Reporting Category E) (Paper)|
|Additional Information:||No evidence of copyright restrictions.|
|Uncontrolled Keywords:||outlying subspaces, high-dimensional data, Heuristic search, sample-based learning|
|Subjects:||280000 Information, Computing and Communication Sciences|
|Depositing User:||Dr Ji Zhang|
|Date Deposited:||08 Sep 2009 23:49|
|Last Modified:||02 Jul 2013 23:23|