A novel method for detecting outlying subspaces in high-dimensional databases using genetic algorithm

Zhang, Ji and Gao, Qigang and Wang, Hai (2006) A novel method for detecting outlying subspaces in high-dimensional databases using genetic algorithm. In: 6th IEEE International Conference on Data Mining (ICDM 2006), 18-20 Dec 2006, Hong Kong.

Abstract

Detecting outlying subspaces is a relatively new research problem in outlier-ness analysis for high-dimensional data. An outlying subspace for a given data point p is a subspace in which p is an outlier. Outlying subspace detection can facilitate a better characterization of the detected outliers. It can also enable outlier mining for high-dimensional data to be performed more accurately and efficiently. In this paper, we propose a new method based on the genetic algorithm paradigm for searching outlying subspaces efficiently. We develop a technique for efficiently computing the lower and upper bounds of the distance between a given point and its kth nearest neighbor in each possible subspace. These bounds are used to speed up the fitness evaluation of the designed genetic algorithm for outlying subspace detection. We also propose a random sampling technique to further reduce the computation of the genetic algorithm. The optimal number of sampled data points is specified to ensure the accuracy of the result. A set of experiments conducted on both synthetic and real-life datasets shows that the proposed method is efficient and effective in handling the outlying subspace detection problem.
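
The search described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the function names, the bit-mask encoding, the truncation selection, the seeding of the population with single-dimension masks, and the normalisation of the fitness by the square root of the subspace size are all assumptions made for illustration. The bounding and sampling techniques from the paper are not reproduced; the fitness here computes the exact distance to the kth nearest neighbor.

```python
import math
import random


def kth_nn_distance(data, p_idx, subspace, k):
    """Euclidean distance from data[p_idx] to its k-th nearest neighbor,
    measured only over the dimensions listed in `subspace`."""
    p = data[p_idx]
    dists = sorted(
        math.sqrt(sum((p[d] - q[d]) ** 2 for d in subspace))
        for i, q in enumerate(data)
        if i != p_idx
    )
    return dists[k - 1]


def search_outlying_subspace(data, p_idx, k=1, pop_size=20, generations=30, seed=0):
    """Toy genetic search over bit masks (1 = dimension included).
    Hypothetical helper; requires at least two dimensions."""
    rng = random.Random(seed)
    n_dims = len(data[0])

    def fitness(mask):
        dims = [d for d, bit in enumerate(mask) if bit]
        if not dims:
            return 0.0
        # Assumption: divide by sqrt(|dims|) so that larger subspaces,
        # whose distances can only grow, are not trivially favoured.
        return kth_nn_distance(data, p_idx, dims, k) / math.sqrt(len(dims))

    # Assumption: start from every single-dimension mask, then fill randomly.
    population = [[int(d == i) for d in range(n_dims)] for i in range(n_dims)]
    while len(population) < pop_size:
        population.append([rng.randint(0, 1) for _ in range(n_dims)])

    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection, keeps the best
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n_dims - 1)  # one-point crossover
            child = a[:cut] + b[cut:]
            flip = rng.randrange(n_dims)  # single-bit mutation
            child[flip] = 1 - child[flip]
            children.append(child)
        population = parents + children

    best = max(population, key=fitness)
    return [d for d, bit in enumerate(best) if bit]
```

For a small dataset in which point 0 is far from all other points only along dimension 2, calling `search_outlying_subspace(data, 0)` returns `[2]`. A practical implementation would replace the exact `kth_nn_distance` call with cheaper bound checks and sampling, as the abstract describes.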


Item Type: Conference or Workshop Item (Commonwealth Reporting Category E) (Paper)
Refereed: Yes
Item Status: Live Archive
Additional Information: Published version deposited in accordance with the copyright policy of the publisher. © 2006 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purpose or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Depositing User: Dr Ji Zhang
Faculty / Department / School: Historic - Faculty of Sciences - Department of Maths and Computing
Date Deposited: 01 Sep 2009 23:32
Last Modified: 02 Jul 2013 23:23
Uncontrolled Keywords: outlying subspaces
Fields of Research (FOR2008): 08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080109 Pattern Recognition and Data Mining
URI: http://eprints.usq.edu.au/id/eprint/5626
