On efficient and effective association rule mining from XML data

Zhang, Ji and Ling, Tok Wang and Bruckner, Robert and Tjoa, A. Min and Liu, Han (2004) On efficient and effective association rule mining from XML data. In: 15th International Conference on Database and Expert Systems Applications (DEXA'04), 30 August - 3 Sept 2004, Zaragoza, Spain.

Abstract

In this paper, we propose a framework, called XAR-Miner, for mining association rules (ARs) from XML documents efficiently and effectively. In XAR-Miner, raw XML data are first transformed into either an Indexed Content Tree (IX-tree) or multi-relational databases (Multi-DB), depending on the size of the XML document and the memory constraints of the system, to support efficient data selection during AR mining. Concepts that are relevant to the AR mining task are generalized to produce generalized meta-patterns. A suitable metric is devised for measuring the degree of concept generalization in order to prevent under-generalization or over-generalization. The resulting generalized meta-patterns are used to generate large ARs that meet the support and confidence levels. An efficient AR mining algorithm is also presented, based on candidate AR generation in the hierarchy of generalized meta-patterns. The experiments show that, when performing a large number of AR mining tasks on XML documents, XAR-Miner is more efficient than the state-of-the-art method of repeatedly scanning the XML documents for each mining task.
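
For readers unfamiliar with the support and confidence levels mentioned in the abstract, the following is a minimal sketch of how these two measures are conventionally computed for a rule A -> B. It is not the XAR-Miner algorithm: it does not use the IX-tree, Multi-DB, or generalized meta-patterns. The toy transactions and the function names `support`, `confidence`, and `mine_rules` are assumptions made for illustration only.

```python
from itertools import combinations

# Hypothetical toy transactions; not data from the paper.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    return support(set(antecedent) | set(consequent), transactions) / base

def mine_rules(transactions, min_support, min_confidence):
    """Brute-force search over single-item rules A -> B meeting both thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        # Try both directions of each item pair.
        for lhs, rhs in ((a, b), (b, a)):
            s = support({lhs, rhs}, transactions)
            c = confidence({lhs}, {rhs}, transactions)
            if s >= min_support and c >= min_confidence:
                rules.append((lhs, rhs, s, c))
    return rules

rules = mine_rules(transactions, min_support=0.5, min_confidence=0.8)
print(rules)
```

With these thresholds only the rule butter -> bread survives: both items occur together in half of the transactions, and every transaction containing butter also contains bread. A brute-force scan like this re-reads all transactions for every candidate rule, which is the kind of repeated scanning the abstract says XAR-Miner is designed to avoid.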


Item Type: Conference or Workshop Item (Commonwealth Reporting Category E) (Paper)
Refereed: Yes
Item Status: Live Archive
Additional Information: Author's version deposited in accordance with the copyright policy of the publisher. The original publication is available at www.springerlink.com.
Depositing User: Dr Ji Zhang
Faculty / Department / School: Historic - Faculty of Sciences - Department of Maths and Computing
Date Deposited: 08 Sep 2009 05:31
Last Modified: 02 Jul 2013 23:23
Uncontrolled Keywords: association rule mining, XML data, meta-patterns
Fields of Research (FOR2008): 08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080109 Pattern Recognition and Data Mining
URI: http://eprints.usq.edu.au/id/eprint/5656