A framework for efficient association rule mining in XML data

Zhang, Ji and Liu, Han and Ling, Tok Wang and Bruckner, Robert and Tjoa, A. Min (2006) A framework for efficient association rule mining in XML data. Journal of Database Management (JDM), 17 (3). pp. 19-40. ISSN 1063-8016

Text (Accepted Version)
Zhang_Liu_Ling_Bruckner_Tjoa_JDM_v17n3_AV.pdf


Abstract

In this paper, we propose a framework, called XAR-Miner, for efficiently mining association rules (ARs) from XML documents. In XAR-Miner, raw data in the XML document are first preprocessed and transformed into either an Indexed XML Tree (IX-tree) or Multi-relational Databases (Multi-DB), depending on the size of the XML document and the memory constraints of the system, for efficient data selection and AR mining. Concepts that are relevant to the AR mining task are generalized to produce generalized meta-patterns. A suitable metric is devised for measuring the degree of concept generalization in order to prevent under-generalization or over-generalization. The resulting generalized meta-patterns are used to generate large ARs that meet the support and confidence levels. A greedy algorithm is also presented to integrate data selection and large itemset generation to enhance the efficiency of the AR mining process. The experiments show that XAR-Miner performs a large number of AR mining tasks on XML documents more efficiently than the state-of-the-art method, which repeatedly scans the XML documents to perform each mining task.
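
To make the support and confidence thresholds mentioned in the abstract concrete, the following is a minimal sketch in Python. It is not XAR-Miner: the IX-tree, Multi-DB, concept generalization, and the greedy algorithm are not reproduced here. The XML layout (`<order>` and `<item>` elements), the sample data, and the function names `extract_transactions` and `mine_pair_rules` are all assumptions made for illustration. The sketch only shows the general idea of parsing the XML once, keeping the extracted transactions in memory, and then deriving single-item rules that meet both thresholds.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from itertools import combinations

# Hypothetical input: each <order> is one transaction, each <item> one item.
XML_DOC = """
<orders>
  <order><item>bread</item><item>milk</item></order>
  <order><item>bread</item><item>butter</item><item>milk</item></order>
  <order><item>bread</item><item>butter</item></order>
  <order><item>milk</item></order>
</orders>
"""


def extract_transactions(xml_text):
    """Parse the XML once and return one frozenset of items per <order>."""
    root = ET.fromstring(xml_text)
    return [
        frozenset(item.text for item in order.findall("item"))
        for order in root.findall("order")
    ]


def mine_pair_rules(transactions, min_support, min_confidence):
    """Return rules (lhs, rhs, support, confidence) between single items.

    support(lhs -> rhs)    = count(lhs and rhs) / number of transactions
    confidence(lhs -> rhs) = count(lhs and rhs) / count(lhs)
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for t in transactions:
        item_counts.update(t)
        # Sorting gives each unordered pair a single canonical key.
        pair_counts.update(combinations(sorted(t), 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # the pair is not a large (frequent) itemset
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


transactions = extract_transactions(XML_DOC)
rules = mine_pair_rules(transactions, min_support=0.5, min_confidence=0.6)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}  support={support:.2f}  confidence={confidence:.2f}")
```

Because `transactions` is built once, further mining tasks with different thresholds can reuse it instead of re-reading the XML, which loosely mirrors the motivation stated in the abstract for preprocessing the document before mining.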


Item Type: Article (Commonwealth Reporting Category C)
Refereed: Yes
Item Status: Live Archive
Additional Information: Deposited with blanket permission of publisher.
Depositing User: Dr Ji Zhang
Faculty / Department / School: Historic - Faculty of Sciences - Department of Maths and Computing
Date Deposited: 24 Sep 2009 05:26
Last Modified: 27 Sep 2013 03:50
Uncontrolled Keywords: association rule mining, XML data, data transformation and indexing, concept generalization, meta patterns
Fields of Research (FOR2008): 08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080109 Pattern Recognition and Data Mining
URI: http://eprints.usq.edu.au/id/eprint/5629
