Building XML data warehouse based on frequent patterns in user queries

Zhang, Ji and Ling, Tok Wang and Bruckner, Robert and Tjoa, A. Min (2003) Building XML data warehouse based on frequent patterns in user queries. In: 5th International Conference on Data Warehousing and Knowledge Discovery (DaWaK'03), 3-5 Sept 2003, Prague, Czech Republic.

Abstract

With the proliferation of XML-based data sources available across the Internet, it is increasingly important to provide users with a data warehouse of XML data sources to facilitate decision-making processes. Because of the extremely large amount of XML data available on the web, unguided warehousing of XML data is highly costly and usually cannot adequately accommodate users’ needs for XML data acquisition. In this paper, we propose an approach to materialize XML data warehouses based on frequent query patterns discovered from historical queries issued by users. The schemas of integrated XML documents in the warehouse are built using these frequent query patterns, represented as Frequent Query Pattern Trees (FreqQPTs). By using a hierarchical clustering technique, the integration approach in the data warehouse is flexible with respect to obtaining and maintaining XML documents. Experiments show that the overall processing of the same queries issued against the global schema becomes much more efficient when using the built XML data warehouse than when directly searching the multiple data sources.
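
To make the idea of a frequent query pattern concrete, the following is a minimal illustrative sketch and not the algorithm from the paper. It assumes each historical query can be reduced to a list of slash-separated element paths, and it treats a path as frequent when it appears in at least a given number of queries. The function name `frequent_paths`, the parameter `min_count`, and the sample queries are all hypothetical.

```python
from collections import Counter


def frequent_paths(queries, min_count):
    """Return the element paths that occur in at least min_count queries.

    Each query is a list of slash-separated paths (an assumed, simplified
    stand-in for a query pattern tree). Every prefix of a path is counted
    too, so a frequent path always has frequent ancestors.
    """
    counts = Counter()
    for paths in queries:
        seen = set()  # count each path at most once per query
        for path in paths:
            parts = path.strip("/").split("/")
            for i in range(1, len(parts) + 1):
                seen.add("/".join(parts[:i]))
        counts.update(seen)
    return {p for p, c in counts.items() if c >= min_count}


# Hypothetical query log: three queries over a "book" document.
queries = [
    ["book/title", "book/author"],
    ["book/title", "book/price"],
    ["book/title", "book/author/name"],
]
result = frequent_paths(queries, min_count=2)
print(sorted(result))
```

In this sketch the surviving paths form a tree rooted at `book`, which loosely mirrors how a frequent pattern could suggest which parts of the source documents are worth materializing. The paper's FreqQPT discovery and clustering steps are more involved than this path counting.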


Item Type: Conference or Workshop Item (Commonwealth Reporting Category E) (Paper)
Refereed: Yes
Item Status: Live Archive
Additional Information: Author's version deposited in accordance with the copyright policy of the publisher. The original publication is available at www.springerlink.com.
Depositing User: Dr Ji Zhang
Faculty / Department / School: Historic - Faculty of Sciences - Department of Maths and Computing
Date Deposited: 28 Sep 2009 06:53
Last Modified: 02 Jul 2013 23:23
Uncontrolled Keywords: XML data warehouses; frequent query patterns
Fields of Research (FOR2008): 08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080109 Pattern Recognition and Data Mining
URI: http://eprints.usq.edu.au/id/eprint/5657
