Rough sets based reasoning and pattern mining for a two-stage information filtering system

Zhou, Xujuan and Li, Yuefeng and Bruza, Peter and Xu, Yue and Lau, Raymond Y. K. (2010) Rough sets based reasoning and pattern mining for a two-stage information filtering system. In: 19th ACM International Conference on Information and Knowledge Management (CIKM 2010), 26-30 Oct 2010, Toronto, ON, Canada.


This paper presents a novel two-stage information filtering (IF) model that combines the merits of term-based and pattern-based approaches to effectively filter the sheer volume of available information. In particular, the first filtering stage is supported by a novel rough analysis model which efficiently removes a large number of irrelevant documents, thereby addressing the overload problem. The second filtering stage is empowered by a semantically rich pattern taxonomy mining model which effectively fetches incoming documents according to the specific information needs of a user, thereby addressing the mismatch problem. Experiments were conducted to compare the proposed two-stage filtering (T-SM) model with other possible 'term-based + pattern-based' or 'term-based + term-based' IF models. The results on the RCV1 corpus show that the T-SM model significantly outperforms other types of 'two-stage' IF models.
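
The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the rough analysis model and the pattern taxonomy mining model are not reproduced here, and simple weighted sums stand in for both. All names (term_score, pattern_score, two_stage_filter), the weights, and the threshold are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple


def term_score(doc_terms: List[str], term_weights: Dict[str, float]) -> float:
    """Stage 1 stand-in (assumption): sum the weights of distinct profile terms in the document."""
    return sum(term_weights.get(term, 0.0) for term in set(doc_terms))


def pattern_score(doc_terms: List[str], patterns: Dict[FrozenSet[str], float]) -> float:
    """Stage 2 stand-in (assumption): sum the weights of patterns (term sets) fully contained in the document."""
    present = set(doc_terms)
    return sum(weight for pattern, weight in patterns.items() if pattern <= present)


def two_stage_filter(
    docs: Dict[str, List[str]],
    term_weights: Dict[str, float],
    patterns: Dict[FrozenSet[str], float],
    threshold: float,
) -> List[Tuple[str, float]]:
    """Drop documents scoring below the term threshold, then rank the survivors by pattern score."""
    # Stage 1: cheap term-based screening removes likely irrelevant documents.
    survivors = {
        doc_id: terms
        for doc_id, terms in docs.items()
        if term_score(terms, term_weights) >= threshold
    }
    # Stage 2: pattern-based scoring ranks only the documents that passed stage 1.
    scored = [(doc_id, pattern_score(terms, patterns)) for doc_id, terms in survivors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data, not taken from the paper or from RCV1.
docs = {
    "d1": ["oil", "price", "market"],
    "d2": ["football", "match"],
    "d3": ["oil", "market"],
}
term_weights = {"oil": 1.0, "price": 0.5, "market": 0.5}
patterns = {frozenset({"oil", "price"}): 2.0, frozenset({"oil", "market"}): 1.0}

ranked = two_stage_filter(docs, term_weights, patterns, threshold=1.0)
print(ranked)
```

In this toy run, d2 is discarded in the first stage because it contains no profile terms, and the second stage ranks d1 above d3 because d1 matches both patterns. The point of the split is that the more expensive pattern matching runs only on documents that survive the cheaper term-based screen.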

Statistics for USQ ePrint 30987
Item Type: Conference or Workshop Item (Commonwealth Reporting Category E) (Paper)
Refereed: Yes
Item Status: Live Archive
Additional Information: Permanent restricted access to Published version, in accordance with the copyright policy of the publisher.
Faculty/School / Institute/Centre: No Faculty
Date Deposited: 29 Nov 2017 02:29
Last Modified: 01 Dec 2017 01:26
Uncontrolled Keywords: information filtering; decision; experimentation; theory
Fields of Research (2008): 08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080109 Pattern Recognition and Data Mining
Fields of Research (2020): 46 INFORMATION AND COMPUTING SCIENCES > 4699 Other information and computing sciences > 469999 Other information and computing sciences not elsewhere classified
Identification Number or DOI:
