A two-stage text mining model for information filtering

Li, Yuefeng and Zhou, Xujuan and Bruza, Peter and Xu, Yue and Lau, Raymond Y.K. (2008) A two-stage text mining model for information filtering. In: 17th ACM conference on Information and Knowledge Management (CIKM '08), 23-30 Oct 2008, Napa Valley, United States.


Mismatch and overload are the two fundamental issues regarding the effectiveness of information filtering. Both term-based and pattern (phrase) based approaches have been employed to address these issues. However, both suffer from limitations with regard to effectiveness. This paper proposes a novel solution that includes two stages: an initial topic filtering stage followed by a stage involving pattern taxonomy mining. The objective of the first stage is to address mismatch by quickly filtering out probably irrelevant documents. The threshold used in the first stage is motivated theoretically. The objective of the second stage is to address overload by applying pattern mining techniques to rationalize the data relevance of the reduced document set after the first stage. Substantial experiments on RCV1 show that the proposed solution achieves encouraging performance.

Item Type: Conference or Workshop Item (Commonwealth Reporting Category E) (Paper)
Refereed: Yes
Item Status: Live Archive
Additional Information: Files associated with this item cannot be displayed due to copyright restrictions.
Faculty / Department / School: Historic - Faculty of Business - School of Management and Marketing
Date Deposited: 23 Nov 2016 01:48
Last Modified: 07 Feb 2018 02:43
Uncontrolled Keywords: Information Filtering, Text Mining, Decision Rules, Thresholds, Weighting Schema
Fields of Research: 08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080109 Pattern Recognition and Data Mining
Identification Number or DOI: 10.1145/1458082.1458218
URI: http://eprints.usq.edu.au/id/eprint/29688
