Retrieving information from microblog using pattern mining and relevance feedback

Lau, Cher Han and Tao, Xiaohui and Tjondronegoro, Dian and Li, Yuefeng (2012) Retrieving information from microblog using pattern mining and relevance feedback. In: 3rd International Conference on Data and Knowledge Engineering (ICDKE 2012), 21-23 Nov 2012, Fujian, China.

Abstract

Retrieving information from Twitter is always challenging due to its large volume, inconsistent writing and noise. Most existing information retrieval (IR) and text mining methods focus on term-based approaches, but suffer from problems of term variation such as polysemy and synonymy. This problem worsens when such methods are applied to Twitter because of the length limit on tweets. Over the years, people have held the hypothesis that pattern-based methods should perform better than term-based methods because they provide more context, but limited studies have been conducted to support this hypothesis, especially on Twitter.
This paper presents an innovative framework to address the issue of performing IR in microblogs. The proposed framework discovers patterns in tweets as higher-level features and uses them to assign weights to low-level features (i.e. terms) based on the terms' distributions in the higher-level features. We present experimental results based on the TREC11 microblog dataset and show that our proposed approach significantly outperforms the term-based methods Okapi BM25 and TF-IDF, as well as pattern-based methods, on precision, recall and F measures.
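The abstract does not spell out the weighting formula, so the following is only a minimal sketch of the general idea of weighting terms by their distribution in discovered patterns. It assumes that patterns are frequent termsets mined from relevance-feedback tweets and that each pattern shares its support equally among its terms; the function names (mine_patterns, deploy_weights, score), the brute-force mining step and the toy tweets are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def mine_patterns(docs, min_support=2, max_size=2):
    """Return {termset: support} for termsets found in >= min_support docs.

    Brute-force enumeration, an assumption made for brevity; a real system
    would use a proper frequent or closed pattern mining algorithm.
    """
    counts = Counter()
    for doc in docs:
        terms = sorted(set(doc))
        for size in range(1, max_size + 1):
            for combo in combinations(terms, size):
                counts[frozenset(combo)] += 1
    return {p: s for p, s in counts.items() if s >= min_support}


def deploy_weights(patterns):
    """Spread each pattern's support evenly over its terms (assumed scheme)."""
    weights = Counter()
    for pattern, support in patterns.items():
        share = support / len(pattern)
        for term in pattern:
            weights[term] += share
    return dict(weights)


def score(doc, weights):
    """Score a tweet as the sum of the weights of its distinct terms."""
    return sum(weights.get(term, 0.0) for term in set(doc))


# Toy relevance-feedback tweets, already tokenised (illustrative data only).
relevant = [
    ["earthquake", "japan", "tsunami"],
    ["earthquake", "japan", "relief"],
    ["japan", "tsunami", "warning"],
]
patterns = mine_patterns(relevant, min_support=2, max_size=2)
weights = deploy_weights(patterns)

# Rank unseen tweets by the pattern-derived term weights.
candidates = [
    ["lunch", "coffee"],
    ["japan", "earthquake", "news"],
    ["tsunami", "video"],
]
ranked = sorted(candidates, key=lambda d: score(d, weights), reverse=True)
```

In this sketch a term that appears in many frequent patterns, such as "japan" in the toy data, ends up with a higher weight than a term that is frequent only on its own, which is one way to let pattern context inform term-level ranking.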


Item Type: Conference or Workshop Item (Commonwealth Reporting Category E) (Paper)
Refereed: Yes
Item Status: Live Archive
Additional Information: Series: Lecture Notes in Computer Science, Vol. 7696 Permanent restricted access to published version due to publisher copyright policy.
Faculty / Department / School: Historic - Faculty of Sciences - Department of Maths and Computing
Date Deposited: 16 Apr 2013 06:07
Last Modified: 10 Apr 2017 01:21
Uncontrolled Keywords: online searching; twitter; searching techniques
Fields of Research: 08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080109 Pattern Recognition and Data Mining
08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080107 Natural Language Processing
08 Information and Computing Sciences > 0807 Library and Information Studies > 080703 Human Information Behaviour
Socio-Economic Objective: E Expanding Knowledge > 97 Expanding Knowledge > 970108 Expanding Knowledge in the Information and Computing Sciences
Identification Number or DOI: 10.1007/978-3-642-34679-8_15
URI: http://eprints.usq.edu.au/id/eprint/23122
