Semantic labelling for document feature patterns using ontological subjects

Tao, Xiaohui and Li, Yuefeng and Liu, Bin and Shen, Yan (2012) Semantic labelling for document feature patterns using ontological subjects. In: 2012 IEEE/WIC/ACM International Conference on Web Intelligence (WI 2012), 4-7 Dec 2012, Macau, China.


Finding and labelling semantic feature patterns of documents in a large, spatial corpus is a challenging problem. Text documents have characteristics that make semantic labelling difficult, and the rapidly increasing volume of online documents creates a bottleneck in finding meaningful textual patterns. To address these issues, we propose an unsupervised document labelling approach based on semantic content and feature patterns. A world ontology with extensive topic coverage is exploited to supply controlled, structured subjects for labelling. An algorithm is also introduced to reduce dimensionality based on the study of ontological structure. The proposed approach was evaluated by comparison with typical machine learning methods including SVMs, Rocchio, and kNN, with promising results.

Item Type: Conference or Workshop Item (Commonwealth Reporting Category E) (Paper)
Refereed: Yes
Item Status: Live Archive
Additional Information: Copyright 2012 The Institute of Electrical and Electronics Engineers, Inc. Permanent restricted access to published version due to publisher copyright policy.
Faculty/School / Institute/Centre: Historic - Faculty of Sciences - Department of Maths and Computing
Date Deposited: 30 Apr 2013 03:54
Last Modified: 25 Aug 2016 05:21
Uncontrolled Keywords: text classification; document patterns; ontology learning; feature selection; semantic labelling; pattern
Fields of Research: 08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080109 Pattern Recognition and Data Mining
08 Information and Computing Sciences > 0806 Information Systems > 080604 Database Management
08 Information and Computing Sciences > 0807 Library and Information Studies > 080703 Human Information Behaviour
Socio-Economic Objective: E Expanding Knowledge > 97 Expanding Knowledge > 970108 Expanding Knowledge in the Information and Computing Sciences
Identification Number or DOI: 10.1109/WI-IAT.2012.47
