Mining world knowledge for analysis of search engine content

King, John D. and Li, Yuefeng and Tao, Xiaohui and Nayak, Richi (2007) Mining world knowledge for analysis of search engine content. Web Intelligence and Agent Systems: an International Journal, 5 (3). pp. 233-253. ISSN 1570-1263

Abstract

Little is known about the content of the major search engines. We present an automatic learning method which trains an ontology with world knowledge of hundreds of different subjects in a three-level taxonomy covering all the documents offered in our university library. We then mine this ontology to find important classification rules, and then use these rules to perform an extensive analysis of the content of the largest general purpose internet search engines in use today. Instead of representing documents and collections as a set of terms, we represent them as a set of subjects, which is a highly efficient representation, leading to a more robust representation of information and a decrease of synonymy.


Item Type: Article (Commonwealth Reporting Category C)
Refereed: Yes
Item Status: Live Archive
Additional Information: Permanent restricted access to paper due to publisher copyright policy.
Depositing User: Dr Xiaohui (Daniel) Tao
Faculty / Department / School: Historic - Faculty of Sciences - Department of Maths and Computing
Date Deposited: 02 Jan 2012 04:48
Last Modified: 03 Jul 2013 00:53
Uncontrolled Keywords: ontology; hierarchical classification; taxonomy; collection selection; search engines; data mining
Fields of Research (FOR2008): 08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080109 Pattern Recognition and Data Mining
08 Information and Computing Sciences > 0807 Library and Information Studies > 080704 Information Retrieval and Web Search
08 Information and Computing Sciences > 0805 Distributed Computing > 080501 Distributed and Grid Systems
Socio-Economic Objective (SEO2008): E Expanding Knowledge > 97 Expanding Knowledge > 970108 Expanding Knowledge in the Information and Computing Sciences
URI: http://eprints.usq.edu.au/id/eprint/20109
