King, John D. and Li, Yuefeng and Tao, Xiaohui and Nayak, Richi (2007) Mining world knowledge for analysis of search engine content. Web Intelligence and Agent Systems: an International Journal, 5 (3). pp. 233-253. ISSN 1570-1263
Full text not available from this archive.
Official URL: http://wi-consortium.org/wicweb/html/journal.php
Little is known about the content of the major search engines. We present an automatic learning method that trains an ontology with world knowledge of hundreds of different subjects in a three-level taxonomy covering all the documents offered in our university library. We then mine this ontology to find important classification rules and use these rules to perform an extensive analysis of the content of the largest general-purpose internet search engines in use today. Instead of representing documents and collections as sets of terms, we represent them as sets of subjects, a highly efficient representation that is more robust and reduces synonymy.
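The idea of representing a document by subjects rather than terms can be illustrated with a minimal sketch. This is not the authors' method: the hand-written TERM_TO_SUBJECT dictionary, the subject names, and the subject_profile function are all hypothetical stand-ins for the learned ontology and mined classification rules described in the abstract. The sketch only shows why two documents that share no terms can still share a subject.

```python
from collections import Counter

# Hypothetical toy mapping from terms to subjects. The paper learns its
# ontology from library data; this dictionary is only an illustrative stand-in.
TERM_TO_SUBJECT = {
    "car": "Transportation",
    "automobile": "Transportation",
    "engine": "Transportation",
    "physician": "Medicine",
    "doctor": "Medicine",
}


def subject_profile(text: str) -> Counter:
    """Count the subjects whose terms occur in the text; unknown terms are ignored."""
    tokens = text.lower().split()
    return Counter(TERM_TO_SUBJECT[t] for t in tokens if t in TERM_TO_SUBJECT)


doc_a = "the car engine failed"
doc_b = "an automobile needs a doctor"

profile_a = subject_profile(doc_a)
profile_b = subject_profile(doc_b)

# Term-level overlap is empty because the documents use different words.
shared_terms = set(doc_a.split()) & set(doc_b.split())

# Subject-level overlap is not empty: "car" and "automobile" map to one subject.
shared_subjects = set(profile_a) & set(profile_b)

print(shared_terms)
print(shared_subjects)
```

In this toy case the synonyms "car" and "automobile" collapse into a single subject, so the two documents become comparable even though their vocabularies do not overlap.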
|Item Type:||Article (Commonwealth Reporting Category C)|
|Additional Information:||Permanent restricted access to paper due to publisher copyright policy.|
|Uncontrolled Keywords:||ontology; hierarchical classification; taxonomy; collection selection; search engines; data mining|
|Fields of Research (FOR2008):||08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080109 Pattern Recognition and Data Mining|
08 Information and Computing Sciences > 0807 Library and Information Studies > 080704 Information Retrieval and Web Search
08 Information and Computing Sciences > 0805 Distributed Computing > 080501 Distributed and Grid Systems
|Socio-Economic Objective (SEO2008):||E Expanding Knowledge > 97 Expanding Knowledge > 970108 Expanding Knowledge in the Information and Computing Sciences|
|Deposited On:||02 Jan 2012 14:48|
|Last Modified:||01 Aug 2012 13:50|