Relevance assessment of crowdsourced data (CSD) using semantics and geographic information retrieval (GIR) techniques

Koswatte, Saman and McDougall, Kevin and Liu, Xiaoye (2018) Relevance assessment of crowdsourced data (CSD) using semantics and geographic information retrieval (GIR) techniques. International Journal of Geo-Information, 7 (7). pp. 1-18. ISSN 2220-9964

Text (Published Version)
Available under License Creative Commons Attribution 4.0.



Crowdsourced data (CSD) generated by citizens is becoming more popular as its currency and availability increase its potential use in many applications. However, the quality of CSD, including its relevance, is often questioned because the data is neither generated by professionals nor collected according to standard data-collection procedures. Relevance is one of a range of characteristics by which the quality of CSD can be assessed. In this paper, information relevance is explored using geographic information retrieval (GIR) techniques to identify the most relevant information in a set of crowdsourced data. The research tested a relevance assessment approach for CSD by adapting relevance assessment techniques available in the GIR domain. Thematic and geographic relevance were assessed by using natural language processing techniques to analyze the frequency of selected terms appearing in CSD reports. The study analyzed crowdsourced reports from the 2011 Australian floods Crowdmap, using a subset of this dataset and a defined set of queries as a proof of concept for relevance assessment. The results showed that the thematic and geographic specificities of the queries were 0.44 and 0.67, respectively, indicating that the queries used were more geographically specific than thematically specific. A Spearman’s rho value of 0.62 indicated that the final ranked relevance lists showed reasonable agreement with a manually classified list and confirmed the potential of the approach for CSD relevance assessment. In particular, this research contributes to the field of CSD relevance assessment through an integrated thematic and geographic relevance ranking process that uses a user-query specificity approach to improve the final ranking.
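The pipeline summarized in the abstract (term-frequency scoring, a specificity-weighted combination of thematic and geographic relevance, and a rank comparison using Spearman's rho) can be illustrated with a minimal Python sketch. The abstract does not give the paper's formulas, so everything here is an assumption made for illustration: the function names, the normalized term-frequency score, the use of the two specificity values as linear weights, and the sample reports and term lists are all hypothetical and are not the authors' implementation.

```python
import re
from collections import Counter


def term_frequency_score(text, terms):
    """Share of tokens in `text` that match any of `terms` (assumed scoring rule)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    return sum(counts[t] for t in terms) / len(tokens)


def combined_score(thematic, geographic, w_thematic, w_geographic):
    """Weighted mean of the two scores; using specificities as weights is an assumption."""
    total = w_thematic + w_geographic
    return (w_thematic * thematic + w_geographic * geographic) / total


def rank(scores):
    """1-based ranks, highest score first; ties are broken by input order."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def spearman_rho(rank_a, rank_b):
    """Spearman's rho for two rank lists of equal length with no tied ranks."""
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


if __name__ == "__main__":
    # Hypothetical reports and query terms, not taken from the Crowdmap dataset.
    reports = [
        "Flood water rising near Brisbane river, roads closed in Brisbane",
        "Power outage reported in Toowoomba",
        "Flood damage to houses",
    ]
    thematic_terms = {"flood", "water"}
    geographic_terms = {"brisbane"}

    scores = [
        combined_score(
            term_frequency_score(r, thematic_terms),
            term_frequency_score(r, geographic_terms),
            w_thematic=0.44,    # thematic specificity reported in the abstract
            w_geographic=0.67,  # geographic specificity reported in the abstract
        )
        for r in reports
    ]
    automatic_ranks = rank(scores)
    manual_ranks = [1, 2, 3]  # hypothetical manual classification
    print(automatic_ranks, spearman_rho(automatic_ranks, manual_ranks))
```

The simplified rho formula used here holds only when neither list contains tied ranks; for real data with ties, a library routine such as `scipy.stats.spearmanr` would be a safer choice.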

Statistics for USQ ePrint 34386
Item Type: Article (Commonwealth Reporting Category C)
Refereed: Yes
Item Status: Live Archive
Additional Information: © 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (
Faculty/School / Institute/Centre: Current - Faculty of Health, Engineering and Sciences - School of Civil Engineering and Surveying
Date Deposited: 11 Jul 2018 02:07
Last Modified: 21 Sep 2018 05:25
Uncontrolled Keywords: crowdsourced data; relevance; semantics; geographic information retrieval; natural language processing
Fields of Research: 08 Information and Computing Sciences > 0899 Other Information and Computing Sciences > 089999 Information and Computing Sciences not elsewhere classified
Socio-Economic Objective: E Expanding Knowledge > 97 Expanding Knowledge > 970108 Expanding Knowledge in the Information and Computing Sciences
Identification Number or DOI: 10.3390/ijgi7070256
