Publishing anonymous survey rating data

Sun, Xiaoxun and Wang, Hua and Li, Jiuyong and Pei, Jian (2011) Publishing anonymous survey rating data. Data Mining and Knowledge Discovery, 23 (3). pp. 379-406. ISSN 1384-5810

Abstract

In this paper, we study the challenges of protecting the privacy of individuals in large public survey rating data. A recent study shows that individuals in supposedly anonymous movie rating records can be re-identified. Survey rating data usually contains ratings of both sensitive and non-sensitive issues, and the ratings of sensitive issues involve personal privacy. Even though survey participants do not reveal any of their ratings, their survey records are potentially identifiable using information from other public sources. None of the existing anonymisation principles (e.g., k-anonymity, l-diversity) can effectively prevent such breaches in large survey rating data sets. We tackle the problem by defining a principle called the (k, ε)-anonymity model to protect privacy. Intuitively, the principle requires that, for each transaction t in the given survey rating data T, at least (k - 1) other transactions in T must have ratings similar to t, where the similarity is controlled by ε. The (k, ε)-anonymity model is formulated through its graphical representation, and a specific graph-anonymisation problem is studied by applying graph modification techniques from graph theory. Various cases are analyzed and methods are developed to make the updated graph meet the (k, ε) requirements. The methods are applied to two real-life data sets to demonstrate their efficiency and practical utility.
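
The requirement stated in the abstract (every transaction needs at least k - 1 other transactions with similar ratings) can be illustrated with a small check. This is only a sketch under stated assumptions, not the paper's algorithm: the similarity test used here (every pair of ratings differs by at most a threshold, called `eps` below) is an assumed stand-in for the paper's similarity measure, and the function names and sample data are hypothetical.

```python
from typing import List, Sequence


def is_similar(a: Sequence[int], b: Sequence[int], eps: int) -> bool:
    """Assumed similarity test: ratings on every issue differ by at most eps."""
    return all(abs(x - y) <= eps for x, y in zip(a, b))


def violating_transactions(ratings: List[Sequence[int]], k: int, eps: int) -> List[int]:
    """Return indices of transactions with fewer than k - 1 similar others."""
    bad = []
    for i, t in enumerate(ratings):
        # Count the other transactions whose ratings are similar to t.
        neighbours = sum(
            1 for j, u in enumerate(ratings) if j != i and is_similar(t, u, eps)
        )
        if neighbours < k - 1:
            bad.append(i)
    return bad


def satisfies_requirement(ratings: List[Sequence[int]], k: int, eps: int) -> bool:
    """True if every transaction has at least k - 1 similar others."""
    return not violating_transactions(ratings, k, eps)


# Hypothetical data: four respondents rating three issues on a 1-5 scale.
sample = [[5, 4, 1], [5, 3, 1], [4, 4, 2], [1, 1, 5]]
print(violating_transactions(sample, k=3, eps=1))
```

With this sample, the last respondent has no similar neighbour and is the only one reported; the anonymisation methods the abstract refers to would then have to modify the data until no transaction is flagged.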


Item Type: Article (Commonwealth Reporting Category C)
Refereed: Yes
Item Status: Live Archive
Additional Information: Permanent restricted access to published version due to publisher copyright policy.
Depositing User: epEditor USQ
Faculty / Department / School: Historic - Faculty of Sciences - Department of Maths and Computing
Date Deposited: 13 Sep 2011 01:06
Last Modified: 03 Jul 2013 00:43
Uncontrolled Keywords: graphical representation; survey rating data; anonymity
Fields of Research (FOR2008): 08 Information and Computing Sciences > 0804 Data Format > 080402 Data Encryption
08 Information and Computing Sciences > 0803 Computer Software > 080303 Computer System Security
08 Information and Computing Sciences > 0806 Information Systems > 080608 Information Systems Development Methodologies
Socio-Economic Objective (SEO2008): E Expanding Knowledge > 97 Expanding Knowledge > 970108 Expanding Knowledge in the Information and Computing Sciences
Identification Number or DOI: doi: 10.1007/s10618-010-0208-4
URI: http://eprints.usq.edu.au/id/eprint/19359
