On the identity anonymization of high-dimensional rating data

Sun, Xiaoxun and Wang, Hua and Zhang, Yanchun (2012) On the identity anonymization of high-dimensional rating data. Concurrency and Computation: Practice and Experience, 24 (10). pp. 1108-1122. ISSN 1532-0626

Abstract

We study the challenges of protecting the privacy of individuals in large public survey rating data. Survey rating data usually contains ratings of both sensitive and non-sensitive issues. The ratings of sensitive issues involve personal privacy. Although the survey participants do not reveal any of their ratings, their survey records are potentially identifiable by using information from other public sources. None of the existing anonymization principles (e.g. k-anonymity, l-diversity) can effectively prevent such breaches in large survey rating data sets. In this paper, we tackle the problem by defining a principle called (k, ε, l)-anonymity. The principle requires that, for each transaction t in the given survey rating data T, at least (k − 1) other transactions in T must have ratings similar to t, where the similarity is controlled by ε and the standard deviation of sensitive ratings is at least l. We propose a greedy approach to anonymize the survey rating data that scales almost linearly with the input size, and we apply the method to two real-life data sets to demonstrate its efficiency and practical utility.
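
The (k, ε, l) condition stated in the abstract can be illustrated with a small check. This is only a sketch under assumptions not given in the abstract: two transactions are treated as similar when every non-sensitive rating differs by at most ε, each transaction carries a single sensitive rating, and the population standard deviation is used. The names `is_similar` and `satisfies_k_eps_l` are hypothetical, and the paper's own definitions and greedy algorithm may differ.

```python
from statistics import pstdev


def is_similar(a, b, eps):
    # Assumed similarity measure: every pair of ratings differs by at most eps.
    return all(abs(x - y) <= eps for x, y in zip(a, b))


def satisfies_k_eps_l(records, k, eps, l):
    """Check a (k, eps, l)-style condition on a list of records.

    Each record is a pair (non_sensitive_ratings, sensitive_rating).
    This brute-force check is quadratic in the number of records; it is
    not the greedy anonymization method the abstract refers to.
    """
    for ratings_t, _ in records:
        # Sensitive ratings of all records similar to t (t itself included).
        group = [s for ratings, s in records if is_similar(ratings_t, ratings, eps)]
        if len(group) < k:
            return False  # fewer than k - 1 other similar transactions
        if pstdev(group) < l:
            return False  # sensitive ratings in the group are too uniform
    return True


# Illustrative toy data, not taken from the paper.
toy = [((5, 4), 1), ((5, 5), 5), ((4, 4), 3)]
ok = satisfies_k_eps_l(toy, k=3, eps=1, l=1.5)
```

In the toy data all three records are mutually similar for ε = 1, so each has two similar peers, and the spread of the sensitive ratings 1, 5, 3 decides whether the l threshold is met.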


Item Type: Article (Commonwealth Reporting Category C)
Refereed: Yes
Item Status: Live Archive
Additional Information: © 2011 John Wiley & Sons, Ltd. First published online 24 Mar 2011. Permanent restricted access to paper due to publisher copyright restrictions.
Depositing User: Dr Hua Wang
Faculty / Department / School: Historic - Faculty of Sciences - Department of Maths and Computing
Date Deposited: 26 Feb 2012 06:23
Last Modified: 14 Oct 2014 04:51
Uncontrolled Keywords: privacy; data anonymization; survey rating data
Fields of Research (FOR2008): 08 Information and Computing Sciences > 0806 Information Systems > 080604 Database Management
08 Information and Computing Sciences > 0803 Computer Software > 080303 Computer System Security
08 Information and Computing Sciences > 0806 Information Systems > 080609 Information Systems Management
Socio-Economic Objective (SEO2008): E Expanding Knowledge > 97 Expanding Knowledge > 970108 Expanding Knowledge in the Information and Computing Sciences
Identification Number or DOI: doi: 10.1002/cpe.1724
URI: http://eprints.usq.edu.au/id/eprint/20813
