Sun, Xiaoxun and Wang, Hua and Li, Jiuyong (2011) Validating privacy requirements in large survey rating data. In: Bessis, Nik and Xhafa, Fatos (eds.) Next generation data technologies for collective computational intelligence. Studies in Computational Intelligence (352). Springer-Verlag Berlin and Heidelberg GmbH, Berlin, Germany, pp. 445-469. ISBN 978-3-642-20343-5
Full text available as: PDF (Documentation), 750 KB
Official URL: http://www.springerlink.com/content/m2p1828m61050114/
Identification Number or DOI: 10.1007/978-3-642-20344-2_17
Abstract
A recent study shows that supposedly anonymous movie rating records can be re-identified using only a little auxiliary information. In this chapter, we study the problem of protecting the privacy of individuals in large public survey rating data. Such data usually contains ratings on both sensitive and non-sensitive issues, and the ratings on sensitive issues are private personal information. Even when survey participants do not reveal any of their ratings, their survey records are potentially identifiable by using information from other public sources. To address this, we propose a novel (k, ε, l)-anonymity model to protect privacy in large survey rating data, in which each survey record is required to be 'similar' to at least k−1 others based on the non-sensitive ratings, where the similarity is controlled by ε, and the standard deviation of the sensitive ratings is at least l. We study an interesting yet non-trivial satisfaction problem for the proposed model: deciding whether a survey rating data set satisfies the privacy requirements given by the user. For this problem, we investigate its inherent properties theoretically and devise a novel slicing technique to solve it. We also discuss how to anonymize data by using the result of the satisfaction problem. Finally, we conduct extensive experiments on two real-life data sets. The results show that the slicing technique is fast, scales with data size, and is much more efficient in execution time and space overhead than the heuristic pairwise method.
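The satisfaction problem stated in the abstract can be illustrated with a naive pairwise check. The following is only a minimal sketch under stated assumptions, not the chapter's slicing technique and not necessarily its exact definitions: it assumes one sensitive rating per record, treats two records as 'similar' when every non-sensitive rating differs by at most ε, and uses the population standard deviation. The function names `is_close` and `satisfies_pairwise` are hypothetical.

```python
from statistics import pstdev


def is_close(a, b, eps):
    """Assumed similarity test: every non-sensitive rating differs by at most eps."""
    return all(abs(x - y) <= eps for x, y in zip(a, b))


def satisfies_pairwise(records, k, eps, l):
    """Naive O(n^2) check of the (k, eps, l) requirements.

    records: list of (non_sensitive_ratings, sensitive_rating) pairs.
    Assumption: a record passes if at least k records (itself included) are
    eps-close to it and the population standard deviation of their sensitive
    ratings is at least l.
    """
    for ratings, _ in records:
        # Sensitive ratings of all records that are eps-close to this one.
        group = [s for other, s in records if is_close(ratings, other, eps)]
        if len(group) < k or pstdev(group) < l:
            return False
    return True


# Made-up example data: three mutually close records plus one outlier.
close_three = [((5, 4, 1), 1), ((5, 3, 1), 5), ((4, 4, 2), 3)]
with_outlier = close_three + [((1, 1, 5), 2)]

print(satisfies_pairwise(close_three, k=3, eps=1, l=1.5))   # True
print(satisfies_pairwise(with_outlier, k=2, eps=1, l=1.0))  # False
```

Every record is compared against every other, so the cost grows quadratically with the number of records; this is the kind of pairwise baseline the abstract contrasts with the slicing technique.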
| Item Type: | Book Chapter (Commonwealth Reporting Category B) |
|---|---|
| Additional Information: | Chapter 17. Author's version deposited with blanket permission of publisher. Print version held in USQ Library at call no. 006.3 Nex. |
| Uncontrolled Keywords: | privacy; requirements; survey rating data |
| Fields of Research (FOR2008): | 08 Information and Computing Sciences > 0806 Information Systems > 080604 Database Management; 08 Information and Computing Sciences > 0803 Computer Software > 080303 Computer System Security; 08 Information and Computing Sciences > 0806 Information Systems > 080608 Information Systems Development Methodologies |
| Subjects: | UNSPECIFIED |
| Socio-Economic Objective (SEO2008): | B Economic Development > 89 Information and Communication Services > 8902 Computer Software and Services > 890205 Information Processing Services (incl. Data Entry and Capture) |
| ID Code: | 18293 |
| Deposited By: | |
| Deposited On: | 02 Aug 2011 14:36 |
| Last Modified: | 19 Sep 2012 16:13 |
