Achieving k-Anonymity by clustering in attribute hierarchical structures

Li, Jiuyong and Wong, Raymond Chi-Wing and Fu, Ada Wai-Chee and Pei, Jian (2006) Achieving k-Anonymity by clustering in attribute hierarchical structures. In: 8th International Conference on Data Warehousing and Knowledge Discovery, 4-8 Sept 2006, Krakow, Poland.

Official URL: http://dx.doi.org/10.1007/11823728

Abstract

Individual privacy will be at risk if a published data set is not properly de-identified. k-anonymity is a major technique to de-identify a data set. A more general view of k-anonymity is clustering with a constraint of the minimum number of objects in every cluster. Most existing approaches to achieving k-anonymity by clustering are for numerical (or ordinal) attributes. In this paper, we study achieving k-anonymity by clustering in attribute hierarchical structures. We define generalisation distances between tuples to characterise distortions by generalisations and discuss the properties of the distances. We conclude that the generalisation distance is a metric distance. We propose an efficient clustering-based algorithm for k-anonymisation. We experimentally show that the proposed method is more scalable and causes significantly less distortion than an optimal global recoding k-anonymity method.
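
The ideas in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm or its definition of generalisation distance, neither of which is given in this record. The hierarchy `PARENT`, the function names, and the choice of tree path length as the distance are all assumptions made for illustration only. The sketch shows (a) k-anonymity viewed as a minimum-size constraint on groups of identical tuples and (b) generalising a cluster of values to their lowest common ancestor in an attribute hierarchy.

```python
from collections import Counter

# Hypothetical attribute hierarchy (child -> parent); the root "*" has no parent.
PARENT = {
    "*": None,
    "Europe": "*", "Asia": "*",
    "Poland": "Europe", "France": "Europe",
    "China": "Asia", "Japan": "Asia",
}

def ancestors(value):
    """Return the path from value up to the root, inclusive."""
    path = []
    while value is not None:
        path.append(value)
        value = PARENT[value]
    return path

def lowest_common_ancestor(a, b):
    """Return the most specific hierarchy node that generalises both a and b."""
    seen = set(ancestors(a))
    for node in ancestors(b):
        if node in seen:
            return node
    raise ValueError("values do not share a hierarchy")

def generalisation_steps(a, b):
    """Assumed distance: steps needed to lift a and b to their common ancestor."""
    lca = lowest_common_ancestor(a, b)
    return ancestors(a).index(lca) + ancestors(b).index(lca)

def generalise_cluster(values):
    """Replace every value in a cluster with the cluster's common ancestor."""
    lca = values[0]
    for v in values[1:]:
        lca = lowest_common_ancestor(lca, v)
    return [lca] * len(values)

def is_k_anonymous(rows, k):
    """True if every distinct tuple occurs at least k times."""
    return all(count >= k for count in Counter(rows).values())

raw = ["Poland", "France", "China", "Japan"]
# Cluster the two closest pairs (chosen by hand here) and generalise each cluster.
published = generalise_cluster(raw[:2]) + generalise_cluster(raw[2:])
```

In this toy example the raw values are not 2-anonymous, while the published values (`Europe`, `Europe`, `Asia`, `Asia`) are. Pairing values with a small distance keeps the generalisation low in the hierarchy, which is the intuition behind clustering to limit distortion.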

Item Type: Conference or Workshop Item (Commonwealth Reporting Category E) (Paper)
Additional Information: Deposited in accordance with the copyright policy of the publisher. Copyright 2006 Springer. This is the authors' version of the work. It is posted here with permission of the publisher for your personal use. No further distribution is permitted. The item is also available in Lecture Notes in Computer Science v. 4081 at http://www.springerlink.com
Uncontrolled Keywords: data mining; privacy preserving; k-anonymity
Fields of Research (FOR2008): 08 Information and Computing Sciences > 0804 Data Format > 080403 Data Structures
08 Information and Computing Sciences > 0806 Information Systems > 080609 Information Systems Management
Subjects: 280000 Information, Computing and Communication Sciences
Socio-Economic Objective (SEO2008): UNSPECIFIED
ID Code: 2090
Deposited On: 11 Oct 2007 10:57
Last Modified: 14 Oct 2011 13:21
