Li, Jiuyong and Wong, Raymond Chi-Wing and Fu, Ada Wai-Chee and Pei, Jian (2006) Achieving k-Anonymity by clustering in attribute hierarchical structures. In: 8th International Conference on Data Warehousing and Knowledge Discovery, 4-8 Sept 2006, Krakow, Poland.
[Abstract]: Individual privacy will be at risk if a published data set is not properly de-identified. k-anonymity is a major technique to de-identify a data set. A more general view of k-anonymity is clustering with a constraint of the minimum number of objects in every cluster. Most existing approaches to achieving k-anonymity by clustering are for numerical (or ordinal) attributes. In this paper, we study achieving k-anonymity by clustering in attribute hierarchical structures. We define generalisation distances between tuples to characterise distortions by generalisations and discuss the properties of the distances. We conclude that the generalisation distance is a metric distance. We propose an efficient clustering-based algorithm for k-anonymisation. We experimentally show that the proposed method is more scalable and causes significantly less distortion than an optimal global recoding k-anonymity method.
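The clustering view of k-anonymity mentioned in the abstract can be illustrated with a minimal sketch: a table is k-anonymous when every group of tuples sharing the same quasi-identifier values contains at least k tuples. The function name `is_k_anonymous`, the toy table, and the attribute names below are assumptions made for illustration; they are not taken from the paper, and the sketch does not reproduce the paper's generalisation distance or its clustering algorithm.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    is shared by at least k rows (i.e. every cluster has size >= k)."""
    # Count how many rows fall into each quasi-identifier group.
    groups = Counter(
        tuple(row[attr] for attr in quasi_identifiers) for row in rows
    )
    return all(count >= k for count in groups.values())


# Hypothetical toy table whose age and postcode values have already been
# generalised to higher levels of their attribute hierarchies.
rows = [
    {"age": "30-39", "postcode": "43**", "disease": "flu"},
    {"age": "30-39", "postcode": "43**", "disease": "asthma"},
    {"age": "40-49", "postcode": "43**", "disease": "flu"},
    {"age": "40-49", "postcode": "43**", "disease": "diabetes"},
]

print(is_k_anonymous(rows, ["age", "postcode"], 2))
print(is_k_anonymous(rows, ["age", "postcode"], 3))
```

With this toy table each quasi-identifier group holds two tuples, so the check succeeds for k = 2 and fails for k = 3.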
|Item Type:||Conference or Workshop Item (Commonwealth Reporting Category E) (Paper)|
|Item Status:||Live Archive|
|Additional Information:||Deposited in accordance with the copyright policy of the publisher. Copyright 2006 Springer. This is the authors' version of the work. It is posted here with permission of the publisher for your personal use. No further distribution is permitted. The item is also available in Lecture Notes in Computer Science v. 4081 at http://www.springerlink.com|
|Depositing User:||Dr Jiuyong (John) Li|
|Faculty / Department / School:||Historic - Faculty of Sciences - Department of Maths and Computing|
|Date Deposited:||11 Oct 2007 00:57|
|Last Modified:||02 Jul 2013 22:42|
|Uncontrolled Keywords:||data mining; privacy preserving; k-anonymity|
|Fields of Research (FOR2008):||08 Information and Computing Sciences > 0804 Data Format > 080403 Data Structures; 08 Information and Computing Sciences > 0806 Information Systems > 080609 Information Systems Management|