Microdata protection method through microaggregation: a median-based approach

Kabir, Md Enamul and Wang, Hua (2011) Microdata protection method through microaggregation: a median-based approach. Information Security Journal: A Global Perspective, 20 (1). pp. 1-8. ISSN 1939-3555

Official URL: http://www.tandfonline.com/toc/uiss20/current

Identification Number or DOI: 10.1080/19393555.2010.515288

Abstract

Microaggregation for statistical disclosure control (SDC) is a family of methods that protect microdata from individual identification. SDC seeks to protect microdata so that they can be published and mined without revealing private information that can be linked to specific individuals. The aim of SDC is to modify the original microdata in such a way that the modified data and the original data remain similar. Microaggregation works by partitioning the microdata into groups (also called clusters) of at least k records each, and then replacing the records in each group with the centroid of the group. In this work we introduce a new microaggregation method in which the median serves as the centroid. The new method uses statistical tests to guarantee that the microaggregated data and the original data are similar. Another contribution of this work is a proposed distance metric, called absolute deviation from median (ADM), to evaluate the amount of mutual information among records in microdata. We show that, for any dataset, ADM is always less than the most commonly used measure of distortion, the sum of squares of errors (SSE). Thus, ADM causes the least information loss and can be used as a measure of information loss for a microaggregated microdata set.
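The partition-and-replace step described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not say how groups are formed, so the sketch assumes univariate data and a simple sort-and-chunk partition. The function names (`partition_sorted`, `microaggregate`, `sse`, `adm`) are hypothetical, and the ADM formula used here (sum of absolute deviations from each group's median) is an assumption based on the metric's name.

```python
from statistics import mean, median


def partition_sorted(values, k):
    """Split the sorted values into consecutive groups of at least k records.

    Assumption: a naive sort-and-chunk partition, used only for illustration.
    """
    if k < 1 or len(values) < k:
        raise ValueError("need k >= 1 and at least k records")
    ordered = sorted(values)
    groups = [ordered[i:i + k] for i in range(0, len(ordered), k)]
    # A short trailing group is folded into the previous one,
    # so that every group keeps at least k records.
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)
    return groups


def microaggregate(groups):
    """Replace every record in a group with the group's median."""
    return [[median(g)] * len(g) for g in groups]


def sse(groups):
    """Sum of squared deviations from each group's mean."""
    return sum((x - mean(g)) ** 2 for g in groups for x in g)


def adm(groups):
    """Sum of absolute deviations from each group's median (assumed definition)."""
    return sum(abs(x - median(g)) for g in groups for x in g)


toy = [50, 3, 11, 1, 12, 2, 10]   # made-up values, not data from the paper
groups = partition_sorted(toy, k=3)
masked = microaggregate(groups)
print(groups)
print(masked)
print("SSE:", sse(groups), "ADM:", adm(groups))
```

On this toy input the two measures can be compared directly. A single example like this only illustrates the computation and does not establish the general relationship between ADM and SSE claimed in the abstract.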

Item Type: Article (Commonwealth Reporting Category C)
Additional Information: Submitted version deposited in accordance with the copyright policy of the publisher.
Uncontrolled Keywords: privacy; microaggregation; microdata protection; k-anonymity; disclosure control
Fields of Research (FOR2008): 01 Mathematical Sciences > 0104 Statistics > 010405 Statistical Theory
    08 Information and Computing Sciences > 0804 Data Format > 080402 Data Encryption
    08 Information and Computing Sciences > 0803 Computer Software > 080303 Computer System Security
Subjects: UNSPECIFIED
Socio-Economic Objective (SEO2008): E Expanding Knowledge > 97 Expanding Knowledge > 970108 Expanding Knowledge in the Information and Computing Sciences
ID Code: 18279
Deposited By:
Deposited On: 19 Jul 2011 11:42
Last Modified: 08 Jun 2012 14:23
