Kabir, Md Enamul and Wang, Hua and Zhang, Yanchun (2010) A pairwise-systematic microaggregation for statistical disclosure control. In: ICDM 2010: 10th IEEE International Conference on Data Mining, 14-17 Dec 2010, Sydney, Australia.
Official URL: http://doi.ieeecomputersociety.org/10.1109/ICDM.2010.111
Identification Number or DOI: 10.1109/ICDM.2010.111
Abstract
Microdata protection in statistical databases has become a major societal concern and has been intensively studied in recent years. Statistical Disclosure Control (SDC) is often applied to statistical databases before they are released for public use. SDC seeks to protect microdata in such a way that they can be published and mined without revealing any private information that can be linked to specific individuals. Microaggregation for SDC is a family of methods that protect microdata from individual identification. Microaggregation works by partitioning the microdata into groups of at least 𝑘 records and then replacing the records in each group with the centroid of the group. An optimal microaggregation method must minimize the information loss resulting from this replacement, and achieving that minimum is the central challenge. This paper presents a pairwise-systematic (P-S) microaggregation method to minimize the information loss. The proposed technique forms two distant groups at a time, systematically placing similar records together in each group, and then anonymizes each group by replacing its records with the group's centroid. The structure of the P-S problem is defined and investigated, and an algorithm for solving it is developed. The performance of the P-S algorithm is compared against the most recent microaggregation methods. Experimental results show that the P-S algorithm incurs less than half the information loss of the latest microaggregation methods in all of the test situations.
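
The general microaggregation step described in the abstract (partition into groups of at least 𝑘 records, then replace each record with its group centroid) can be illustrated with a minimal Python sketch. This is not the P-S algorithm, whose details the abstract does not give. The grouping heuristic here (sort on the first attribute and cut into consecutive chunks of 𝑘) is a deliberately naive stand-in, and the function names `naive_microaggregate` and `information_loss` are hypothetical. Information loss is assumed to be the ratio SSE/SST (within-group squared error over total squared error), a measure commonly used in the microaggregation literature; the abstract itself does not define the measure.

```python
from math import fsum


def centroid(records):
    """Component-wise mean of a non-empty list of equal-length numeric tuples."""
    n = len(records)
    return tuple(fsum(col) / n for col in zip(*records))


def sq_dist(a, b):
    """Squared Euclidean distance between two records."""
    return fsum((x - y) ** 2 for x, y in zip(a, b))


def naive_microaggregate(records, k):
    """Partition records into groups of at least k and mask each with its centroid.

    Grouping heuristic (an assumption for illustration, NOT the P-S method):
    sort record indices by the first attribute and cut into consecutive
    chunks of k; a short final chunk is merged into the previous group.
    """
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    order = sorted(range(len(records)), key=lambda i: records[i][0])
    groups = [order[i:i + k] for i in range(0, len(order), k)]
    if len(groups) > 1 and len(groups[-1]) < k:
        groups[-2].extend(groups.pop())
    masked = [None] * len(records)
    for group in groups:
        c = centroid([records[i] for i in group])
        for i in group:
            masked[i] = c  # every record in the group is replaced by the centroid
    return masked, groups


def information_loss(records, masked):
    """Assumed loss measure: SSE / SST, a value between 0 and 1."""
    center = centroid(records)
    sse = fsum(sq_dist(r, m) for r, m in zip(records, masked))
    sst = fsum(sq_dist(r, center) for r in records)
    return sse / sst if sst else 0.0


if __name__ == "__main__":
    data = [(1.0, 1.0), (2.0, 2.0), (10.0, 10.0), (11.0, 11.0), (12.0, 12.0)]
    masked, groups = naive_microaggregate(data, k=2)
    print(groups)
    print(masked)
    print(information_loss(data, masked))
```

Under this measure, a value near 0 means the masked data stay close to the originals, while a value near 1 means most of the variability has been lost. A method such as P-S aims to choose the groups so that this ratio is as small as possible for a given 𝑘.
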
| Item Type: | Conference or Workshop Item (Commonwealth Reporting Category E) (Paper) |
|---|---|
| Additional Information: | Permanent restricted access to published version due to publisher copyright policy. |
| Uncontrolled Keywords: | privacy; microaggregation; microdata protection; 𝑘-anonymity; disclosure control |
| Fields of Research (FOR2008): | 01 Mathematical Sciences > 0104 Statistics > 010499 Statistics not elsewhere classified; 08 Information and Computing Sciences > 0804 Data Format > 080402 Data Encryption; 08 Information and Computing Sciences > 0803 Computer Software > 080303 Computer System Security |
| Subjects: | UNSPECIFIED |
| Socio-Economic Objective (SEO2008): | E Expanding Knowledge > 97 Expanding Knowledge > 970108 Expanding Knowledge in the Information and Computing Sciences |
| ID Code: | 18230 |
| Deposited By: | |
| Deposited On: | 07 Apr 2011 14:34 |
| Last Modified: | 05 Mar 2012 18:20 |
