Full Citation
Title: Breaching Privacy Using Data Mining: Removing Noise from Perturbed Data
Citation Type: Journal Article
Publication Year: 2012
ISBN:
ISSN:
DOI:
NSFID:
PMCID:
PMID:
Abstract: Data perturbation is a sanitization method that helps restrict the disclosure of sensitive information from published data. We present an attack on the privacy of published data that has been sanitized using data perturbation. The attack employs data mining and fusion to remove some noise from the perturbed sensitive values. Our attack is practical: it can be launched by non-expert adversaries having no background knowledge about the perturbed data and no data mining expertise. Moreover, our attack model also allows us to consider informed and expert adversaries having background knowledge and/or expertise in data mining and fusion. Extensive experiments were performed on four databases derived from UCI's Adult and IPUMS census-based data sets sanitized with noise addition that satisfies ε-differential privacy. The experimental results confirm that our attack presents a significant privacy risk to published perturbed data because the majority of the noise can be effectively removed. The results show that a naive adversary is able to remove around 90% of the noise added during perturbation using general-purpose data miners from the Weka software package, and an informed expert adversary is able to remove 91% to 99.93% of the added noise. Interestingly, the higher the targeted privacy, the higher the percentage of noise that can be removed. This suggests that adding more noise does not always increase the real privacy.
User Submitted?: No
Authors: Sramka, Michal
Periodical (Full): Computational Intelligence for Privacy and Security
Issue:
Volume: 394
Pages: 135-157
Data Collections: IPUMS USA
Topics: Other
Countries: