BIBLIOGRAPHY

Publications, working papers, and other research using data resources from IPUMS.

Full Citation

Title: The cost of quality: Implementing generalization and suppression for anonymizing biomedical data with minimal information loss

Citation Type: Journal Article

Publication Year: 2015

Abstract: With the ARX data anonymization tool, structured biomedical data can be de-identified using syntactic privacy models, such as k-anonymity. Data is transformed with two methods: (a) generalization of attribute values, followed by (b) suppression of data records. The former method results in data that is well suited for analyses by epidemiologists, while the latter method significantly reduces loss of information. Our tool uses an optimal anonymization algorithm that maximizes output utility according to a given measure. To achieve scalability, existing optimal anonymization algorithms exclude parts of the search space by predicting the outcome of data transformations regarding privacy and utility without explicitly applying them to the input dataset. These optimizations cannot be used if data is transformed with both generalization and suppression. As optimal data utility and scalability are important for anonymizing biomedical data, we had to develop a novel method.
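Illustration (not part of the original record): the two-step transformation named in the abstract, generalization followed by record suppression, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions and is not the ARX implementation or the paper's search algorithm. The quasi-identifiers (age and ZIP code), the generalization hierarchies, and the function names `generalize_age`, `generalize_zip`, and `anonymize` are all hypothetical choices made for this example.

```python
from collections import Counter


def generalize_age(age, level):
    """Hypothetical hierarchy: level 0 keeps the exact age;
    level n maps it to a bin that is 5 * 2**(n - 1) years wide."""
    if level == 0:
        return str(age)
    width = 5 * 2 ** (level - 1)
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def generalize_zip(zip_code, level):
    """Hypothetical hierarchy: mask the last `level` characters with '*'."""
    keep = max(len(zip_code) - level, 0)
    return zip_code[:keep] + "*" * (len(zip_code) - keep)


def anonymize(records, k, age_level, zip_level):
    """Apply (a) generalization, then (b) suppression of every record whose
    generalized quasi-identifier combination occurs fewer than k times.
    Returns the k-anonymous records and the number of suppressed records."""
    generalized = [
        (generalize_age(age, age_level), generalize_zip(zip_code, zip_level))
        for age, zip_code in records
    ]
    class_sizes = Counter(generalized)
    kept = [row for row in generalized if class_sizes[row] >= k]
    return kept, len(generalized) - len(kept)


# Made-up example data: (age, ZIP code) pairs.
records = [
    (34, "81667"), (36, "81675"), (31, "81669"),
    (52, "80331"), (58, "80339"), (47, "90210"),
]
kept, suppressed = anonymize(records, k=2, age_level=2, zip_level=2)
```

In this sketch, suppressing the single outlier record lets the remaining records stay at a moderate generalization level, which is the trade-off the abstract refers to. Searching over generalization levels for the transformation with the best utility is not shown here.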

Url: http://www.sciencedirect.com/science/article/pii/S1532046415002002

User Submitted?: No

Authors: Kohlmayer, Florian; Prasser, Fabian; Kuhn, Klaus A

Periodical (Full): Journal of Biomedical Informatics

Issue:

Volume: 58

Pages: 37-48

Data Collections: IPUMS Time Use - ATUS, IPUMS Health Surveys - NHIS

Topics: Health, Methodology and Data Collection

Countries: United States