Full Citation
Title: Privacy-preserving and Usable Data Publishing and Analysis
Citation Type: Dissertation/Thesis
Publication Year: 2012
ISBN:
ISSN:
DOI:
NSFID:
PMCID:
PMID:
Abstract: In the current digital world, data is becoming an increasingly valuable resource and the demand for sharing or releasing data has never been higher. Organizations need to make available versions of the data they collected for business or legal reasons, and at the same time they are under strong obligation to protect sensitive information about individuals represented in the dataset. This has motivated fruitful research on data privacy over the past decade, and various models have been proposed to address the problem of privacy-preserving data analysis. Initial efforts to ensure privacy of released data are based on syntactic definitions such as k-anonymity, while subsequent efforts like differential privacy try to provide a more semantic guarantee. In this thesis we contribute to the research of data privacy from several perspectives. First, we address the issue of data usability by proposing a data model to work with anonymized data. This is based on the observation that data anonymized by syntactic models does not fit naturally within the relational model. The data model we propose, called LICM, is able to succinctly represent and query anonymized data and, in general, any uncertain data with cardinality constraints. Second, we study the application of differential privacy to two important data management tasks: releasing spatial data (e.g., GPS coordinates) and mining frequent subgraph patterns from a graph database. These two pieces of work contribute to the research on differential privacy by extending the data types that can be handled by differential privacy from tabular data to location and graph data, which have become more significant with advances in mobile computing and social networks. Finally, after addressing both syntactic models and differential privacy, we propose a unifying platform to study and compare the empirical privacy-utility trade-off in various privacy models. We propose metrics of empirical privacy and empirical utility and find that, in practice, the difference between differential privacy and early syntactic models is less dramatic than previously thought.
Url: https://repository.lib.ncsu.edu/bitstream/handle/1840.16/8605/etd.pdf?sequence=2
User Submitted?: No
Authors: Shen, Entong
Institution: North Carolina State University
Department:
Advisor:
Degree: Doctor of Philosophy
Publisher Location:
Pages: 167
Data Collections: IPUMS USA
Topics: Population Data Science
Countries: United States