Full Citation
Title: Differentially Private Synthetic Data Generation via GANs
Citation Type: Miscellaneous
Publication Year: 2018
ISBN:
ISSN:
DOI:
NSFID:
PMCID:
PMID:
Abstract: Generating synthetic data is an attractive method for conducting private analysis of a sensitive dataset. It allows analysts to run their own non-private algorithms on the synthetic dataset without having to pre-specify the analyses they wish to perform. Further, both the dataset and any statistical results can be freely disseminated without incurring additional privacy loss. The goal of synthetic data generation is to create data that will perform similarly to the original dataset for many analysis tasks. In this working paper, we propose using a Differentially Private Generative Adversarial Network (DP-GAN) to generate private synthetic data. DP-GANs are a variant of Generative Adversarial Networks that are trained privately. GANs were first proposed by Goodfellow et al. [4], and there has since been a tremendous amount of research employing GANs to generate synthetic data. DP-GANs have recently been used for privately generating clinical trial data [2] and image datasets [8, 9]. We build off of previous work on DP-GANs and add further optimizations to enhance performance on a wide variety of data types and analysis tasks. We also propose empirical validation of our algorithm’s performance as future work.
Url: https://www.chriswaites.com/paper.pdf
User Submitted?: No
Authors: Boob, Digvijay; Cummings, Rachel; Kimpara, Dhamma; Tantipongpipat, Uthaipon; Waites, Chris; Zimmerman, Kyle
Publisher: Georgia Institute of Technology
Data Collections: IPUMS USA
Topics: Methodology and Data Collection
Countries: United States