Full Citation
Title: Bayesian Non-parametric Generation of Fully Synthetic Multivariate Categorical Data in the Presence of Structural Zeros
Citation Type: Miscellaneous
Publication Year: 2016
ISBN:
ISSN:
DOI:
NSFID:
PMCID:
PMID:
Abstract: Statistical agencies are increasingly adopting synthetic data methods for disseminating microdata without compromising the privacy of respondents. Synthesis models for multivariate categorical data need to preserve complex multivariate structures, including impossible combinations of responses, also known as structural zeros. Here we propose the use of a Bayesian non-parametric method for generating discrete multivariate synthetic data subject to structural zeros. This method can preserve complex multivariate relationships between variables; can be applied to high-dimensional datasets with massive collections of structural zeros; requires minimal tuning from the user; and is computationally efficient. We demonstrate our approach by synthesizing an extract of 17 variables from the 2000 U.S. Census. Our method produces synthetic samples with high analytic utility and low disclosure risk.
Url: http://mypage.iu.edu/~dmanriqu/papers/LCM_Zeros_Synth.pdf
User Submitted?: No
Authors: Manrique-Vallier, Daniel; Hu, Jingchen
Publisher: Indiana University
Data Collections: IPUMS USA
Topics: Methodology and Data Collection
Countries: