BIBLIOGRAPHY

Publications, working papers, and other research using data resources from IPUMS.

Full Citation

Title: Differentially Private M-band Wavelet-Based Mechanisms in Machine Learning Environments

Citation Type: Miscellaneous

Publication Year: 2019

Abstract: In the post-industrial world, data science and analytics have gained paramount importance regarding digital data privacy. Improper methods of establishing privacy for accessible datasets can compromise large amounts of user data even if the adversary has a small amount of preliminary knowledge of a user. Many researchers have been developing high-level privacy-preserving mechanisms that also retain the statistical integrity of the data to apply to machine learning. Recent developments of differential privacy, such as those in [11], [16], [17], [25], [34], and [35], drastically decrease the probability that an adversary can distinguish the elements in a dataset and thus extract user information. In this paper, we develop three privacy-preserving mechanisms with the discrete M-band wavelet transform that embed noise into data. The first two methods (LS and LS+) add noise through a "Laplace-Sigmoid" distribution that multiplies Laplace-distributed values with the sigmoid function, and the third method utilizes pseudo-quantum steganography to embed noise into the data. We then show that our mechanisms successfully retain both differential privacy and learnability through statistical analysis in various machine learning environments.

Highlights: In this paper, we create three different input perturbation stochastic mechanisms that add or embed noise to sensitive datasets. Our mechanisms improve upon traditional noise addition methods, such as the Laplace mechanism and exponential mechanism mentioned in [11], by using the discrete M-band wavelet transform (DMWT) to convert the dataset into a wavelet domain before adding noise. For the first two mechanisms, we combine the Laplace distribution and the sigmoid function to create a complex stochastic function, and we optimize the mechanisms based on the size of the dataset. In the third mechanism, we propose the use of pseudo-quantum steganography to embed noise into a dataset.
Due to the nature of the quantum signal, the noisy dataset has an extremely low probability of being correctly denoised by an adversary. While our proposed mechanisms preserve ε-differential privacy, they also maintain the statistical integrity of the datasets. Using five different supervised machine learning environments (logistic regression, support vector machine, support vector regression, classical artificial neural networks, and deep learning), the mechanisms achieve high accuracies in binary classification across multiple datasets. Moreover, our (pseudo-)quantum mechanism is one of the first to use higher computational power to add noise to private data. As data privacy becomes an extremely important issue in our world, and as quantum computing emerges as a major field, our research can link the two branches and shine a light on what data privacy could potentially look like in the future.
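The LS mechanism described above (transform the data into a wavelet domain, add Laplace-distributed noise modulated by the sigmoid function, then invert the transform) can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: it substitutes a one-level 2-band (Haar) transform for the paper's M-band DMWT, and the exact way the sigmoid modulates the Laplace draws (here, the sigmoid of an independent standard-normal sample) is an assumption.

```python
import numpy as np

def sigmoid(x):
    """Standard logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-x))

def haar_forward(x):
    """One-level 2-band (Haar) analysis; a stand-in for the paper's M-band DMWT."""
    a = (x[0::2] + x[1::2]) / np.sqrt(2)   # approximation coefficients
    d = (x[0::2] - x[1::2]) / np.sqrt(2)   # detail coefficients
    return a, d

def haar_inverse(a, d):
    """Perfect reconstruction from the Haar coefficient pair."""
    x = np.empty(2 * a.size)
    x[0::2] = (a + d) / np.sqrt(2)
    x[1::2] = (a - d) / np.sqrt(2)
    return x

def ls_perturb(x, scale=0.5, rng=None):
    """Sketch of an LS-style mechanism: transform, add Laplace*sigmoid
    noise in the wavelet domain, then invert back to the data domain.

    `scale` plays the role of the Laplace scale parameter, which in a
    differentially private deployment would be calibrated to the query
    sensitivity and the privacy budget epsilon.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    a, d = haar_forward(np.asarray(x, dtype=float))

    def noise(shape):
        # Laplace draws modulated by a sigmoid of an independent normal draw
        # (assumed form of the "Laplace-Sigmoid" distribution).
        return rng.laplace(0.0, scale, shape) * sigmoid(rng.standard_normal(shape))

    return haar_inverse(a + noise(a.shape), d + noise(d.shape))
```

Because the Haar pair gives perfect reconstruction, any difference between `ls_perturb(x)` and `x` comes entirely from the injected noise, so the noise scale directly controls the privacy-utility trade-off the abstract evaluates.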

Url: https://arxiv.org/abs/2001.00012

User Submitted?: No

Authors: Choi, Kenneth; Lee, Tony

Publisher:

Data Collections: IPUMS USA

Topics: Other

Countries:
