Full Citation
Title: DPWeka: Achieving Differential Privacy in WEKA
Citation Type: Dissertation/Thesis
Publication Year: 2017
ISBN:
ISSN:
DOI:
NSFID:
PMCID:
PMID:
Abstract: Organizations in the government, commercial, and non-profit sectors collect and store large amounts of sensitive data, including medical, financial, and personal information. They use data mining methods to formulate business strategies that yield both long-term and short-term financial benefits. When such data are analyzed, the private information of the individuals represented in them must be protected for moral and legal reasons. Current practices such as redacting sensitive attributes, releasing only aggregate values, and query auditing do not provide sufficient protection against an adversary armed with auxiliary information. Differential privacy, a privacy protection framework, provides mathematical guarantees against adversarial attacks even in the presence of such background information. Existing platforms for differential privacy employ specific mechanisms for limited applications of data mining. Additionally, widely used data mining tools do not contain differentially private data mining algorithms. As a result, awareness of differentially private methods for analyzing sensitive data is currently limited outside the research community. This thesis examines various mechanisms to realize differential privacy in practice and investigates methods to integrate them with a popular machine learning toolkit, WEKA. We present DPWeka, a package that provides differential privacy capabilities to WEKA for practical data mining. DPWeka includes a suite of differential-privacy-preserving algorithms that support a variety of data mining tasks, including attribute selection and regression analysis. It lets users control privacy and model parameters, such as the privacy mechanism, the privacy budget, and other algorithm-specific variables. We evaluate the private algorithms on real-world datasets, such as genetic data and census data, to demonstrate the practical applicability of DPWeka.
Url: http://search.proquest.com/docview/1899859024?pq-origsite=gscholar
User Submitted?: No
Authors: Katla, Srinidhi
Institution: University of Arkansas
Department: Computer Science
Advisor: Xintao Wu
Degree: Master of Science
Publisher Location: Fayetteville, AR
Pages:
Data Collections: IPUMS USA
Topics: Other
Countries: