Dissertation
Utility-driven approaches to protecting sensitive data
Doctor of Philosophy (Ph.D.), Drexel University
Jun 2024
DOI: https://doi.org/10.17918/00010631
Abstract
Organizations face increasing pressure to protect the privacy of sensitive data. Although privacy-driven approaches to data protection (e.g., k-anonymity and differential privacy) reduce the chances that sensitive information about data subjects will be leaked, these methods often significantly reduce the utility of the data. This dissertation presents work on "utility-driven" approaches to data protection that enable organizations to better maintain the utility of their data while offering competitive privacy protection against reasonable and interpretable privacy attacks. The first paper describes a Bayesian data synthesizer that reduces the probability of identifying anonymous online users while maintaining the relationships between textual content and structured data covariates. The second paper illustrates a machine learning-based swapping method tailored to maintain the features of time series (e.g., trend and autocorrelation) while protecting the series from reidentification. We demonstrate the challenges in achieving accurate forecasts while maintaining privacy, and show that accurate forecasts themselves can be used to reidentify time series. The third paper proposes an automated data synthesizer that is optimized to achieve predefined privacy and utility metrics, which significantly reduces the barriers to implementation for organizations. The results show that this synthesizer produces legally anonymous synthetic data that is competitive with commercial offerings on utility and privacy metrics. Together, these papers provide empirical evidence on the strengths and weaknesses of utility-driven privacy methods and motivate further study of modern privacy risks and accessible implementations of privacy-preserving methodologies.
Details
- Title: Utility-driven approaches to protecting sensitive data
- Creators: Cameron D. Bale
- Contributors: Matthew Schneider (Advisor)
- Awarding Institution: Drexel University
- Degree Awarded: Doctor of Philosophy (Ph.D.)
- Publisher: Drexel University; Philadelphia, Pennsylvania
- Number of pages: xvi, 190 pages
- Resource Type: Dissertation
- Language: English
- Academic Unit: Decision Sciences (and Management Information Systems); Bennett S. LeBow College of Business; Drexel University
- Other Identifier: 991021882415204721