2024 Symposium Posters

Posters > 2024

Eureka: A General Framework for Black-box Differential Privacy Estimators


PDF

Primary Investigator:
Vassilis Zikas

Project Members
Yu Wei, Vassilis Zikas
Abstract
Differential privacy (DP) is a key tool in privacy-preserving data analysis. Yet it remains challenging for non-privacy-experts to prove the DP of their algorithms. We propose a methodology for domain experts with limited data privacy backgrounds to empirically estimate the privacy of an arbitrary mechanism. Our Eureka moment is a new link---which we prove---between the problems of DP parameter-estimation and Bayes optimal classifiers in ML, which we believe can be of independent interest. Our estimator uses this link to achieve two desirable properties: (1) black-box, i.e., it does not require knowledge of the underlying mechanism, and (2) it has a theoretically-proven accuracy, depending on the underlying classifier used, allowing plug-and-play use of different classifiers. More concretely, motivated by the impossibility of the above task for unrestricted input domains (which we prove), we introduce a natural, application-inspired relaxation of DP which we term relative DP. Intuitively, relative DP defines a mechanism's privacy relative to an input set T, circumventing the above impossibility when T is finite. Importantly, it preserves the key intuitive privacy guarantee of DP while enjoying a number of desirable DP properties---scalability, composition, and robustness to post-processing. We then devise a black-box poly-time $(\epsilon,\delta)$-relative DP estimator for any poly-size T---the first privacy estimator to support mechanisms with large {\em output} spaces while having tight accuracy bounds. As a result of independent interest, we generalize our theory to develop the first Distributional Differential Privacy (DDP) estimator. We benchmark our estimator in a proof-of-concept implementation. First, using kNN as the classifier we show that our method (1) produces a tight, analytically computed $(\epsilon, \delta)$-DP trade-off of low-dimensional Laplace and Gaussian mechanisms---the first to do so, (2) accurately estimates the privacy spectrum of DDP mechanisms, and (3) can verify a DP mechanism's implementations, e.g., Sparse Vector Technique, Noisy Histogram, and Noisy max. Our implementation and experiments demonstrate the potential of our framework, and highlight its computational bottlenecks in estimating DP, e.g., in terms of the size of $\delta$ and the data dimensionality. Our second, neural-network-based instantiation makes a first step in showing that our method can be extended to mechanisms with high-dimensional outputs.