# Test without Trust: Optimal Locally Private Distribution Testing

@inproceedings{Acharya2019TestWT, title={Test without Trust: Optimal Locally Private Distribution Testing}, author={Jayadev Acharya and Cl{\'e}ment L. Canonne and Cody R. Freitag and Himanshu Tyagi}, booktitle={AISTATS}, year={2019} }

We study the problem of distribution testing when the samples can only be accessed using a locally differentially private mechanism and focus on two representative testing questions of identity (goodness-of-fit) and independence testing for discrete distributions. First, we construct tests that use existing, general-purpose locally differentially private mechanisms such as the popular RAPPOR or the recently introduced Hadamard Response for collecting data and show that our proposed tests are… Expand

#### Tables and Topics from this paper

#### 44 Citations

Differentially Private Testing of Identity and Closeness of Discrete Distributions

- Computer Science, Mathematics
- NeurIPS
- 2018

The fundamental problems of identity testing (goodness of fit), and closeness testing (two sample test) of distributions over $k$ elements, under differential privacy are studied, and Le Cam's two point theorem is used to provide a general mechanism for proving lower bounds. Expand

Communication-Constrained Inference and the Role of Shared Randomness

- Computer Science
- ICML
- 2019

This work proposes a public-coin protocol that outperforms simulate-and-infer for distribution testing and is, in fact, sample-optimal for distribution learning and examines the role of shared randomness as a resource. Expand

Inference Under Information Constraints II: Communication Constraints and Shared Randomness

- Computer Science, Mathematics
- IEEE Transactions on Information Theory
- 2020

This work proposes a general-purpose simulate-and-infer strategy that uses only private-coin communication protocols and is sample-optimal for distribution learning, and proposes a public-coin protocol that outperforms simulate- and-Infer for distribution testing and is, in fact, sample- optimal. Expand

The structure of optimal private tests for simple hypotheses

- Computer Science, Mathematics
- STOC
- 2019

Hypothesis testing plays a central role in statistical inference, and is used in many settings where privacy concerns are paramount. This work answers a basic question about privately testing simple… Expand

Inference Under Information Constraints III: Local Privacy Constraints

- Computer Science, Mathematics
- IEEE Journal on Selected Areas in Information Theory
- 2021

It is shown that the availability of shared (public) randomness greatly reduces the sample complexity and under the notion of local differential privacy, simple, sample-optimal, and communication-efficient protocols are proposed for these two questions in the noninteractive setting. Expand

Goodness-of-fit testing for H\"older continuous densities under local differential privacy

- Mathematics
- 2021

We address the problem of goodness-of-fit testing for Hölder continuous densities under local differential privacy constraints. We study minimax separation rates when only noninteractive privacy… Expand

Locally private non-asymptotic testing of discrete distributions is faster using interactive mechanisms

- Mathematics, Computer Science
- NeurIPS
- 2020

It is proved that general information theoretical bounds that allow us to establish the optimality of the authors' algorithms among all pairs of privacy mechanisms and test procedures, in most usual cases, are correct. Expand

Private Testing of Distributions via Sample Permutations

- Computer Science
- NeurIPS
- 2019

The framework of property testing is used to design algorithms to test the properties of the distribution that the data is drawn from with respect to differential privacy, which indicates that differential privacy can be obtained in most regimes of parameters for free. Expand

Successive Refinement of Privacy

- Computer Science, Mathematics
- IEEE Journal on Selected Areas in Information Theory
- 2020

This work provides (order-wise) tight characterizations of privacy-utility-randomness trade-offs in several cases for distribution estimation, including the standard LDP setting under a randomness constraint, and provides a non-trivial privacy mechanism for multi-level privacy. Expand

Private Two-Terminal Hypothesis Testing

- Computer Science, Mathematics
- 2020 IEEE International Symposium on Information Theory (ISIT)
- 2020

It is shown that, in general, meaningful correctness and privacy cannot be achieved if the users do not have access to correlated (but, not common) randomness, and the optimal correctness andPrivacy error exponents are characterized when the users has access to non-trivial correlated randomness. Expand

#### References

SHOWING 1-10 OF 43 REFERENCES

Locally Private Hypothesis Testing

- Computer Science
- ICML
- 2018

This work discusses the randomized-response mechanism and shows that, in essence, it maps the null- and alternative-hypotheses onto new sets, an affine translation of the original sets, and gives sample complexity bounds for identity and independence testing under randomized- response. Expand

Differentially Private Testing of Identity and Closeness of Discrete Distributions

- Computer Science, Mathematics
- NeurIPS
- 2018

The fundamental problems of identity testing (goodness of fit), and closeness testing (two sample test) of distributions over $k$ elements, under differential privacy are studied, and Le Cam's two point theorem is used to provide a general mechanism for proving lower bounds. Expand

Revisiting Differentially Private Hypothesis Tests for Categorical Data

- Computer Science
- 2015

A modified equivalence between chi-squared tests and likelihood ratio tests is shown, more suited to hypothesis testing with privacy, and differentially private likelihood ratio and chi-Squared tests for a variety of applications on tabular data are developed. Expand

Local Private Hypothesis Testing: Chi-Square Tests

- Mathematics, Computer Science
- ICML
- 2018

This work analyzes locally private chi-square tests for goodness of fit and independence testing, which have been studied in the traditional, curator model for differential privacy, to explore the design of private hypothesis tests in the local model. Expand

Differentially Private Chi-Squared Hypothesis Testing: Goodness of Fit and Independence Testing

- Psychology, Mathematics
- ICML 2016
- 2016

Hypothesis testing is a useful statistical tool in determining whether a given model should be rejected based on a sample from the population. Sample data may contain sensitive information about… Expand

Differentially Private Identity and Closeness Testing of Discrete Distributions

- Computer Science, Mathematics
- ArXiv
- 2017

The theoretical results show that there exist private identity and closeness testers that are nearly as sample-efficient as their non-private counterparts, and an experimental evaluation of the algorithms on synthetic data shows that they achieve small type I and type II errors with sample size sublinear in the domain size of the underlying distributions. Expand

Minimax Optimal Procedures for Locally Private Estimation

- Computer Science, Mathematics
- ArXiv
- 2016

Private versions of classical information-theoretical bounds, in particular those due to Le Cam, Fano, and Assouad, are developed to allow for a precise characterization of statistical rates under local privacy constraints and the development of provably (minimax) optimal estimation procedures. Expand

RAPPOR: Randomized Aggregatable Privacy-Preserving Ordinal Response

- Computer Science
- CCS
- 2014

This paper describes and motivates RAPPOR, details its differential-privacy and utility guarantees, discusses its practical deployment and properties in the face of different attack models, and gives results of its application to both synthetic and real-world data. Expand

A New Approach for Testing Properties of Discrete Distributions

- Computer Science, Mathematics
- 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS)
- 2016

The sample complexity of the algorithm depends on the structure of the unknown distributions - as opposed to merely their domain size - and is significantly better compared to the worst-case optimal L1-tester in many natural instances. Expand

Discrete Distribution Estimation under Local Privacy

- Mathematics, Computer Science
- ICML
- 2016

New mechanisms are presented, including hashed K-ary Randomized Response (KRR), that empirically meet or exceed the utility of existing mechanisms at all privacy levels and demonstrate the order-optimality of KRR and the existing RAPPOR mechanism at different privacy regimes. Expand