Paper title

Minimax optimal goodness-of-fit testing for densities and multinomials under a local differential privacy constraint

Authors

Joseph Lam-Weil, Béatrice Laurent, Jean-Michel Loubes

Abstract

Finding anonymization mechanisms to protect personal data is at the heart of recent machine learning research. Here, we consider the consequences of local differential privacy constraints on goodness-of-fit testing, i.e. the statistical problem assessing whether sample points are generated from a fixed density $f_0$, or not. The observations are kept hidden and replaced by a stochastic transformation satisfying the local differential privacy constraint. In this setting, we propose a testing procedure which is based on an estimation of the quadratic distance between the density $f$ of the unobserved samples and $f_0$. We establish an upper bound on the separation distance associated with this test, and a matching lower bound on the minimax separation rates of testing under non-interactive privacy in the case that $f_0$ is uniform, in discrete and continuous settings. To the best of our knowledge, we provide the first minimax optimal test and associated private transformation under a local differential privacy constraint over Besov balls in the continuous setting, quantifying the price to pay for data privacy. We also present a test that is adaptive to the smoothness parameter of the unknown density and remains minimax optimal up to a logarithmic factor. Finally, we note that our results can be translated to the discrete case, where the treatment of probability vectors is shown to be equivalent to that of piecewise constant densities in our setting. That is why we work with a unified setting for both the continuous and the discrete cases.
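
To make the idea in the abstract concrete, here is a minimal sketch of a non-interactive private test for uniformity on [0, 1). It is an illustration under stated assumptions, not the authors' exact procedure: each sample is replaced by a one-hot histogram indicator perturbed with Laplace noise (the standard Laplace mechanism, with scale 2/alpha because two one-hot vectors differ by at most 2 in L1 norm), and the squared distance between the bin probabilities of $f$ and $f_0$ is estimated with a U-statistic. The function names `privatize` and `quadratic_statistic`, the bin count, and the privacy level `alpha` are all hypothetical choices made for this example.

```python
import numpy as np


def privatize(x, n_bins, alpha, rng):
    """Release a noisy one-hot bin indicator for each sample in [0, 1).

    Assumption: Laplace mechanism with scale 2 / alpha, which gives
    alpha-local differential privacy for one-hot vectors.
    """
    x = np.asarray(x, dtype=float)
    # Map each sample to its bin index; clip guards against x == 1.0.
    bins = np.minimum((x * n_bins).astype(int), n_bins - 1)
    one_hot = np.eye(n_bins)[bins]
    noise = rng.laplace(loc=0.0, scale=2.0 / alpha, size=one_hot.shape)
    return one_hot + noise


def quadratic_statistic(z, p0):
    """U-statistic estimate of sum_j (p_j - p0_j)^2 from privatized rows z.

    Uses the identity sum_{i != k} <d_i, d_k> = |sum_i d_i|^2 - sum_i |d_i|^2
    with d_i = z_i - p0, so the zero-mean noise does not bias the estimate.
    """
    n = z.shape[0]
    d = z - p0
    col_sums = d.sum(axis=0)
    return (col_sums @ col_sums - np.sum(d * d)) / (n * (n - 1))


# Usage: privatized samples from the uniform density, tested against uniform.
rng = np.random.default_rng(0)
samples = rng.uniform(size=2000)
n_bins, alpha = 8, 1.0
z = privatize(samples, n_bins, alpha, rng)
p0 = np.full(n_bins, 1.0 / n_bins)  # bin probabilities under uniform f_0
statistic = quadratic_statistic(z, p0)
print(statistic)
```

A complete test would reject when the statistic exceeds a threshold; that threshold is not derived here and could, for instance, be calibrated by simulating the privatized statistic under $f_0$. The sketch also fixes the number of bins by hand, whereas the abstract describes an adaptive test that does not require knowing the smoothness of the density.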
