Paper Title


Skellam Mixture Mechanism: a Novel Approach to Federated Learning with Differential Privacy

Authors

Ergute Bao, Yizheng Zhu, Xiaokui Xiao, Yin Yang, Beng Chin Ooi, Benjamin Hong Meng Tan, Khin Mi Mi Aung

Abstract


Deep neural networks have strong capabilities of memorizing the underlying training data, which can be a serious privacy concern. An effective solution to this problem is to train models with differential privacy, which provides rigorous privacy guarantees by injecting random noise into the gradients. This paper focuses on the scenario where sensitive data are distributed among multiple participants, who jointly train a model through federated learning (FL), using both secure multiparty computation (MPC) to ensure the confidentiality of each gradient update, and differential privacy to avoid data leakage in the resulting model. A major challenge in this setting is that common mechanisms for enforcing DP in deep learning, which inject real-valued noise, are fundamentally incompatible with MPC, which exchanges finite-field integers among the participants. Consequently, most existing DP mechanisms require rather high noise levels, leading to poor model utility. Motivated by this, we propose the Skellam mixture mechanism (SMM), an approach to enforcing DP on models built via FL. Compared to existing methods, SMM eliminates the assumption that the input gradients must be integer-valued and thus reduces the amount of noise injected to preserve DP. Further, SMM allows tight privacy accounting due to the nice composition and sub-sampling properties of the Skellam distribution, which are key to accurate deep learning with DP. The theoretical analysis of SMM is highly non-trivial, especially considering (i) the complicated math of differentially private deep learning in general and (ii) the fact that the mixture of two Skellam distributions is rather complex and, to our knowledge, has not been studied in the DP literature. Extensive experiments in various practical settings demonstrate that SMM consistently and significantly outperforms existing solutions in terms of the utility of the resulting model.
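To make the core idea concrete, the following is a minimal illustrative sketch (not the paper's actual SMM algorithm) of how integer-valued noise can be combined with gradient quantization for MPC-compatible DP. A Skellam random variable is the difference of two independent Poisson variables, so it is integer-valued by construction; and if a real-valued gradient is stochastically rounded before noise addition, the per-coordinate perturbation becomes a mixture of two shifted Skellam distributions, which is the kind of object the abstract refers to. All function names and parameters here (`skellam_noise`, `privatize_gradient`, `scale`, `mu`) are assumptions chosen for illustration.

```python
import numpy as np


def skellam_noise(mu, shape, rng):
    # A Skellam(mu, mu) sample is the difference of two independent
    # Poisson(mu) samples: integer-valued, mean 0, variance 2 * mu.
    return rng.poisson(mu, shape) - rng.poisson(mu, shape)


def stochastic_round(x, rng):
    # Round each entry to floor(x) or floor(x) + 1, choosing the upper
    # value with probability equal to the fractional part, so the
    # rounding is unbiased. (Hypothetical helper, for illustration.)
    floor = np.floor(x)
    return (floor + (rng.random(x.shape) < (x - floor))).astype(np.int64)


def privatize_gradient(grad, scale, clip_norm, mu, rng):
    # Clip the gradient to bound its L2 sensitivity.
    norm = max(np.linalg.norm(grad), 1e-12)
    clipped = grad * min(1.0, clip_norm / norm)
    # Scale and stochastically round to integers, as required for
    # secure aggregation over a finite field.
    quantized = stochastic_round(clipped * scale, rng)
    # Add integer-valued Skellam noise; the combined effect of the
    # random rounding and the Skellam noise on each coordinate is a
    # mixture of two shifted Skellam distributions.
    return quantized + skellam_noise(mu, quantized.shape, rng)
```

The resulting vector is integer-valued end to end, so it can be summed securely by the MPC participants without any real-valued arithmetic; calibrating `mu` against the clipping norm and scale to achieve a target privacy budget is exactly the non-trivial accounting problem the paper analyzes.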
