Paper Title

Membership Inference Attacks and Defenses in Neural Network Pruning

Paper Authors

Xiaoyong Yuan, Lan Zhang

Paper Abstract

Neural network pruning has been an essential technique for reducing the computation and memory requirements of deploying deep neural networks on resource-constrained devices. Most existing research focuses primarily on balancing the sparsity and accuracy of a pruned neural network by strategically removing insignificant parameters and retraining the pruned model. Such reuse of training samples poses serious privacy risks due to increased memorization, which, however, has not been investigated yet. In this paper, we conduct the first analysis of privacy risks in neural network pruning. Specifically, we investigate the impact of neural network pruning on training data privacy, i.e., membership inference attacks. We first explore the impact of neural network pruning on prediction divergence, where the pruning process disproportionately affects the pruned model's behavior for members and non-members. Moreover, this divergence varies among classes in a fine-grained manner. Motivated by this divergence, we propose a self-attention membership inference attack against pruned neural networks. Extensive experiments are conducted to rigorously evaluate the privacy impacts of different pruning approaches, sparsity levels, and levels of adversary knowledge. The proposed attack achieves higher attack performance on pruned models than eight existing membership inference attacks. In addition, we propose a new defense mechanism that protects the pruning process by mitigating the prediction divergence based on KL-divergence distance; experiments demonstrate that it effectively mitigates the privacy risks while maintaining the sparsity and accuracy of the pruned models.
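The defense described in the abstract works by shrinking the member/non-member prediction divergence that pruning introduces, using a KL-divergence distance. Below is a minimal illustrative sketch (PyTorch assumed) of one way such a defense could enter the retraining step: a KL term keeps the pruned model's predictive distribution close to the original, unpruned model's. The exact objective, the `alpha` weight, and the function name `defended_retraining_loss` are assumptions based only on the abstract, not the paper's actual formulation.

```python
import torch.nn.functional as F

def defended_retraining_loss(pruned_logits, original_logits, labels, alpha=1.0):
    """Illustrative retraining loss: task cross-entropy plus a KL penalty
    that limits how far the pruned model's predictions drift from the
    original model's, shrinking the divergence that membership inference
    attacks exploit. (Sketch based on the abstract; `alpha` is hypothetical.)
    """
    ce = F.cross_entropy(pruned_logits, labels)
    # F.kl_div(input, target) computes KL(target || input), with `input`
    # given as log-probabilities, so this term is KL(p_original || p_pruned).
    kl = F.kl_div(
        F.log_softmax(pruned_logits, dim=1),
        F.softmax(original_logits.detach(), dim=1),
        reduction="batchmean",
    )
    return ce + alpha * kl
```

In use, `original_logits` would come from a frozen copy of the pre-pruned model evaluated on the same batch during the retraining phase of pruning; a larger `alpha` trades some accuracy for a smaller prediction divergence, and hence less signal for a membership inference adversary.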
