Paper Title

Perturbing Inputs to Prevent Model Stealing

Paper Authors

Grana, Justin

Paper Abstract

We show how perturbing inputs to a machine learning service (ML-service) deployed in the cloud can protect against model stealing attacks. In our formulation, there is an ML-service that receives inputs from users and returns the output of the model. There is an attacker that is interested in learning the parameters of the ML-service. We use the linear and logistic regression models to illustrate how strategically adding noise to the inputs fundamentally alters the attacker's estimation problem. We show that even with infinite samples, the attacker would not be able to recover the true model parameters. We focus on characterizing the trade-off between the error in the attacker's estimate of the parameters and the error in the ML-service's output.
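
To make the mechanism concrete, the following is a minimal Python sketch of the logistic regression case, under assumptions of our own rather than the paper's: the hypothetical noisy_service adds zero-mean Gaussian noise (scale noise_sd) to each query before evaluating its secret model, and the attacker fits parameters by gradient descent on a large set of query/response pairs. The names and settings are illustrative only; they show how input noise changes the attacker's estimation problem.

    import numpy as np

    rng = np.random.default_rng(0)

    # Secret parameters held by the ML-service (logistic regression).
    w_true = np.array([2.0, -1.0])

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def noisy_service(X, noise_sd=1.0):
        # Hypothetical service: perturb each input with Gaussian noise,
        # then return the model's output on the perturbed input.
        X_tilde = X + rng.normal(0.0, noise_sd, size=X.shape)
        return sigmoid(X_tilde @ w_true)

    # The attacker queries the service with many known inputs ...
    n = 100_000
    X = rng.normal(size=(n, 2))
    y = noisy_service(X)  # observed outputs lie in (0, 1)

    # ... and fits parameters by gradient descent on the cross-entropy
    # loss, using the observed probabilities y as soft targets.
    w_hat = np.zeros(2)
    for _ in range(1500):
        p = sigmoid(X @ w_hat)
        grad = ((p - y)[:, None] * X).mean(axis=0)  # d(loss)/dw
        w_hat -= 1.0 * grad

    print("true parameters:    ", w_true)  # [ 2. -1.]
    print("attacker's estimate:", w_hat)   # shrunk toward zero, not w_true

In this sketch, adding more queries does not help the attacker: the noise enters inside the nonlinearity, so the fit converges to an attenuated parameter vector rather than to w_true. The cost to legitimate users appears as the gap between the noisy outputs y and the noise-free outputs sigmoid(X @ w_true), which is one way to read the trade-off the abstract describes.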
