Paper Title

Increasing the Cost of Model Extraction with Calibrated Proof of Work

Authors

Adam Dziedzic, Muhammad Ahmad Kaleem, Yu Shen Lu, Nicolas Papernot

Abstract

In model extraction attacks, adversaries can steal a machine learning model exposed via a public API by repeatedly querying it and adjusting their own model based on obtained predictions. To prevent model stealing, existing defenses focus on detecting malicious queries, truncating, or distorting outputs, thus necessarily introducing a tradeoff between robustness and model utility for legitimate users. Instead, we propose to impede model extraction by requiring users to complete a proof-of-work before they can read the model's predictions. This deters attackers by greatly increasing (even up to 100x) the computational effort needed to leverage query access for model extraction. Since we calibrate the effort required to complete the proof-of-work to each query, this only introduces a slight overhead for regular users (up to 2x). To achieve this, our calibration applies tools from differential privacy to measure the information revealed by a query. Our method requires no modification of the victim model and can be applied by machine learning practitioners to guard their publicly exposed models against being easily stolen.
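
As a rough illustration of the mechanism described above, the sketch below gates each prediction behind a hashcash-style SHA-256 puzzle whose difficulty grows with an estimated per-query information cost. The difficulty mapping, the `privacy_cost` value, and all function names are illustrative assumptions; the paper's actual calibration uses differential-privacy tooling to measure the information a query reveals and is not reproduced here.

```python
import hashlib
import secrets

def puzzle_difficulty(privacy_cost: float, base_bits: int = 12, scale: float = 4.0) -> int:
    """Map an estimated per-query information cost to the number of leading
    zero bits the client must find. Higher cost -> harder puzzle.
    (Illustrative mapping only, not the paper's calibration.)"""
    return base_bits + int(scale * privacy_cost)

def make_challenge() -> bytes:
    """Server issues a fresh random challenge for each query."""
    return secrets.token_bytes(16)

def solve(challenge: bytes, bits: int) -> int:
    """Client-side work: brute-force a nonce so that
    SHA-256(challenge || nonce) has `bits` leading zero bits."""
    target = 1 << (256 - bits)
    nonce = 0
    while True:
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce
        nonce += 1

def verify(challenge: bytes, nonce: int, bits: int) -> bool:
    """Server-side check: verification stays cheap regardless of difficulty."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - bits))

# Example flow: queries estimated to reveal more information (typical of
# extraction attacks) receive a larger privacy_cost, so attackers pay far
# more compute per query than regular users.
challenge = make_challenge()
bits = puzzle_difficulty(privacy_cost=1.5)   # hypothetical per-query estimate
nonce = solve(challenge, bits)               # done by the querying client
assert verify(challenge, nonce, bits)        # server then releases the prediction
```

Because verification only requires a single hash, the server's overhead is negligible; the cost asymmetry falls entirely on whoever issues many high-information queries.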
