Paper Title

On the Effectiveness of Dataset Watermarking in Adversarial Settings

Paper Authors

Buse Gul Atli Tekgul, N. Asokan

Paper Abstract

In a data-driven world, datasets constitute a significant economic value. Dataset owners who spend time and money to collect and curate the data are incentivized to ensure that their datasets are not used in ways that they did not authorize. When such misuse occurs, dataset owners need technical mechanisms for demonstrating their ownership of the dataset in question. Dataset watermarking provides one approach for ownership demonstration which can, in turn, deter unauthorized use. In this paper, we investigate a recently proposed data provenance method, radioactive data, to assess if it can be used to demonstrate ownership of (image) datasets used to train machine learning (ML) models. The original paper reported that radioactive data is effective in white-box settings. We show that while this is true for large datasets with many classes, it is not as effective for datasets where the number of classes is low $(\leq 30)$ or the number of samples per class is low $(\leq 500)$. We also show that, counter-intuitively, the black-box verification technique is effective for all datasets used in this paper, even when white-box verification is not. Given this observation, we show that the confidence in white-box verification can be improved by using watermarked samples directly during the verification process. We also highlight the need to assess the robustness of radioactive data if it were to be used for ownership demonstration since it is an adversarial setting unlike provenance identification. Compared to dataset watermarking, ML model watermarking has been explored more extensively in recent literature. However, most of the model watermarking techniques can be defeated via model extraction. We show that radioactive data can effectively survive model extraction attacks, which raises the possibility that it can be used for ML model ownership verification robust against model extraction.
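
The abstract contrasts two verification modes for radioactive data: white-box verification, which tests whether the suspect model's classifier weights align with the data owner's secret carrier directions, and black-box verification, which compares the model's loss on marked versus vanilla images. The sketch below is a minimal, hypothetical rendering of both tests, not the paper's exact procedure: the function names, the normal approximation for the cosine similarity under the null hypothesis, and the paired t-test are illustrative assumptions.

```python
# Hedged sketch of radioactive-data verification tests. Names and the
# statistical approximations are illustrative assumptions, not the
# original paper's code.
import numpy as np
from scipy.stats import norm, combine_pvalues, ttest_rel

def whitebox_pvalue(W, U):
    """White-box test: alignment of classifier weights with secret carriers.

    W: (num_classes, d) final-layer weight vectors of the suspect model,
       expressed in the feature space where the carriers were planted.
    U: (num_classes, d) random carrier directions chosen by the data owner.
    Under H0 (the model never saw marked data), the cosine between w_c and
    an independent random direction u_c is approximately N(0, 1/d) for
    large d.
    """
    d = W.shape[1]
    Wn = W / np.linalg.norm(W, axis=1, keepdims=True)
    Un = U / np.linalg.norm(U, axis=1, keepdims=True)
    cos = np.sum(Wn * Un, axis=1)      # per-class alignment scores
    z = cos * np.sqrt(d)               # standardize under H0
    p_class = norm.sf(z)               # one-sided per-class p-values
    _, p = combine_pvalues(p_class, method="fisher")
    return p

def blackbox_pvalue(loss_marked, loss_vanilla):
    """Black-box test: a model trained on marked data should incur lower
    loss on marked images than on their vanilla counterparts.

    loss_marked, loss_vanilla: paired per-image losses obtained by
    querying the suspect model through its prediction API.
    """
    _, p = ttest_rel(loss_marked, loss_vanilla, alternative="less")
    return p
```

One intuition consistent with the abstract's finding on small datasets: with few classes, the white-box test combines few per-class alignment scores, which limits its statistical power, whereas the black-box loss comparison aggregates over many individual images.
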
