论文标题

来自加密数据库的数据推断:一种多维订单匹配方法

Data Inference from Encrypted Databases: A Multi-dimensional Order-Preserving Matching Approach

论文作者

Pan, Yanjun, Efrat, Alon, Li, Ming, Wang, Boyang, Quan, Hanyu, Mitchell, Joseph, Gao, Jie, Arkin, Esther

论文摘要

由于对数据隐私的疑虑越来越多,在将数据库存储在不受信任的服务器上之前,数据库已被加密。为了启用对加密数据的搜索操作,已经提出了可搜索的加密技术。代表性方案使用订单保留加密(OPE)来支持加密数据库上的有效布尔查询。然而,最近的作品显示了从OPE加密数据库中推断出明文数据的可能性,仅使用订单保留约束,或与具有相似频率分布的辅助宣传数据集结合使用。到目前为止,此类攻击的有效性仅限于单维密集数据(大多数来自域的值是加密的),但是在高维数据集(例如,空间数据)上实现它仍然很具有挑战性,这些数据通常在本质上很少。在本文中,我们首次研究了对多维加密数据库(以2-D为特殊情况)的数据推断攻击。我们将其作为2D订单保留的匹配问题提出,并探索未加权和加权案例,其中前者仅使用订单信息最大化点数的数量,后者进一步考虑了具有相似频率的点。我们证明了问题是NP-HARD,然后提出了一种贪婪的算法,以及具有近似值保证的多项式时间算法。合成和现实世界数据集的实验结果表明,与先前的1D匹配算法相比,数据恢复速率显着提高。

Due to increasing concerns of data privacy, databases are being encrypted before they are stored on an untrusted server. To enable search operations on the encrypted data, searchable encryption techniques have been proposed. Representative schemes use order-preserving encryption (OPE) for supporting efficient Boolean queries on encrypted databases. Yet, recent works showed the possibility of inferring plaintext data from OPE-encrypted databases, merely using the order-preserving constraints, or combined with an auxiliary plaintext dataset with similar frequency distribution. So far, the effectiveness of such attacks is limited to single-dimensional dense data (most values from the domain are encrypted), but it remains challenging to achieve it on high-dimensional datasets (e.g., spatial data) which are often sparse in nature. In this paper, for the first time, we study data inference attacks on multi-dimensional encrypted databases (with 2-D as a special case). We formulate it as a 2-D order-preserving matching problem and explore both unweighted and weighted cases, where the former maximizes the number of points matched using only order information and the latter further considers points with similar frequencies. We prove that the problem is NP-hard, and then propose a greedy algorithm, along with a polynomial-time algorithm with approximation guarantees. Experimental results on synthetic and real-world datasets show that the data recovery rate is significantly enhanced compared with the previous 1-D matching algorithm.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源