Paper title
Efficient Private Storage of Sparse Machine Learning Data
Authors
Abstract
We consider the problem of maintaining sparsity in private distributed storage of confidential machine learning data. In many applications, e.g., face recognition, the data used in machine learning algorithms is represented by sparse matrices which can be stored and processed efficiently. However, mechanisms maintaining perfect information-theoretic privacy require encoding the sparse matrices into randomized dense matrices. It has been shown that, under some restrictions on the storage nodes, sparsity can be maintained at the expense of relaxing the perfect information-theoretic privacy requirement, i.e., allowing some information leakage. In this work, we lift the restrictions imposed on the storage nodes and show that there exists a trade-off between sparsity and the achievable privacy guarantees. We focus on the setting of non-colluding nodes and construct a coding scheme that encodes the sparse input matrices into matrices with the desired sparsity level while limiting the information leakage.
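To make the tension between perfect privacy and sparsity concrete, here is a minimal sketch of classic two-share additive secret sharing over a prime field. It is not the coding scheme proposed in the paper; it only illustrates the baseline the abstract refers to, where a sparse input is turned into dense shares. The field size `P`, the matrix size, the 5% density, and all function names are assumptions chosen for illustration.

```python
import random

P = 257  # hypothetical prime field size, chosen only for illustration


def additive_shares(matrix, rng):
    """Split `matrix` into two shares over GF(P) using a uniform random mask."""
    mask = [[rng.randrange(P) for _ in row] for row in matrix]
    share_a = mask
    share_b = [[(x - r) % P for x, r in zip(row, mask_row)]
               for row, mask_row in zip(matrix, mask)]
    return share_a, share_b


def reconstruct(share_a, share_b):
    """Recover the input by adding the two shares entry-wise modulo P."""
    return [[(a + b) % P for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(share_a, share_b)]


def density(matrix):
    """Fraction of nonzero entries in `matrix`."""
    entries = [x for row in matrix for x in row]
    return sum(1 for x in entries if x != 0) / len(entries)


# A seeded generator keeps the sketch reproducible; a real system would
# need a cryptographically secure source of randomness instead.
rng = random.Random(0)
n = 50

# Sparse input: each entry is nonzero with probability about 0.05.
sparse = [[rng.randrange(1, P) if rng.random() < 0.05 else 0
           for _ in range(n)]
          for _ in range(n)]

share_a, share_b = additive_shares(sparse, rng)

print("input density:  ", density(sparse))
print("share A density:", density(share_a))
print("share B density:", density(share_b))
```

Each share on its own is uniformly distributed and therefore reveals nothing about the input, but an entry of a share is zero only with probability 1/P, so both shares are almost fully dense. Keeping the shares sparse means biasing them toward zero, which is the source of the information leakage that the abstract trades off against sparsity.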