Paper Title
Exploring the Distributed Knowledge Congruence in Proxy-data-free Federated Distillation
Paper Authors
Paper Abstract
Federated learning (FL) is a privacy-preserving machine learning paradigm in which the server periodically aggregates local model parameters from clients without assembling their private data. Constrained communication and personalization requirements pose severe challenges to FL. Federated distillation (FD) has been proposed to address both problems simultaneously: it exchanges knowledge between the server and clients, supporting heterogeneous local models while significantly reducing communication overhead. However, most existing FD methods require a proxy dataset, which is often unavailable in practice. A few recent proxy-data-free FD approaches eliminate the need for additional public data, but they suffer from substantial discrepancies among local knowledge due to client-side model heterogeneity, leading to ambiguous representations on the server and inevitable accuracy degradation. To tackle this issue, we propose a proxy-data-free FD algorithm based on distributed knowledge congruence (FedDKC). FedDKC leverages well-designed refinement strategies to narrow local knowledge differences within an acceptable upper bound, thereby mitigating the negative effects of knowledge incongruence. Specifically, from the perspectives of peak probability and Shannon entropy of local knowledge, we design kernel-based knowledge refinement (KKR) and searching-based knowledge refinement (SKR), respectively, and theoretically guarantee that the refined local knowledge satisfies an approximately similar distribution and can be regarded as congruent. Extensive experiments on three common datasets demonstrate that FedDKC significantly outperforms state-of-the-art methods in various heterogeneous settings while clearly improving convergence speed.
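To make the refinement idea in the abstract concrete, below is a minimal illustrative sketch (in NumPy) of how heterogeneous local knowledge could be nudged toward a common Shannon-entropy target via a temperature search. This is an assumption-based demonstration of the general "searching-based refinement" spirit, not the paper's exact KKR/SKR formulations; the target entropy value and the search procedure are chosen here purely for illustration.

```python
# Illustrative sketch only: one plausible way to "refine" heterogeneous client
# knowledge so that all refined distributions share (approximately) the same
# Shannon entropy. The actual KKR/SKR rules in FedDKC may differ; the target
# entropy and the temperature search below are assumptions for demonstration.
import numpy as np


def shannon_entropy(p: np.ndarray) -> float:
    """Entropy of a probability vector (natural log), ignoring zero entries."""
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())


def softmax(logits: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    """Numerically stable softmax with a temperature parameter."""
    z = logits / temperature
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def refine_by_entropy_search(logits: np.ndarray,
                             target_entropy: float,
                             lo: float = 1e-3,
                             hi: float = 1e3,
                             iters: int = 50) -> np.ndarray:
    """Binary-search a temperature so the softened distribution's Shannon
    entropy matches `target_entropy` (searching-based refinement in spirit)."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        h = shannon_entropy(softmax(logits, mid))
        # Entropy grows monotonically with temperature for fixed logits.
        if h < target_entropy:
            lo = mid
        else:
            hi = mid
    return softmax(logits, (lo + hi) / 2.0)


if __name__ == "__main__":
    # Two heterogeneous clients produce logits on very different scales,
    # so their raw knowledge (softmax outputs) differs sharply in confidence.
    client_logits = [np.array([8.0, 1.0, 0.5, 0.2]),   # over-confident client
                     np.array([1.2, 1.0, 0.9, 0.8])]   # under-confident client
    target_h = 0.9  # assumed common entropy target shared by the server

    for i, logits in enumerate(client_logits):
        raw = softmax(logits)
        refined = refine_by_entropy_search(logits, target_h)
        print(f"client {i}: raw entropy={shannon_entropy(raw):.3f}, "
              f"refined entropy={shannon_entropy(refined):.3f}")
```

After refinement, both clients' output distributions have (approximately) the same entropy, so the server-side aggregation no longer mixes one very sharp and one very flat distribution, which is the kind of knowledge incongruence the abstract describes.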