Paper Title
Federated Learning on Heterogeneous and Long-Tailed Data via Classifier Re-Training with Federated Features
Paper Authors
Paper Abstract
Federated learning (FL) provides a privacy-preserving solution for distributed machine learning tasks. One challenging problem that severely damages the performance of FL models is the co-occurrence of data heterogeneity and long-tail distribution, which frequently arises in real FL applications. In this paper, we reveal an intriguing fact: the biased classifier is the primary factor behind the poor performance of the global model. Motivated by this finding, we propose a novel privacy-preserving FL method for heterogeneous and long-tailed data via Classifier Re-Training with Federated Features (CReFF). A classifier re-trained on federated features achieves performance comparable to one re-trained on real data, in a privacy-preserving manner and without leaking local data or class distributions. Experiments on several benchmark datasets show that the proposed CReFF is an effective solution for obtaining a promising FL model under heterogeneous and long-tailed data. Comparisons with state-of-the-art FL methods further validate the superiority of CReFF. Our code is available at https://github.com/shangxinyi/CReFF-FL.
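To make the classifier re-training step concrete, below is a minimal PyTorch sketch under our own assumptions, not the authors' implementation: the server holds a small, class-balanced set of learnable "federated features" (how these features are optimized against client information is abstracted away here), keeps the feature extractor frozen, and re-trains only the linear classifier on them. All names, shapes, and hyperparameters are hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F

num_classes, feat_dim, feats_per_class = 10, 512, 100  # hypothetical sizes

# Learnable per-class federated features held on the server, with a
# class-balanced label set; being synthetic, they expose neither local
# data nor the clients' class distributions.
fed_feats = nn.Parameter(torch.randn(num_classes * feats_per_class, feat_dim))
fed_labels = torch.arange(num_classes).repeat_interleave(feats_per_class)

# Only the classifier head is re-trained; the feature extractor of the
# global model stays frozen and is not touched in this phase.
classifier = nn.Linear(feat_dim, num_classes)
optimizer = torch.optim.SGD(classifier.parameters(), lr=0.01)

for _ in range(100):  # re-training epochs (hypothetical count)
    optimizer.zero_grad()
    logits = classifier(fed_feats.detach())  # features fixed in this phase
    loss = F.cross_entropy(logits, fed_labels)
    loss.backward()
    optimizer.step()

Because the federated-feature set is balanced across classes, re-training on it counteracts the classifier bias induced by the long-tailed, heterogeneous client data, which is the effect the abstract attributes to CReFF.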