When Do Curricula Work in Federated Learning?

Paper Title

When Do Curricula Work in Federated Learning?

Authors

Saeed Vahidian, Sreevatsank Kadaveru, Woonjoon Baek, Weijia Wang, Vyacheslav Kungurtsev, Chen Chen, Mubarak Shah, Bill Lin

Abstract

An oft-cited open problem of federated learning is the existence of data heterogeneity at the clients. One pathway to understanding the drastic accuracy drop in federated learning is by scrutinizing the behavior of the clients' deep models on data with different levels of "difficulty", which has been left unaddressed. In this paper, we investigate a different and rarely studied dimension of FL: ordered learning. Specifically, we aim to investigate how ordered learning principles can contribute to alleviating the heterogeneity effects in FL. We present theoretical analysis and conduct extensive empirical studies on the efficacy of orderings spanning three kinds of learning: curriculum, anti-curriculum, and random curriculum. We find that curriculum learning largely alleviates non-IIDness. Interestingly, the more disparate the data distributions across clients, the more they benefit from ordered learning. We provide analysis explaining this phenomenon, specifically indicating how curriculum training appears to make the objective landscape progressively less convex, suggesting fast converging iterations at the beginning of the training procedure. We derive quantitative results of convergence for both convex and nonconvex objectives by modeling the curriculum training on federated devices as local SGD with locally biased stochastic gradients. Also, inspired by ordered learning, we propose a novel client selection technique that benefits from the real-world disparity in the clients. Our proposed approach to client selection has a synergic effect when applied together with ordered learning in FL.
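
To make the three orderings named in the abstract concrete, the sketch below sorts one client's local samples by a per-sample difficulty score. It is an illustration only, not the authors' implementation: the function name `order_by_difficulty`, the use of a loss value as the difficulty score, and the example numbers are all assumptions, since the abstract does not say how difficulty is measured.

```python
import random


def order_by_difficulty(scores, mode, seed=0):
    """Return sample indices ordered according to `mode`.

    scores: per-sample difficulty scores, higher meaning harder.
            Using a loss value as the score is an assumption here.
    mode:   "curriculum" (easy to hard), "anti" (hard to easy),
            or "random" (shuffled with a fixed seed).
    """
    indices = list(range(len(scores)))
    if mode == "curriculum":
        return sorted(indices, key=lambda i: scores[i])
    if mode == "anti":
        return sorted(indices, key=lambda i: scores[i], reverse=True)
    if mode == "random":
        rng = random.Random(seed)
        rng.shuffle(indices)
        return indices
    raise ValueError(f"unknown mode: {mode!r}")


# Hypothetical per-sample losses for one client's local data.
losses = [0.9, 0.1, 0.5, 0.3]
curriculum_order = order_by_difficulty(losses, "curriculum")  # easiest first
anti_order = order_by_difficulty(losses, "anti")              # hardest first
random_order = order_by_difficulty(losses, "random")          # shuffled
```

In a federated setting, each client would presumably feed its local training loop with samples in the chosen order before sending its update to the server; that integration is omitted here because the abstract gives no details about it.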
