Paper Title

Virtual Homogeneity Learning: Defending against Data Heterogeneity in Federated Learning

Authors

Zhenheng Tang, Yonggang Zhang, Shaohuai Shi, Xin He, Bo Han, Xiaowen Chu

Abstract

In federated learning (FL), model performance typically suffers from client drift induced by data heterogeneity, and mainstream works focus on correcting client drift. We propose a different approach named virtual homogeneity learning (VHL) to directly "rectify" the data heterogeneity. In particular, VHL conducts FL with a virtual homogeneous dataset crafted to satisfy two conditions: containing no private information and being separable. The virtual dataset can be generated from pure noise shared across clients, aiming to calibrate the features from the heterogeneous clients. Theoretically, we prove that VHL can achieve provable generalization performance on the natural distribution. Empirically, we demonstrate that VHL endows FL with drastically improved convergence speed and generalization performance. VHL is the first attempt towards using a virtual dataset to address data heterogeneity, offering new and effective means to FL.
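The abstract describes a virtual dataset that (a) is generated from pure noise so it carries no private information, (b) is separable by class, and (c) is shared identically across clients so their features can be calibrated against a common reference. The following is a minimal sketch of how such a dataset could be constructed; the function name, the Gaussian-blob construction, and all parameters are hypothetical illustrations, not the authors' actual implementation.

```python
import numpy as np

def make_virtual_dataset(num_classes=10, per_class=20, dim=32, seed=0):
    """Hypothetical sketch of a VHL-style virtual dataset:
    pure noise, class-separable (one Gaussian blob per label),
    and reproducible on every client from a shared seed."""
    rng = np.random.default_rng(seed)
    # One widely spread random anchor per class keeps the
    # noise dataset linearly separable with high probability.
    anchors = rng.normal(0.0, 5.0, size=(num_classes, dim))
    xs, ys = [], []
    for c in range(num_classes):
        xs.append(anchors[c] + rng.normal(0.0, 1.0, size=(per_class, dim)))
        ys.append(np.full(per_class, c, dtype=np.int64))
    return np.concatenate(xs), np.concatenate(ys)

# Two clients rebuild the identical dataset from the shared seed,
# so no private data ever needs to be exchanged.
x_a, y_a = make_virtual_dataset(seed=42)
x_b, y_b = make_virtual_dataset(seed=42)
assert np.array_equal(x_a, x_b) and np.array_equal(y_a, y_b)
```

In training, each client would then mix this shared virtual data into its local batches (e.g., with an auxiliary feature-alignment loss on the virtual samples), nudging heterogeneous clients toward a common feature space.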
