Paper Title
Decentralized Stochastic Gradient Langevin Dynamics and Hamiltonian Monte Carlo
Paper Authors
Paper Abstract
Stochastic gradient Langevin dynamics (SGLD) and stochastic gradient Hamiltonian Monte Carlo (SGHMC) are two popular Markov chain Monte Carlo (MCMC) algorithms for Bayesian inference that can scale to large datasets, making it possible to sample from the posterior distribution of the parameters of a statistical model given the input data and the prior distribution over the model parameters. However, these algorithms do not apply to the decentralized learning setting, in which a network of agents works collaboratively to learn the parameters of a statistical model without sharing their individual data, due to privacy reasons or communication constraints. We study two algorithms, Decentralized SGLD (DE-SGLD) and Decentralized SGHMC (DE-SGHMC), which are adaptations of the SGLD and SGHMC methods that allow scalable Bayesian inference in the decentralized setting for large datasets. We show that when the posterior distribution is strongly log-concave and smooth, the iterates of these algorithms converge linearly to a neighborhood of the target distribution in the 2-Wasserstein distance if their parameters are selected appropriately. We illustrate the efficiency of our algorithms on decentralized Bayesian linear regression and Bayesian logistic regression problems.
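To make the abstract's description concrete, below is a minimal NumPy sketch of a DE-SGLD-style update on a toy decentralized Bayesian linear regression problem: each agent mixes its iterate with its neighbors' iterates via a doubly stochastic matrix, takes a stochastic gradient step on its local negative log-posterior, and injects Gaussian noise. The mixing matrix `W`, step size `eta`, data sizes, and prior/likelihood scaling are illustrative assumptions, not the paper's exact configuration or guarantees.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy decentralized Bayesian linear regression: each agent holds a local slice of the
# data; the target posterior is Gaussian (strongly log-concave and smooth).
n_agents, n_local, d = 4, 50, 2
theta_true = np.array([1.0, -2.0])
X = [rng.normal(size=(n_local, d)) for _ in range(n_agents)]
y = [Xi @ theta_true + 0.5 * rng.normal(size=n_local) for Xi in X]

# Doubly stochastic mixing matrix for a ring of 4 agents (hypothetical choice).
W = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])

def local_grad(i, theta, batch=10):
    """Stochastic gradient of agent i's local negative log-posterior term."""
    idx = rng.choice(n_local, size=batch, replace=False)
    Xb, yb = X[i][idx], y[i][idx]
    # Rescale the minibatch gradient to the local dataset size (likelihood noise
    # variance 0.25); a standard Gaussian prior is split evenly across agents.
    return (n_local / batch) * Xb.T @ (Xb @ theta - yb) / 0.25 + theta / n_agents

eta, n_iters = 1e-3, 2000
thetas = np.zeros((n_agents, d))  # one iterate per agent
for k in range(n_iters):
    mixed = W @ thetas                                            # consensus step over the network
    grads = np.stack([local_grad(i, thetas[i]) for i in range(n_agents)])
    noise = rng.normal(size=(n_agents, d))
    thetas = mixed - eta * grads + np.sqrt(2 * eta) * noise       # Langevin noise injection

print("node average:", thetas.mean(axis=0), "vs true:", theta_true)
```

As a rough sanity check, averaging the agents' iterates after many iterations should land near the posterior mode (close to `theta_true` here), with spread reflecting both the posterior variance and the bias introduced by the step size and decentralization.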