Paper title

Batch Stationary Distribution Estimation

Authors

Junfeng Wen, Bo Dai, Lihong Li, Dale Schuurmans

Abstract

We consider the problem of approximating the stationary distribution of an ergodic Markov chain given a set of sampled transitions. Classical simulation-based approaches assume access to the underlying process so that trajectories of sufficient length can be gathered to approximate stationary sampling. Instead, we consider an alternative setting where a fixed set of transitions has been collected beforehand, by a separate, possibly unknown procedure. The goal is still to estimate properties of the stationary distribution, but without additional access to the underlying system. We propose a consistent estimator that is based on recovering a correction ratio function over the given data. In particular, we develop a variational power method (VPM) that provides provably consistent estimates under general conditions. In addition to unifying a number of existing approaches from different subfields, we also find that VPM yields significantly better estimates across a range of problems, including queueing, stochastic differential equations, post-processing MCMC, and off-policy evaluation.
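
To make the batch setting concrete, the following is a minimal tabular sketch, not the authors' VPM implementation. It assumes a finite state space, that every state appears at least once as a source state in the data, and that the empirical chain is ergodic. The function name `estimate_stationary_from_batch` and the arguments `sources`, `targets`, and `num_states` are hypothetical, chosen only for illustration. The sketch runs a plain power iteration on a correction ratio `tau(x)`, so that `p_hat(x) * tau(x)` approximates the stationary probability, where `p_hat` is the empirical distribution of source states in the batch.

```python
import numpy as np


def estimate_stationary_from_batch(sources, targets, num_states, num_iters=500):
    """Tabular power iteration on the correction ratio tau = mu / p_hat.

    Illustrative sketch only: assumes a finite state space and that every
    state occurs at least once among `sources`.
    """
    sources = np.asarray(sources)
    targets = np.asarray(targets)

    # Empirical joint distribution over (x, x'), i.e. p_hat(x) * P_hat(x' | x).
    counts = np.zeros((num_states, num_states))
    np.add.at(counts, (sources, targets), 1.0)
    joint = counts / counts.sum()

    # Empirical distribution of source states in the batch.
    p_hat = joint.sum(axis=1)
    if np.any(p_hat == 0):
        raise ValueError("every state must appear as a source state")

    tau = np.ones(num_states)
    for _ in range(num_iters):
        # One step of the chain: mu(x') = sum_x tau(x) * p_hat(x) * P_hat(x' | x).
        mu = tau @ joint
        # Convert back to a ratio against the data distribution.
        tau = mu / p_hat
        # Normalize so that p_hat * tau sums to one.
        tau /= np.dot(p_hat, tau)

    return p_hat * tau, tau


# Usage: transitions collected from uniformly sampled source states,
# which is deliberately not the stationary distribution of the chain.
rng = np.random.default_rng(0)
P = np.array([
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
])
sources = rng.integers(0, 3, size=20000)
targets = np.array([rng.choice(3, p=P[s]) for s in sources])

mu_hat, tau_hat = estimate_stationary_from_batch(sources, targets, num_states=3)
print(mu_hat)
```

For this chain the exact stationary distribution is (0.25, 0.5, 0.25), so the printed estimate should be close to that even though the source states were sampled uniformly. In the tabular case the ratio iteration is equivalent to power iteration on the empirical transition matrix; the ratio form is shown because it is the quantity the abstract says the estimator recovers over the given data.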
