Paper title
Batch Stationary Distribution Estimation
Paper authors
Paper abstract
We consider the problem of approximating the stationary distribution of an ergodic Markov chain given a set of sampled transitions. Classical simulation-based approaches assume access to the underlying process so that trajectories of sufficient length can be gathered to approximate stationary sampling. Instead, we consider an alternative setting where a fixed set of transitions has been collected beforehand, by a separate, possibly unknown procedure. The goal is still to estimate properties of the stationary distribution, but without additional access to the underlying system. We propose a consistent estimator that is based on recovering a correction ratio function over the given data. In particular, we develop a variational power method (VPM) that provides provably consistent estimates under general conditions. In addition to unifying a number of existing approaches from different subfields, we also find that VPM yields significantly better estimates across a range of problems, including queueing, stochastic differential equations, post-processing MCMC, and off-policy evaluation.
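To make the batch setting concrete, the following is a minimal tabular sketch, not the paper's VPM. It assumes a small finite state space, builds an empirical transition model from a fixed set of (state, next state) pairs, and runs plain power iteration on it. The function name `estimate_stationary`, its parameters, and the toy data are all hypothetical and chosen only for illustration. The returned ratio is the estimated stationary probability divided by the empirical frequency of each state in the data, which is one simple reading of a "correction ratio" over the given data.

```python
from collections import Counter


def estimate_stationary(transitions, num_states, iters=500):
    """Tabular power iteration on an empirical model built from (x, x_next) pairs.

    Illustrative sketch only: it assumes a finite state space labelled
    0..num_states-1 and is not the variational method from the abstract.
    """
    pair_counts = Counter(transitions)                 # n(x, x_next)
    start_counts = Counter(x for x, _ in transitions)  # n(x)
    n = len(transitions)

    # Empirical distribution of the states the batch was collected from.
    p = [start_counts[s] / n for s in range(num_states)]

    # Start from the uniform distribution and repeatedly push it through
    # the empirical transition probabilities n(x, x_next) / n(x).
    mu = [1.0 / num_states] * num_states
    for _ in range(iters):
        new_mu = [0.0] * num_states
        for (x, y), count in pair_counts.items():
            new_mu[y] += mu[x] * count / start_counts[x]
        total = sum(new_mu)
        mu = [m / total for m in new_mu]  # renormalize after each step

    # Correction ratio: stationary mass relative to the data distribution.
    ratio = [mu[s] / p[s] if p[s] > 0 else 0.0 for s in range(num_states)]
    return mu, ratio


# Toy batch for a two-state chain with P(0->1) = 0.1 and P(1->0) = 0.5,
# collected with both states equally represented as starting points.
transitions = [(0, 0)] * 9 + [(0, 1)] + [(1, 0)] * 5 + [(1, 1)] * 5
mu, ratio = estimate_stationary(transitions, num_states=2)
print(mu, ratio)
```

In this toy batch the starting states are split evenly, while the chain's stationary distribution is (5/6, 1/6), so the ratio reweights state 0 upward and state 1 downward. A tabular approach like this does not scale to continuous or large state spaces, which is the regime the abstract targets.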