Paper title
Fast Bayesian Estimation of Spatial Count Data Models
Paper authors
Paper abstract
Spatial count data models are used to explain and predict the frequency of phenomena such as traffic accidents in geographically distinct entities such as census tracts or road segments. These models are typically estimated using Bayesian Markov chain Monte Carlo (MCMC) simulation methods, which, however, are computationally expensive and do not scale well to large datasets. Variational Bayes (VB), a method from machine learning, addresses the shortcomings of MCMC by casting Bayesian estimation as an optimisation problem instead of a simulation problem. Considering all these advantages of VB, a VB method is derived for posterior inference in negative binomial models with unobserved parameter heterogeneity and spatial dependence. Pólya-Gamma augmentation is used to deal with the non-conjugacy of the negative binomial likelihood and an integrated non-factorised specification of the variational distribution is adopted to capture posterior dependencies. The benefits of the proposed approach are demonstrated in a Monte Carlo study and an empirical application on estimating youth pedestrian injury counts in census tracts of New York City. The VB approach is around 45 to 50 times faster than MCMC on a regular eight-core processor in a simulation and an empirical study, while offering similar estimation and predictive accuracy. Conditional on the availability of computational resources, the embarrassingly parallel architecture of the proposed VB method can be exploited to further accelerate its estimation by up to 20 times.
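The abstract names Pólya-Gamma augmentation without stating it. As a rough sketch of why it resolves the non-conjugacy, the standard Pólya-Gamma integral identity can be applied to a negative binomial likelihood written in logit form. The symbols below ($y$ for the count, $r$ for the dispersion parameter, $\psi$ for the linear predictor, $\omega$ for the auxiliary variable) are assumptions made for illustration and may not match the paper's own notation or parameterisation.

```latex
% Negative binomial likelihood with success probability p = e^{\psi} / (1 + e^{\psi})
% (assumed parameterisation, for illustration only)
\[
  p(y \mid r, \psi)
  = \frac{\Gamma(y + r)}{\Gamma(r)\, y!}\,
    \frac{\left(e^{\psi}\right)^{y}}{\left(1 + e^{\psi}\right)^{y + r}}
\]

% Standard Polya-Gamma identity, with \omega \sim \mathrm{PG}(b, 0) and \kappa = a - b/2
\[
  \frac{\left(e^{\psi}\right)^{a}}{\left(1 + e^{\psi}\right)^{b}}
  = 2^{-b}\, e^{\kappa \psi}
    \int_{0}^{\infty} e^{-\omega \psi^{2} / 2}\, p(\omega \mid b, 0)\, \mathrm{d}\omega
\]

% Setting a = y and b = y + r gives \kappa = (y - r)/2, so conditional on \omega
% the likelihood is a Gaussian kernel in \psi:
\[
  p(y \mid r, \psi, \omega)
  \propto \exp\!\left( \frac{y - r}{2}\, \psi - \frac{\omega\, \psi^{2}}{2} \right),
  \qquad
  \omega \mid \psi \sim \mathrm{PG}(y + r,\, \psi)
\]
```

Because the augmented likelihood is quadratic in $\psi$ inside the exponent, a Gaussian prior on the coefficients entering $\psi$ becomes conditionally conjugate, which is what makes closed-form updates plausible in both Gibbs sampling and variational schemes.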