Paper title
Scalable Bayesian inference for self-excitatory stochastic processes applied to big American gunfire data
Paper authors
Paper abstract
The Hawkes process and its extensions effectively model self-excitatory phenomena including earthquakes, viral pandemics, financial transactions, neural spike trains and the spread of memes through social networks. The usefulness of these stochastic process models within a host of economic sectors and scientific disciplines is undercut by the processes' computational burden: complexity of likelihood evaluations grows quadratically in the number of observations for both the temporal and spatiotemporal Hawkes processes. We show that, with care, one may parallelize these calculations using both central and graphics processing unit implementations to achieve over 100-fold speedups over single-core processing. Using a simple adaptive Metropolis-Hastings scheme, we apply our high-performance computing framework to a Bayesian analysis of big gunshot data generated in Washington D.C. between 2006 and 2019, thereby extending a past analysis of the same data from under 10,000 to over 85,000 observations. To encourage widespread use, we provide hpHawkes, an open-source R package, and discuss high-level implementation and program design for leveraging aspects of computational hardware that become necessary in a big data setting.
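To make the quadratic cost concrete, the following is a minimal sketch of a temporal Hawkes log-likelihood written in plain Python. It is not the hpHawkes implementation: the exponential triggering kernel, the parameter names (`mu`, `alpha`, `beta`, `horizon`) and the function name `hawkes_loglik` are all assumptions chosen for illustration, and the spatiotemporal model analyzed in the paper has additional terms.

```python
import math


def hawkes_loglik(times, mu, alpha, beta, horizon):
    """Log-likelihood of a temporal Hawkes process on [0, horizon].

    Assumed (illustrative) conditional intensity with an exponential kernel:
        lambda(t) = mu + alpha * sum_{t_j < t} beta * exp(-beta * (t - t_j))
    `times` must be sorted in increasing order and lie within [0, horizon].
    """
    loglik = 0.0
    for i, t_i in enumerate(times):
        # Inner sum over every earlier event: n outer steps times up to n
        # inner steps gives the O(n^2) cost mentioned in the abstract.
        excitation = sum(
            beta * math.exp(-beta * (t_i - t_j)) for t_j in times[:i]
        )
        loglik += math.log(mu + alpha * excitation)

    # Compensator: the integral of the intensity over [0, horizon].
    compensator = mu * horizon + alpha * sum(
        1.0 - math.exp(-beta * (horizon - t_i)) for t_i in times
    )
    return loglik - compensator
```

Each outer iteration depends only on the event times and the parameters, not on the other iterations, so the inner sums can be computed independently and then added together. That independence is what makes this kind of calculation a candidate for multi-core and GPU parallelization. Setting `alpha = 0` removes the self-excitation and reduces the expression to the homogeneous Poisson log-likelihood, which is a convenient sanity check.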