Paper title
Big problems in spatio-temporal disease mapping: methods and software
Paper authors
Paper abstract
Fitting spatio-temporal models for areal data is crucial in many fields, such as cancer epidemiology. However, many issues arise when data sets are very large. The main objective of this paper is to propose a general procedure for analyzing high-dimensional spatio-temporal count data, with special emphasis on estimating mortality/incidence relative risks. We present a pragmatic and simple idea that makes it possible to fit hierarchical spatio-temporal models when the number of small areas is very large. Model fitting is carried out using integrated nested Laplace approximations (INLA) over a partition of the spatial domain. We also use parallel and distributed strategies to speed up computations in a setting where Bayesian model fitting is generally prohibitively time-consuming or even infeasible. Using simulated and real data, we show that our method outperforms classical global models. We implement the methods and algorithms in the open-source R package bigDM, which includes vignettes to help non-expert users apply the methodology. The proposed scalable methodology provides reliable risk estimates when fitting Bayesian hierarchical spatio-temporal models to high-dimensional data.