Paper Title
Bridging the Gap Between Training and Inference for Spatio-Temporal Forecasting
Paper Authors
Paper Abstract
Spatio-temporal sequence forecasting is one of the fundamental tasks in spatio-temporal data mining. It facilitates many real-world applications such as precipitation nowcasting, citywide crowd flow prediction and air pollution forecasting. Recently, a few Seq2Seq-based approaches have been proposed, but one drawback of Seq2Seq models is that small errors can accumulate quickly along the generated sequence at the inference stage, because the training and inference phases see different distributions. During training, Seq2Seq models only minimise single-step errors; during inference, however, the entire sequence has to be generated, which creates a discrepancy between training and inference. In this work, we propose a novel curriculum-learning-based strategy named Temporal Progressive Growing Sampling to effectively bridge the gap between training and inference for spatio-temporal sequence forecasting. It transforms the training process from a fully supervised manner, which utilises all available previous ground-truth values, to a less supervised manner, which replaces some of the ground-truth context with generated predictions. To do that, we sample the target sequence from midway outputs of intermediate models trained with bigger timescales, using a carefully designed decaying strategy. Experimental results demonstrate that our proposed method better models long-term dependencies and outperforms baseline approaches on two competitive datasets.
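The idea of gradually replacing ground-truth context with generated predictions can be illustrated with a minimal sketch. The code below is not the paper's Temporal Progressive Growing Sampling. It shows only a generic decay-based mixing step, and the inverse-sigmoid schedule, the constant `k`, and the function names `ground_truth_probability` and `mix_context` are all assumptions introduced for illustration.

```python
import math
import random


def ground_truth_probability(step, k=50.0):
    """Inverse-sigmoid decay: close to 1 early in training, tending to 0 later.

    The decay shape and the constant k are illustrative assumptions,
    not values taken from the paper.
    """
    # Cap the exponent so very large step counts cannot overflow math.exp.
    return k / (k + math.exp(min(step / k, 700.0)))


def mix_context(ground_truth, generated, step, rng):
    """Build the decoder context for one training step.

    Each position keeps its ground-truth value with probability p and is
    otherwise replaced by the corresponding generated value.
    """
    p = ground_truth_probability(step)
    return [gt if rng.random() < p else gen
            for gt, gen in zip(ground_truth, generated)]


rng = random.Random(0)
truth = [1.0, 2.0, 3.0, 4.0]
preds = [1.1, 1.9, 3.2, 3.8]  # hypothetical stand-in for model outputs

early = mix_context(truth, preds, step=0, rng=rng)     # mostly ground truth
late = mix_context(truth, preds, step=2000, rng=rng)   # mostly predictions
```

Early in training the context is almost entirely ground truth (fully supervised). As `step` grows, the probability of keeping a ground-truth value decays, so the model is increasingly trained on its own outputs, which is closer to what it faces at inference time.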